We have a server with a limit on the number of incoming connections it can accept.
We have multiple clients connecting to the server at various intervals, for various different reasons.
At least one of the functions of the server requires it to process the client's request and reply back on the same socket. However:
the client complains about timing out (and I believe closes the socket)
the server finishes its processing successfully, but the thread gets a SIGPIPE because the socket has been closed.
I have code similar to the snippet below that checks the socket descriptor.
if (connect_desc > 0)
{
    if (write(connect_desc, buffer, sizeof(buffer)) < 0)
    {
        printf("write error\n");
    }
}
else
    printf("connect_desc < 0\n");
My question is:
If the socket is closed by the client, would the socket descriptor change in value on the server? If not, is there any way to catch that in my code?
I'm not seeing that last print out.
Q: Will the descriptor change?
A: No
Q: How can I check the status of my connection?
A: One way is simply to try writing to the socket, and check the error status.
STRONG RECOMMENDATION:
Beej's Guide to Network Programming
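As a concrete illustration of "try writing and check the error status", here is a minimal sketch; the descriptor name connect_desc follows the question's code, and everything else is illustrative. Ignoring SIGPIPE makes a write to a dead connection fail with EPIPE instead of killing the process.

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Probe the connection by writing to it.
 * Returns 0 if the write was accepted, -1 if the connection has failed. */
int probe_connection(int connect_desc, const void *buf, size_t len)
{
    signal(SIGPIPE, SIG_IGN);  /* dead peer: write() returns -1/EPIPE, no signal */

    if (write(connect_desc, buf, len) < 0)
    {
        if (errno == EPIPE || errno == ECONNRESET)
            printf("connection closed by peer: %s\n", strerror(errno));
        else
            printf("write error: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}

Note that, because of TCP buffering, the first write after the peer disappears may still succeed; only a subsequent write is guaranteed to report the failure.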
Q. Will the file descriptor change?
Not unless:
It is documented somewhere you can cite.
The operating system magically knows about the disconnection.
The operating system magically knows where in your application the FD is stored, including all the copies.
The operating system wants to magically make it impossible for you to close the socket yourself.
None of these is true. The question doesn't even make sense.
Q. How can I check the status of my connection?
There isn't, by design, any such thing as the status of a TCP connection. The only way you can detect whether it has failed is by trying to use it.
Related
I have a TCP socket in blocking mode being used for the client side of a request/response protocol. Sometimes I am finding that if a socket was unused for a minute or two a send call succeeds and indicates all bytes sent, but the following recv returns zero, indicating a shutdown. I have seen this on both Windows and Linux clients.
The server guys tell me they always send some response before shutdown if they had received data, but they may close a socket that has not yet received anything if low on server resources.
Is what I am seeing indicative of the server having closed the connection while I was not using it, and then why does send then succeed?
What is the correct way to automatically detect this, such that the request is resent on a new connection in this case, bearing in mind that if the server actually received some requests twice it could have unintended effects?
//not full code (buffer management, wrapper functions, etc...)
//no special flags/options are being set, just socket then connect
sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
connect(sock, addr, addrlen);
//some time later after many requests/responses, normally if it was inactive for a minute
//sending about 50 bytes for requests, never actually seen it loop, or return 0
while (more_to_send) check(send(sock, buffer, len, 0));
//the very first recv returns 0, never seen it happen part way through a response (few KB to a couple of MB)
while (response_not_complete) check(recv(sock, buffer, 4096, 0));
If you don't get an application acknowledgment of the request from the server, re-send it.
Design your transactions to be idempotent so that re-sending them doesn't cause ill effects.
Is what I am seeing indicative of the server having closed the connection while I was not using it …
Yes.
… and then why does send then succeed?
send()'s succeeding tells you only that some (or all) of the data you passed into send() has been successfully copied into an in-kernel buffer, and that from now on it is the OS's responsibility to try to deliver those bytes to the remote peer.
In particular, it does not indicate that those bytes have actually gone across the network (yet) or been successfully received by the server.
What is the correct way to automatically detect this, such that the request is resent on a new connection in this case, bearing in mind that if the server actually received some requests twice it could have unintended effects?
As EJP suggests, the best way would be to design your communications protocol such that sending the same request twice has no effect that is different from sending it once. One way to do that would be to add a unique ID to each message you send, and add some logic to the server such that if it receives a message with an ID that is the same as one that it has already processed, it discards the message as a duplicate.
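A minimal sketch of that duplicate check on the server side, assuming each request carries a 64-bit message ID; the fixed-size table and linear scan are purely illustrative:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Remember the last N processed message IDs (capacity is illustrative). */
#define SEEN_CAPACITY 1024

static uint64_t seen_ids[SEEN_CAPACITY];
static size_t   seen_count = 0;

/* Returns true if this ID was already processed; otherwise records it. */
static bool already_processed(uint64_t msg_id)
{
    size_t filled = seen_count < SEEN_CAPACITY ? seen_count : SEEN_CAPACITY;

    for (size_t i = 0; i < filled; i++)
        if (seen_ids[i] == msg_id)
            return true;

    seen_ids[seen_count % SEEN_CAPACITY] = msg_id;  /* ring-buffer overwrite */
    seen_count++;
    return false;
}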
Having the server send back an explicit response to each message (so that you can know for sure your message got through and was processed) might help, but of course then you have to start worrying about the case where your message was received and processed but then the TCP connection broke before the response could be delivered back to you, and so on.
One other thing you could do (if you're not doing it already) is to monitor the state of the TCP socket (via select(), poll(), or similar) so that your program will be immediately notified (by the socket select()-ing as ready-for-read) when the remote peer closes its end of the socket. That way you can deal with the closed TCP connection well before you try to send() a command, rather than only finding out about it afterwards, and that should be a less awkward situation to handle, since in that case there is no question about whether a command "got through" or not.
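A sketch of that monitoring idea for the blocking client socket from the question (the function name is illustrative): poll() the socket with a zero timeout before sending, and treat recv() returning 0 as notice that the peer has closed.

#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Returns 1 if the connection looks alive, 0 if the peer has closed it.
 * Assumes no response data should be pending before we send a request. */
int connection_alive(int sock)
{
    struct pollfd pfd = { .fd = sock, .events = POLLIN };

    if (poll(&pfd, 1, 0) > 0 && (pfd.revents & POLLIN)) {
        char c;
        /* MSG_PEEK: look without consuming, in case it is real data. */
        if (recv(sock, &c, 1, MSG_PEEK) <= 0)
            return 0;   /* 0 = orderly shutdown, -1 = error: reconnect */
    }
    return 1;
}

The check is still racy — the server can close the connection between the poll() and the send() — so it complements, rather than replaces, the idempotency measures above.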
I'm attempting to write a rudimentary file server that takes a filename from a client and responds by sending the data over TCP to the client. I have a working client and server application for the most part, but I'm observing some odd behavior; consider the following:
while ((num_read = read (file_fd, file_buffer, sizeof (file_buffer))) > 0)
{
    if (num_read != write (conn_fd, file_buffer, num_read))
    {
        perror ("write");
        goto out;
    }
}

out:
close (file_fd); close (conn_fd);
file_fd is a file descriptor to the file being sent over the network, and conn_fd is a file descriptor to a connect()ed TCP socket.
This seems to work for small files, but when my files get larger (a megabyte or more), an inconsistent amount of data at the end of the file fails to transfer.
I suspected the immediate close() statements after the write might have something to do with it, so I tried a one-second sleep() before both close() statements, and my client successfully received all of the data.
Is there any better way to handle this than doing a sleep() on the server side?
A successful "write" on a socket does not mean the data has been successfully sent to the peer.
If you are on a Unix derivative, you can run "man 7 socket" and examine SO_LINGER as a potential solution.
edit: Due to EJP's comment (thank you), I reread what Stevens has to say in "Unix Network Programming" about ensuring delivery of all data to a peer. He says the following (in Volume 1 of the Second Edition, page 189):
... we see that when we close our end of the connection, depending on the function called (close or shutdown) and whether the SO_LINGER socket option is set, the return can occur at three different times.
close returns immediately, without waiting at all (the default; Figure 7.6)
close lingers until the ACK of our FIN is received (Figure 7.7), or
shutdown followed by a read waits until we receive the peer's FIN (Figure 7.8)
His figures, and his commentary, indicate that, other than an application-level acknowledgement, the combination of shutdown() followed by a read() waiting for a zero return code (i.e. notification that the socket has been closed) is the only way to ensure the client application has received the data.
If, however, it is only important that the data has been successfully delivered to (and acknowledged by) the peer's computer, then SO_LINGER would be sufficient.
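A minimal sketch of that shutdown()-then-read() sequence, assuming conn_fd is the connected socket from the question:

#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* After the last write: half-close our side, then wait for the peer's FIN. */
static void lingering_close(int conn_fd)
{
    char buf[256];
    ssize_t n;

    shutdown(conn_fd, SHUT_WR);          /* send our FIN; we can still read */

    while ((n = read(conn_fd, buf, sizeof buf)) > 0)
        ;                                /* discard anything still inbound */

    if (n < 0)
        perror("read");                  /* error rather than a clean FIN */

    close(conn_fd);                      /* peer has closed; safe to close */
}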
As the title already says, I'm looking for a way to get notified when a client closes its session abnormally.
I'm using the FreeBSD OS.
The server runs with X threads (depending on the number of CPU cores). So I'm not forking, and there isn't a separate process for each client.
That's why sending a "death package" every time_t seconds to receive a SIGPIPE isn't an option for me.
But I need to remove clients that have left from the kqueue, because otherwise, after too many accept()s, my code will obviously run into memory trouble.
Is there a way I can check, without a high performance cost per client, whether they are still connected or not?
Or any event notification that would trigger when this happens? Or maybe is there a way of making a program send some signal to a port, even in the abnormal-termination case, before the client process exits?
Edit: this answer misses the question, because it's not about using kqueue. But if someone else finds the question by the title, it may be helpful anyway ...
I've often seen the following behaviour: if a client dies, and the server does a select() on the client's socket descriptor, select() returns with a return code > 0 and FD_ISSET(fd) will be true for that descriptor. But when you then try to read from the socket, read() (or recv()) returns an error.
For a 'normal' connection using that to detect a client's death works fine for us, but there seems to be a different behaviour when the socket connection is tunneled but we haven't yet managed to figure that out completely.
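A sketch of that select()-based pattern (the descriptor name is illustrative): the dead client's descriptor selects as readable, and the follow-up recv() reports either end-of-stream or an error.

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Wait for activity on client_fd; detect a dead peer on the follow-up read. */
static void check_client(int client_fd)
{
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(client_fd, &readfds);

    if (select(client_fd + 1, &readfds, NULL, NULL, NULL) > 0 &&
        FD_ISSET(client_fd, &readfds))
    {
        char buf[4096];
        ssize_t n = recv(client_fd, buf, sizeof buf, 0);

        if (n == 0)
            printf("peer closed the connection\n");     /* orderly shutdown */
        else if (n < 0)
            printf("peer died: %s\n", strerror(errno)); /* e.g. ECONNRESET */
        /* else: n bytes of real data to process */
    }
}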
According to the kqueue man page, kevent() should create an event when the socket has shut down. From the description of the filter EVFILT_READ:
EVFILT_READ
Takes a descriptor as the identifier, and returns whenever there is data available to read. The behavior of the filter is slightly different depending on the descriptor type.
Sockets
Sockets which have previously been passed to listen() return when there is an incoming connection pending. data contains the size of the listen backlog.
Other socket descriptors return when there is data to be read, subject to the SO_RCVLOWAT value of the socket buffer. This may be overridden with a per-filter low water mark at the time the filter is added by setting the NOTE_LOWAT flag in fflags, and specifying the new low water mark in data. On return, data contains the number of bytes of protocol data available to read.
If the read direction of the socket has shutdown, then the filter also sets EV_EOF in flags, and returns the socket error (if any) in fflags. It is possible for EOF to be returned (indicating the connection is gone) while there is still data pending in the socket buffer.
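Based on that description, a minimal sketch (assuming kq came from an earlier kqueue() call and client_fd from accept()): register the descriptor for EVFILT_READ and treat EV_EOF as the disconnect notification.

#include <stdio.h>
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <unistd.h>

/* Register client_fd and wait; EV_EOF in flags means the peer shut down. */
static void watch_client(int kq, int client_fd)
{
    struct kevent change, event;

    EV_SET(&change, client_fd, EVFILT_READ, EV_ADD, 0, 0, NULL);
    if (kevent(kq, &change, 1, NULL, 0, NULL) < 0)
        perror("kevent register");

    if (kevent(kq, NULL, 0, &event, 1, NULL) > 0) {
        if (event.flags & EV_EOF) {
            /* Connection is gone; closing the descriptor also removes
               its filter from the kqueue. */
            printf("client %d disconnected (socket error %d)\n",
                   (int)event.ident, (int)event.fflags);
            close((int)event.ident);
        } else if (event.data > 0) {
            /* event.data bytes are available to read */
        }
    }
}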
In the client, I have a
close(sockfd)
where sockfd is the socket that's connected to the server.
In the server I've got this:
if (sockfd.revents & POLLERR ||
desc_set[i].revents & POLLHUP || desc_set[i].revents & POLLNVAL) {
close(sockfd.fd);
printf("Goodbye (connection closed)\n");
}
Where sockfd is a struct pollfd, and sockfd.fd is the file descriptor of the client's socket.
When the client closes the socket like I put up there, the server doesn't seem to detect it with the second code (desc_set[i].revents & POLLHUP, etc.).
Does anyone know what's the problem?
Sounds like you've managed to half close the connection from the client side. In this state the connection can still send data in one direction, i.e. it operates in half-duplex mode. This is by design and would allow your server to finish replying to whatever the client sent. Typically this would mean completing a file transfer and calling close(), or answering all of the aspects of the query. In the half-closed state you can still quite sensibly send data to the side that has already called close(). In your server you will see eof if you try to read though. close() just means "I'm done sending, finish up whatever I asked for".
POLLHUP, POLLERR and POLLNVAL only check the output side of the local connection, which is still valid here. There's POLLRDHUP, a GNU extension that should detect the other side closing, but the tests you're doing only check whether the socket is still writable, not whether it's still readable.
See also this question, which is talking about java, but still very related.
A remote close or output shutdown is neither an error nor a hangup nor an invalid state. It is a read event such that read() will return zero. Just handle it as part of your normal read processing.
BTW your test condition above should read sockfd.revents & (POLLERR|POLLHUP|POLLNVAL).
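Putting both answers together, a sketch of the server-side handling (the pollfd layout follows the question; the function wrapper is illustrative): use the combined mask for genuine errors, and treat recv() returning 0 as the normal end-of-stream event.

#include <poll.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* p points at the client's struct pollfd after poll() has returned. */
static void handle_client(struct pollfd *p)
{
    if (p->revents & (POLLERR | POLLHUP | POLLNVAL)) {
        close(p->fd);
        printf("Goodbye (connection error)\n");
    } else if (p->revents & POLLIN) {
        char buf[4096];
        ssize_t n = recv(p->fd, buf, sizeof buf, 0);

        if (n == 0) {                    /* peer closed: a normal read event */
            close(p->fd);
            printf("Goodbye (connection closed)\n");
        } else if (n < 0) {
            perror("recv");
        }
        /* else: process n bytes of data */
    }
}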
In my client code, I am following these steps to connect to a socket:
Creating a socket
sockDesc = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP)
Connecting it (retry for 'x' time in case of failure)
connect(sockDesc, (sockaddr *) &destAddr, sizeof(destAddr))
(After filling the destAddr fields)
Using the socket for send()/recv() operation:
send(sockDesc, buffer, bufferLen, 0)
recv(sockDesc, buffer, bufferLen, 0)
close() the socket descriptor and exit
close(sockDesc)
If the connection breaks during send()/recv(), I found that I could reconnect by returning to step 2.
Is this solution okay? Or should I close the socket descriptor and return to step 1?
Another interesting observation that I am not able to understand: when I stop my echo server and start the client, I create a socket (step 1) and call connect(), which fails (as expected), but then I keep calling connect(), let's say, 10 times. After 5 retries I start the server and connect() is successful. But during the send() call the process receives a SIGPIPE error. I would like to know:
1) Do I need to create a new socket every time connect() fails? As per my understanding, as long as I have not performed any send()/recv() on the socket, it is as good as new and I can reuse the same fd for the connect() call.
2) I don't understand why SIGPIPE is received when the server is up and connect() is successful.
Yes, you should close and go back to step 1:
close() closes a file descriptor, so that it no longer refers to any file and may be reused.
From here.
I think closing the socket is the right thing to do, despite the fact that it may work if you don't.
A socket which has failed to connect may not be in EXACTLY the same state as a brand new one - which could cause problems later. I'd rather avoid the possibility and just make a new one. It's cleaner.
TCP sockets hold a LOT of state, some of which is implementation-specific and worked out from the network.
A socket corresponding to a broken connection is in an unstable state; normally you will not be allowed to connect again until the operating system releases the socket.
I think it will be better to close() and connect again; you don't have to create another socket.
Anyway, make sure to set SO_LINGER on your socket to ensure that no data is lost in transmission.
See http://www.gnu.org/s/libc/manual/html_node/Socket_002dLevel-Options.html#Socket_002dLevel-Options
If the connection was broken and you try to write on the file descriptor you should get the broken pipe error/signal. All this is saying is that the file descriptor you tried writing to no longer has anyone on the other side to read what you are sending.
What you can do is catch the signal SIGPIPE and then deal with the reconnecting by closing the FD and going back to your step 1. You will now have a new FD you can read and write from for the connection.
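A minimal sketch of that flow, reusing the question's sockDesc/buffer/bufferLen names; reconnect() is a hypothetical helper that redoes steps 1 and 2. Ignoring SIGPIPE turns the signal into a plain -1/EPIPE return from send():

#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

int reconnect(void);   /* hypothetical helper: redo steps 1 and 2 */

/* Returns the (possibly new) descriptor to use for further I/O. */
int send_or_reconnect(int sockDesc, const void *buffer, size_t bufferLen)
{
    signal(SIGPIPE, SIG_IGN);   /* dead peer: send() returns -1/EPIPE, no signal */

    if (send(sockDesc, buffer, bufferLen, 0) < 0 && errno == EPIPE) {
        close(sockDesc);        /* discard the broken descriptor ... */
        sockDesc = reconnect(); /* ... and start over from step 1 */
    }
    return sockDesc;
}

On Linux you can alternatively pass MSG_NOSIGNAL to send() to suppress SIGPIPE for a single call.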
If the Single UNIX Specification doesn't say that it MUST work to go back to step #2 instead of step #1, then the fact that it happens to work on Linux is just an implementation detail, and you would be far better off and more portable if you go back to step #1. As far as I am aware, the specification does not make any guarantee that it is ok to go back to step #2 and, therefore, I would advise you to go back to step #1.
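To make that concrete, a sketch of going back to step 1, reusing the question's destAddr and sockDesc names; the retry cap and one-second delay are arbitrary:

#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Go back to step 1: discard the old descriptor and build a fresh socket
 * for every connect() attempt. */
int reconnect_from_step_1(int oldDesc, struct sockaddr *destAddr, socklen_t destLen)
{
    close(oldDesc);                               /* discard the failed socket */

    for (int attempt = 0; attempt < 10; attempt++) {
        int sockDesc = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);
        if (sockDesc < 0)
            return -1;

        if (connect(sockDesc, destAddr, destLen) == 0)
            return sockDesc;                      /* connected on a fresh socket */

        close(sockDesc);                          /* failed attempt: start over */
        sleep(1);
    }
    return -1;
}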