I am a developer on an open source project and I have been having some problems with the server thinking it has answered a socket completely (meaning it has either sent a reply or closed its end in response to a failure) while the client is stuck in poll(). After some research, I found that close() doesn't always generate a POLLHUP event, but shutdown(sock, 2) does.
In light of that, I'm considering adding a shutdown(sock, 2) to the error handling (in addition to the close() call). Does anyone know of any reasons why this would cause problems? Am I barking up the wrong tree? I'm thinking that if the server believes that the socket is closed, the client should definitely not attempt anything else with that socket, and I can't think of a reason not to add this, but I haven't been working with TCP connections for that long and would love some advice.
You need to figure out why closing the socket isn't causing it to shut down. The most likely reason is that there is another descriptor that accesses the same endpoint. Only closing the last descriptor that refers to the endpoint causes an implicit shutdown.
Do you ever dup the file descriptor? Do you make sure it is closed in all child processes? If the socket was in a parent process before it forked this process, did the parent close their copy?
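To illustrate the point about extra descriptors, here is a minimal C sketch. It uses socketpair() rather than a real TCP connection so that it is self-contained, and the helper names peer_sees_eof() and dup_demo() are made up for this example. It shows that the peer only sees end-of-file once the last copy of the descriptor is closed.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a zero-length read (end-of-file) is
 * pending on fd, 0 if nothing happened within timeout_ms. */
int peer_sees_eof(int fd, int timeout_ms)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    char c;

    if (poll(&p, 1, timeout_ms) <= 0)
        return 0;                            /* timeout or poll error */
    return recv(fd, &c, 1, MSG_PEEK) == 0;   /* 0 bytes means EOF */
}

/* Returns a bitmask: bit 0 set if EOF was seen after closing the first
 * descriptor, bit 1 set if EOF was seen after closing the duplicate.
 * Returns -1 if the setup fails. */
int dup_demo(void)
{
    int sv[2], result = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    int copy = dup(sv[0]);        /* second descriptor, same endpoint */
    if (copy < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    close(sv[0]);                 /* endpoint stays open via the copy */
    if (peer_sees_eof(sv[1], 100))
        result |= 1;

    close(copy);                  /* last descriptor: implicit shutdown */
    if (peer_sees_eof(sv[1], 100))
        result |= 2;

    close(sv[1]);
    return result;
}
```

The same thing happens when a forked child still holds a copy of the socket, which is why the questions above about dup and child processes matter.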
POLLHUP is not the right way to test for a closed connection. You should be testing for the file descriptor becoming readable and subsequently returning a zero-length read. This is the definition of end-of-file.
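A rough sketch of that test in C, assuming a blocking stream socket; the enum and the function name wait_for_peer() are invented for illustration:

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

enum conn_event { CONN_TIMEOUT, CONN_DATA, CONN_CLOSED, CONN_ERROR };

/* Waits up to timeout_ms for fd to become readable, then reads once.
 * A zero-length read is reported as CONN_CLOSED (end-of-file). */
enum conn_event wait_for_peer(int fd, int timeout_ms,
                              char *buf, size_t len, ssize_t *nread)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    int rc = poll(&p, 1, timeout_ms);

    if (rc < 0)
        return CONN_ERROR;
    if (rc == 0)
        return CONN_TIMEOUT;

    *nread = read(fd, buf, len);
    if (*nread > 0)
        return CONN_DATA;
    if (*nread == 0)
        return CONN_CLOSED;      /* peer closed its sending side */
    return CONN_ERROR;
}
```

The caller never looks at POLLHUP; it only needs poll() to wake up and the read to return 0.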
I have a server that is running a select() loop that sometimes continues blocking when the client closes the connection from its side. The select() loop handles all other read/write operations correctly and sets the correct file descriptor in the fd_set, leading me to believe that it is not an issue with the file descriptor setup on the server-side.
The way I planned on handling the client closing the connection was to have the select() break due to activity on the socket (closing it from the client-side), see that the fd was set for that socket, and then try to read from it - and if the read returned 0, then close the connection. However, because the select() doesn't always return when the client side closes the connection, there is no attempt to check the fd_set and subsequently try to read from the socket.
As a workaround, I implemented a "stop code" that the client writes to the server just before closing the connection, and this write causes the select() to break and the server reads the "stop code" and knows to close the socket. The only problem with this solution is the "stop code" is an arbitrary string of bytes that could potentially appear in regular traffic, as the normal data being written can contain random strings that could potentially contain the "stop code". Is there a better way to handle the client closing the connection from its end? Or is the method I described the general "best practice"?
I think my issue has something to do with OpenSSL, as the connection in question is an OpenSSL tunnel, and it is the only file descriptor in the set giving me issues.
> The way I planned on handling the client closing the connection was to have the select() break due to activity on the socket (closing it from the client-side), see that the fd was set for that socket, and then try to read from it - and if the read returned 0, then close the connection. However, because the select() doesn't always return when the client side closes the connection, there is no attempt to check the fd_set and subsequently try to read from the socket.
Regardless of whether you are using SSL or not, select() can tell you when the socket is readable (has data available to read), and a graceful closure is a readable condition (a subsequent read operation reports 0 bytes read). It is only abnormal disconnects that select() can't report (unless you use the exceptfds parameter, but even that is not always guaranteed). The best way to handle abnormal disconnects is to simply use timeouts in your own code. If you don't receive data from the client for a while, just close the connection. The client will have to send data periodically, such as a small heartbeat command, if it wants to stay connected.
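As a sketch of the timeout idea for a plain (non-SSL) socket, something like the following could sit in the server loop. The function name wait_readable() and the millisecond parameter are assumptions made for this example, not part of any existing API.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if fd became readable (data or a graceful close),
 * 0 if idle_ms elapsed with no activity, -1 on error. */
int wait_readable(int fd, long idle_ms)
{
    fd_set rfds;
    struct timeval tv = {
        .tv_sec  = idle_ms / 1000,
        .tv_usec = (idle_ms % 1000) * 1000
    };

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    int rc = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return rc > 0 && FD_ISSET(fd, &rfds);
}
```

A return value of 0 means the client sent neither data nor a heartbeat in time, so the server can close the connection. A return value of 1 still has to be followed by a read to tell data from a graceful close.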
Also, when using OpenSSL, if you are using the older SSL_... API functions (SSL_new(), SSL_set_fd(), SSL_read(), SSL_write(), etc.), make sure you are NOT just blindly calling select() whenever you want; call it ONLY when OpenSSL tells you to (when an SSL read/write operation reports an SSL_ERROR_WANT_READ or SSL_ERROR_WANT_WRITE error). This is an area where a lot of OpenSSL newbies tend to make the same mistake: they try to use OpenSSL on top of pre-existing socket logic that waits for a readable notification before reading data. This is the wrong way to use the SSL_... API. You are expected to ask OpenSSL to perform a read/write operation unconditionally; if it needs to wait for new data to arrive, or for pending data to be sent out, it will tell you, and you can then call select() accordingly before retrying the SSL read/write operation.
On the other hand, if you are using the newer BIO_... API functions (BIO_new(), BIO_read(), BIO_write(), etc.), you can take control of the underlying socket I/O instead of letting OpenSSL manage it for you, so you can do whatever you want with select() (or any other socket API you want).
> As a workaround, I implemented a "stop code" that the client writes to the server just before closing the connection, and this write causes the select() to break and the server reads the "stop code" and knows to close the socket.
That is a very common approach in many Internet protocols, regardless of whether SSL is used or not. It is a very distinct and explicit way for the client to say "I'm done" and both parties can then close their respective sockets.
> The only problem with this solution is the "stop code" is an arbitrary string of bytes that could potentially appear in regular traffic, as the normal data being written can contain random strings that could potentially contain the "stop code".
Then either your communication protocol is not designed properly, or your code is not processing the protocol correctly. In a properly-designed and correctly-processed protocol, there will not be any such ambiguity. There needs to be a clear distinction between the various commands that your protocol defines. Your "stop code" would be one such command amongst other commands. Random data in one command should not be mistakenly treated as a different command. If you are experiencing that problem, you need to fix it.
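One common way to get that distinction is to frame every message with a type and a length, so the payload is never scanned for commands. The sketch below is only an illustration of that idea: the frame layout (1 type byte followed by a 4-byte big-endian length) and the names FRAME_DATA, FRAME_STOP, frame_encode() and frame_decode() are invented here and are not taken from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { FRAME_DATA = 1, FRAME_STOP = 2 };   /* hypothetical commands */
#define FRAME_HEADER 5                      /* type byte + 4 length bytes */

/* Writes one frame into out. Returns the frame size, or 0 if it
 * does not fit into cap bytes. */
size_t frame_encode(uint8_t type, const void *payload, uint32_t len,
                    uint8_t *out, size_t cap)
{
    if (cap < FRAME_HEADER || len > cap - FRAME_HEADER)
        return 0;
    out[0] = type;
    out[1] = (uint8_t)(len >> 24);
    out[2] = (uint8_t)(len >> 16);
    out[3] = (uint8_t)(len >> 8);
    out[4] = (uint8_t)len;
    if (len)
        memcpy(out + FRAME_HEADER, payload, len);
    return FRAME_HEADER + (size_t)len;
}

/* Parses one frame from in[0..avail). Returns the number of bytes
 * consumed, or 0 if the frame is not complete yet. */
size_t frame_decode(const uint8_t *in, size_t avail, uint8_t *type,
                    const uint8_t **payload, uint32_t *len)
{
    if (avail < FRAME_HEADER)
        return 0;
    uint32_t n = ((uint32_t)in[1] << 24) | ((uint32_t)in[2] << 16) |
                 ((uint32_t)in[3] << 8)  |  (uint32_t)in[4];
    if (avail - FRAME_HEADER < n)
        return 0;
    *type = in[0];
    *payload = in + FRAME_HEADER;
    *len = n;
    return FRAME_HEADER + (size_t)n;
}
```

Because the receiver skips exactly len payload bytes, a data frame whose contents happen to look like a stop frame is still delivered as data.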
I get this error every time my program reaches a write() function. The program will continue again, but will stop on the next write() call. When I run this program outside of gdb, it runs properly.
Program received signal SIGPIPE, Broken pipe.
0x00007ffff794b340 in __write_nocancel () at ../sysdeps/unix/syscall-template.S:81
81 ../sysdeps/unix/syscall-template.S: No such file or directory.
I've been told that this happens when the socket is closed from the remote end, but how would that be happening.
Note: The server and client are both running on the same machine, and the server was prebuilt for me, so I don't have access to its code.
SIGPIPE is generated when you write to a connection that the other side has already closed. There are good reasons for its existence.
By default gdb catches SIGPIPE.
If you aren't interested in it, and chances are you aren't, simply disable it:
handle SIGPIPE nostop noprint pass
> I've been told that this happens when the socket is closed from the remote end, but how would that be happening.
You mean why? Since you don't have the source we can only guess.
Perhaps it already sent all the data it wanted and closed the connection, because there's no point keeping it open... Remember, connections can be half-closed (that is, closed from one side only). The server doesn't want to read any further, and just waits for you to read the data and close your side. Probably nothing went wrong, but you have to decide that yourself, as only you know what the application protocol is.
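If the program itself should notice that the peer has gone away, instead of only silencing gdb, one option is to send with MSG_NOSIGNAL and check errno. This is a sketch under the assumption that the platform supports MSG_NOSIGNAL (Linux does); the function name send_all_or_detect_close() is made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Sends len bytes without ever raising SIGPIPE.
 * Returns 1 if everything was sent, 0 if the peer has closed the
 * connection (EPIPE or ECONNRESET), -1 on any other error. */
int send_all_or_detect_close(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = send(fd, buf, len, MSG_NOSIGNAL);
        if (n < 0) {
            if (errno == EINTR)
                continue;                 /* interrupted, retry */
            if (errno == EPIPE || errno == ECONNRESET)
                return 0;                 /* remote end is gone */
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 1;
}
```

A return value of 0 is then the place to decide whether the server closing early is normal for the protocol or an error.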
I have a server-client system where each client mmaps a file found on the server. As soon as a client updates the file, the server needs to notify the clients to update their copy, i.e. they should unmap the file and mmap it again. I thought one solution would be for the server to send the string "Update" to the client using write(), and for the client to wait for such an "Update" in an infinite while loop using read(). However, this while loop would have to run in some sort of thread or child process. Which is best? Any other suggestions? Much appreciated. Thanks in advance.
Look into using sockets and the select() call. With a setup like this you can do event-based programming.
The server could send a signal which the clients would trap and act accordingly.
Just take care of what your signal handler will do (there are many functions that are not safe to call in the context of signal handling).
Also be aware of race conditions, and be careful not to lose signals.
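A minimal sketch of a handler that stays within what is safe in signal context: it only sets a flag, and the actual munmap/mmap work happens later in the main loop. The choice of SIGUSR1 and the names install_update_handler() and take_update_request() are assumptions for this example.

```c
#define _POSIX_C_SOURCE 200809L   /* for sigaction under strict -std modes */
#include <assert.h>
#include <signal.h>
#include <string.h>

/* The only state the handler touches. */
static volatile sig_atomic_t update_pending = 0;

static void on_update(int signo)
{
    (void)signo;
    update_pending = 1;           /* no mmap, no printf, no malloc here */
}

/* Installs the handler for SIGUSR1. Returns 0 on success, -1 on error. */
int install_update_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_update;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Called from the main loop. Returns 1 if an update was requested
 * since the last call, and clears the request. */
int take_update_request(void)
{
    if (!update_pending)
        return 0;
    update_pending = 0;           /* clear first, then remap the file */
    return 1;
}
```

Several signals arriving close together collapse into one request, which is acceptable here because a single remap picks up all changes. Clearing the flag before remapping means a signal that arrives during the remap triggers another one.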
I want to be able to stop listening on a server socket in Linux and ensure that all connections that are open from a client's point of view are correctly handled and not abruptly closed (i.e. the client receives ECONNRESET).
For example:
sock = create_socket();
listen(sock, non_zero_backlog);
graceful_close(sock);
I thought calling close() and handling the already accepted sockets would be enough, but there can be connections that are still in the kernel backlog, and these will be abruptly closed if you call close() on the server socket.
The only working way to do that (that I have found) is to:
1. Prevent accept() from adding more clients.
2. Keep a list of the open sockets somewhere and wait until they are all properly closed, which means:
   - using shutdown() to tell the client that you will no longer work on that socket
   - calling read() for a while to make sure that everything the client has sent in the meantime has been pulled
   - then using close() to free each client socket.
3. THEN, you can safely close() the listening socket.
You can (and should) use a timeout to make sure that idle connections will not last forever.
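The per-client part of those steps (shutdown, drain, close) might look roughly like this in C. It does not cover the listening socket or its backlog, the function name graceful_client_close() is invented for this sketch, and the timeout applies to each wait rather than to the whole drain.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Shuts down our sending side, drains whatever the client still sends,
 * then closes fd. Returns the number of bytes drained, or -1 if the
 * client did not close within timeout_ms of inactivity or an error
 * occurred. fd is closed in every case. */
long graceful_client_close(int fd, int timeout_ms)
{
    char buf[4096];
    long drained = 0;

    shutdown(fd, SHUT_WR);        /* tell the client we send nothing more */

    for (;;) {
        struct pollfd p = { .fd = fd, .events = POLLIN };
        int rc = poll(&p, 1, timeout_ms);
        if (rc <= 0) {            /* idle too long, or poll failed */
            drained = -1;
            break;
        }
        ssize_t n = read(fd, buf, sizeof buf);
        if (n == 0)               /* client closed its side: done */
            break;
        if (n < 0) {
            drained = -1;
            break;
        }
        drained += n;             /* data discarded in this sketch */
    }

    close(fd);
    return drained;
}
```

A real server would probably process the drained data instead of discarding it, and would run this for every socket in its list before closing the listening socket.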
You are looking at a limitation of the TCP socket API. You can look at ECONNRESET as the socket version of EOF, or you can implement a higher-level protocol over TCP which informs the client of an impending disconnection.
However, if you attempt the latter alternative, be aware of the intractable Two Armies Problem which makes graceful shutdown impossible in the general case; this is part of the motivation for the TCP connection reset mechanism as it stands. Even if you could write graceful_close() in a way that worked most of the time, you'd probably still have to deal with ECONNRESET unless the server process can wait forever to receive a graceful_close_ack from the client.
Is it possible for me to accept a connection and have it die without my knowing, then accept another connection on the same socket number?
I've got a thread to do protocol parsing and response creation. I've got another thread to handle all my network IO and one more thread to handle new incoming connection requests. That makes three threads total. Using select in the IO thread, I get a failure and have to search for the dead socket. I am afraid there is the case that accept might want to accept a new connection on a socket number that was previously dead.
I'd assume this can't happen until I "shutdown() || close();" the socket that may be dead on the server side. If it could happen, is the only solution to set up mutexes to halt everything while I sort out what sockets have gone bonkers?
Thanks,
Chenz
A socket descriptor won't get reused until you close it.
Assuming we're talking TCP, if the remote side closes its send side of the connection, you'll get a recv() returning 0 bytes to tell you so. Since TCP supports half-closed connections, you may still be able to send data to the remote side (if your application-level protocol is designed that way), or you might take the fact that the remote side has closed its send side as an indication that you should do the same.
You use shutdown() to close either your send side or your recv side or both sides of the connection. You use close() to close the socket and release the handle/descriptor for reuse.
So, in answer to your question: no, you won't be able to accept another connection with the same socket descriptor until you call close() on the descriptor that you already have.
You MAY accept a connection on a new socket descriptor; but that's probably not a problem for you.
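A small experiment that shows both halves of this, as a sketch only. It relies on the kernel handing out the lowest free descriptor number, which is the usual behaviour on Linux, and it assumes no other thread opens descriptors in between. The function name descriptor_reused_after_close() is made up for the example.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if a descriptor number stayed unique while it was open and
 * was handed out again right after close(), 0 if not, -1 on setup error. */
int descriptor_reused_after_close(void)
{
    int first = socket(AF_UNIX, SOCK_STREAM, 0);
    if (first < 0)
        return -1;

    int second = socket(AF_UNIX, SOCK_STREAM, 0);
    if (second < 0) {
        close(first);
        return -1;
    }
    int distinct_while_open = (second != first);

    close(first);                           /* number becomes free */
    int third = socket(AF_UNIX, SOCK_STREAM, 0);
    int reused = (third == first);          /* lowest free number again */

    close(second);
    if (third >= 0)
        close(third);
    return distinct_while_open && reused;
}
```

The practical consequence for a multithreaded server is to remove a descriptor from every fd_set and lookup table before calling close() on it, because the accept thread can receive the same number immediately afterwards.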