Graceful Shutdown Server Socket in Linux - c

I want to be able to stop listening on a server socket in Linux and ensure that all connections that are open from a client's point of view are handled correctly and not closed abruptly (i.e., the client should not receive ECONNRESET).
For example:
sock = create_socket();
listen(sock, non_zero_backlog);
graceful_close(sock);
I thought calling close() and handling the already accept'd sockets would be enough, but there can be connections still sitting in the kernel backlog, and these will be closed abruptly if you call close() on the server socket.

The only working way to do that (that I have found) is to:
1) prevent accept() from adding more clients,
2) have a list of the open sockets somewhere and wait until they are all properly closed, which for each client socket means:
   a) using shutdown() to tell the client that you will no longer work on that socket,
   b) calling read() for a while to make sure that everything the client has sent in the meantime has been pulled,
   c) then using close() to free the client socket.
3) THEN, you can safely close() the listening socket.
You can (and should) use a timeout to make sure that idle connections will not last forever.
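The per-client part of the steps above could be sketched roughly as follows. This is only a minimal sketch: graceful_close_client() is a made-up helper name, it assumes that "shutdown()" means shutting down the write side (SHUT_WR), and the timeout applies to each wait rather than to the whole drain.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (name and signature are assumptions, not from the
 * thread): half-close the socket, drain whatever the client still sends,
 * then close it.
 * Returns 0 if the client closed its side cleanly, -1 on timeout or error.
 * Note: timeout_ms bounds each individual wait, not the total time. */
int graceful_close_client(int fd, int timeout_ms)
{
    char buf[4096];
    int result = -1;

    /* Send FIN: tell the client we will not write anything more. */
    if (shutdown(fd, SHUT_WR) == 0) {
        for (;;) {
            struct pollfd pfd = { .fd = fd, .events = POLLIN };
            int ready = poll(&pfd, 1, timeout_ms);

            if (ready < 0 && errno == EINTR)
                continue;           /* interrupted by a signal, retry */
            if (ready <= 0)
                break;              /* timeout or poll() error: give up */

            ssize_t n = read(fd, buf, sizeof buf);
            if (n == 0) {           /* client closed its side too */
                result = 0;
                break;
            }
            if (n < 0 && errno != EINTR)
                break;              /* read error, e.g. ECONNRESET */
            /* n > 0: late data from the client, discarded here */
        }
    }

    close(fd);                      /* always release the descriptor */
    return result;
}
```

You would call this for every accepted socket in your list, and only afterwards close() the listening socket. Whether discarding the late data is acceptable depends on your protocol.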

You are looking at a limitation of the TCP socket API. You can look at ECONNRESET as the socket version of EOF or, you can implement a higher level protocol over TCP which informs the client of an impending disconnection.
However, if you attempt the latter alternative, be aware of the intractable Two Armies Problem which makes graceful shutdown impossible in the general case; this is part of the motivation for the TCP connection reset mechanism as it stands. Even if you could write graceful_close() in a way that worked most of the time, you'd probably still have to deal with ECONNRESET unless the server process can wait forever to receive a graceful_close_ack from the client.

Related

TCP: What happens when client connects, sends data and disconnects before accept

I'm testing some code in C and I've found strange behaviour with TCP socket calls.
I've defined one listening thread which accepts clients synchronously; after accepting a client it processes it in a loop until it disconnects. Thus only one client at a time is handled. So I call accept in a loop and then recv in an inner loop until I receive an empty buffer.
I fire 5 threads with clients; each one calls connect, send and finally close.
I get no error in any call. Everything seems to be fine.
However when I print received message on the server side it turns out that only the first client got through to the server, i.e. accept never fires on other clients.
So my questions are:
Shouldn't connect wait until server calls accept? Or is the kernel layer taking care of buffering under the hood?
If it's not the case then shouldn't the server be able to accept the socket anyway, even if it is in a disconnected state? I mean is it expected to lose all the incoming data?
Or should I assume that there's a bug in my code?
The TCP state machine performs a synchronized dance with the client's state machine. All of this is performed at OS level (the TCP/IP stack); the userspace process can only make some system calls to influence this machinery now and then. Once the server calls listen(), this machinery is started and new connections will be established.
Remember the second argument of listen(int fd, int backlog)? The whole three-way handshake is completed (by the TCP stack) before accept() delivers the fd to the server in userland. So the sockets are in the connected state, but the user process hasn't picked them up yet (by calling accept()).
Not calling accept() will cause the new connections to be queued up by the kernel. These connections are fully functional, but obviously the data buffers could fill up and the connection would get throttled.
Suggested reading: Comer & Stevens: Internetworking with TCP/IP 10.6-10.7 (containing the TCP state diagram)
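To see the queueing described above for yourself, something like the following could be used. It is only a sketch: send_before_accept() is a made-up name, and it assumes a loopback connection inside a single process with a kernel-chosen port. The client connects, sends and closes before the server ever calls accept(), and the server still gets the data afterwards.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo (not from the thread): the client connects, sends msg
 * and closes BEFORE the server calls accept(). Returns the number of bytes
 * the server still receives into out, or -1 on error. */
ssize_t send_before_accept(const char *msg, char *out, size_t outlen)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    ssize_t n = -1;
    int srv, cli, conn = -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                      /* let the kernel pick a port */

    srv = socket(AF_INET, SOCK_STREAM, 0);
    cli = socket(AF_INET, SOCK_STREAM, 0);
    if (srv < 0 || cli < 0)
        goto out;
    if (bind(srv, (struct sockaddr *)&addr, sizeof addr) < 0)
        goto out;
    if (listen(srv, 5) < 0)
        goto out;
    if (getsockname(srv, (struct sockaddr *)&addr, &len) < 0)
        goto out;

    /* No accept() yet: the kernel completes the handshake on its own. */
    if (connect(cli, (struct sockaddr *)&addr, sizeof addr) < 0)
        goto out;
    if (send(cli, msg, strlen(msg), 0) < 0)
        goto out;
    close(cli);                             /* client is already gone */
    cli = -1;

    /* Only now does the server pick the connection up from the backlog. */
    conn = accept(srv, NULL, NULL);
    if (conn < 0)
        goto out;
    n = recv(conn, out, outlen, 0);         /* queued data is still there */

out:
    if (conn >= 0)
        close(conn);
    if (cli >= 0)
        close(cli);
    if (srv >= 0)
        close(srv);
    return n;
}
```

If a test like this works but your multi-threaded version loses clients, the bug is more likely in the server's accept/recv loops than in the kernel.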

How to terminate non-blocking socket connection attempt?

A typical answer to the question of how to put a time limit on a connection attempt when using sockets is this:
1) make socket non-blocking,
2) call connect(),
3) use select() to see if connection is successful.
What is not clear to me at the moment is how to terminate the connection attempt after a certain amount of time if the connection cannot be established. As far as I understand, the OS will continue trying to establish the connection even after select() returns (provided the select() timeout is smaller than the OS timeout).
Is this correct? If so, how can I stop this process? Is switching socket back to blocking sufficient? Are there any other options except closing a socket? Thanks.
Just close the socket. It isn't any further use to you if you've decided the connect is taking too long. The OS will stop trying, release the resources, etc., everything you want.
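Putting the three steps from the question together with "just close it" might look like the sketch below. The helper name connect_with_timeout() and its parameters are made up for illustration, it handles IPv4 only, and it uses poll() instead of select() (same idea, just no fd_set juggling).

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): try to connect to ip:port and
 * give up after timeout_ms. Returns a connected, blocking fd, or -1 with
 * errno set (ETIMEDOUT if the time limit was hit). */
int connect_with_timeout(const char *ip, uint16_t port, int timeout_ms)
{
    struct sockaddr_in addr;
    struct pollfd pfd;
    socklen_t len;
    int fd, flags, ready, err = 0;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
        errno = EINVAL;
        return -1;
    }

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* 1) make the socket non-blocking */
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        goto fail;

    /* 2) start the connection attempt */
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        if (errno != EINPROGRESS)
            goto fail;

        /* 3) wait until the socket becomes writable, or the timeout hits */
        pfd.fd = fd;
        pfd.events = POLLOUT;
        do {
            ready = poll(&pfd, 1, timeout_ms);
        } while (ready < 0 && errno == EINTR);
        if (ready < 0)
            goto fail;
        if (ready == 0) {
            errno = ETIMEDOUT;
            goto fail;
        }

        /* Writable: SO_ERROR says whether the attempt succeeded or failed. */
        len = sizeof err;
        if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
            goto fail;
        if (err != 0) {
            errno = err;
            goto fail;
        }
    }

    fcntl(fd, F_SETFL, flags);      /* back to blocking mode for the caller */
    return fd;

fail:
    err = errno;
    close(fd);                      /* closing is what aborts the attempt */
    errno = err;
    return -1;
}
```

Note that a poll() interrupted by a signal restarts with the full timeout here; if you need a hard deadline you would have to track the elapsed time yourself.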

Does connect() block for TCP socket?

Hi I am reading TLPI (The Linux Programming Interface), I have a question about connect().
As I understand it, connect() returns immediately if the number of pending connections on the listening socket has not reached the "backlog" passed to listen().
Otherwise it blocks (according to figure 56-2).
But for a TCP socket, it always blocks until accept() is called on the server side (according to figure 61-5).
Am I correct?
Because I saw that in the example code (p.1265), it calls listen() to listen to a specific port and then calls connect() to that port BEFORE calling accept().
So connect() blocks forever in this case, doesn't it?
Thanks!!
Hardly anything is "immediate" in networking: packets can be lost on the way, an operation that should complete immediately in theory might not do so in practice, and in any case there is the end-to-end transmission time.
However:
connect() on a TCP socket is a blocking operation unless the socket descriptor is put into non-blocking mode.
The OS takes care of the TCP handshake; when the handshake is finished, connect() returns (that is, connect() does not block until the other end calls accept()).
A successful TCP handshake is queued for the server application, and can be accept()'ed any time later.
connect() blocks until the TCP three-way handshake has finished. The handshake on the listening side is handled by the TCP/IP stack in the kernel and finishes without notifying the user process. Only after the handshake is completed (at which point the initiator can already return from its connect() call) can accept() in the user process pick up the new socket and return. No waiting accept() is needed to complete the handshake.
The reason is simple: if you have a single-threaded process listening for connections and a waiting accept() were required to establish connections, you couldn't respond to TCP SYNs while processing another request. The TCP stack on the initiating side would retransmit, but on a moderately loaded server chances are high that the retransmitted packet would again arrive while no accept() is pending and be dropped again, resulting in ugly delays and connection timeouts.
connect() is a blocking call by default, but you can make it non-blocking by passing the SOCK_NONBLOCK flag to socket() (OR'ed into the type argument; this flag is Linux-specific).
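For reference, the two usual ways of getting a non-blocking socket look roughly like this. It is a small sketch with made-up helper names; the fcntl() variant is the portable one and also works on a socket you already have.

```c
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: Linux-specific, creates the socket already in
 * non-blocking mode by OR'ing SOCK_NONBLOCK into the type argument. */
int make_nonblocking_tcp_socket(void)
{
    return socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0);
}

/* Hypothetical helper: portable, switches an existing descriptor to
 * non-blocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}
```

On such a socket, connect() normally returns -1 with errno set to EINPROGRESS while the handshake is still going on, and you wait for writability with select() or poll().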

Is it possible for me to accept a connection and have it die without my knowing, then accept another connection on the same socket number?

Is it possible for me to accept a connection and have it die without my knowing, then accept another connection on the same socket number?
I've got a thread to do protocol parsing and response creation. I've got another thread to handle all my network IO and one more thread to handle new incoming connection requests. That makes three threads total. Using select in the IO thread, I get a failure and have to search for the dead socket. I am afraid there is a case where accept might accept a new connection on a socket number that was previously dead.
I'd assume this can't happen until I "shutdown() || close();" the socket that may be dead on the server side. If it could happen, is the only solution to set up mutexes to halt everything while I sort out which sockets have gone bonkers?
Thanks,
Chenz
A socket descriptor won't get reused until you close it.
Assuming we're talking TCP, if the remote side closes its send side of the connection, you'll get a recv() returning 0 bytes to tell you so. Since TCP supports half-closed connections, you may still be able to send data to the remote side (if your application-level protocol is made that way), or you might take the fact that the remote side has closed its send side as an indication that you should do the same.
You use shutdown() to close either your send side or your recv side or both sides of the connection. You use close() to close the socket and release the handle/descriptor for reuse.
So, in answer to your question: no, you won't be able to accept another connection with the same socket descriptor until you call close() on the descriptor that you already have.
You MAY accept a connection on a new socket descriptor; but that's probably not a problem for you.
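Both points can be checked with a small sketch like the one below. The function names are made up, and the half-close demo assumes a local socketpair() stream socket instead of a real TCP connection; the recv()-returns-0 and shutdown() behaviour shown is the same idea.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: the peer closes only its send side. We see recv()
 * return 0, yet we can still send to the peer. Returns 1 if that is what
 * happened, 0 if not, -1 if the socket pair could not be created. */
int half_close_demo(void)
{
    int sv[2];
    char buf[16];
    ssize_t n, sent, got;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    shutdown(sv[1], SHUT_WR);                 /* peer: "no more data from me" */
    n = recv(sv[0], buf, sizeof buf, 0);      /* 0 bytes: peer closed its send side */
    sent = send(sv[0], "bye", 3, 0);          /* our send side still works */
    got = recv(sv[1], buf, sizeof buf, 0);    /* and the peer can still read */

    close(sv[0]);
    close(sv[1]);
    return n == 0 && sent == 3 && got == 3;
}

/* Hypothetical demo: a descriptor number only comes back after close().
 * Assumes no other thread opens descriptors in the meantime. */
int descriptor_reused_after_close(void)
{
    int a = socket(AF_INET, SOCK_STREAM, 0);
    int b = socket(AF_INET, SOCK_STREAM, 0);  /* a is still open: b != a */
    int c;

    if (a < 0 || b < 0)
        return -1;
    close(a);                                 /* now the number a is free */
    c = socket(AF_INET, SOCK_STREAM, 0);      /* lowest free number, i.e. a */

    close(b);
    if (c >= 0)
        close(c);
    return b != a && c == a;
}
```

So in the three-thread setup from the question, the number can only be handed out again by accept() after some thread has actually called close() on it. That close() is the one place that needs coordinating.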

Using SO_REUSEADDR - What happens to previously open socket?

In network programming in unix, I have always set the SO_REUSEADDR option on the socket being used by server to listen to connections on. This basically says that another socket can be opened on the same port on the machine. This is useful when recovering from a crash and the socket was not properly closed - the app can be restarted and it will simply open another socket on the same port and continue listening.
My question is, what happens to the old socket? Without a doubt, all data/connections will still be received on the old socket. Does it get closed automatically by the OS?
A socket is considered closed when the program that was using it dies. That much is handled by the OS, and the OS will refuse to accept any further communication from the dead conversation. However, if the socket was closed unexpectedly, the computer on the other end might not know that the conversation is over, and may still be attempting to communicate.
That is why there is, designed into the TCP spec, a waiting period before that same port number can be reused. In theory, however unlikely, a packet from the old conversation could arrive with the appropriate IP address, port numbers, and sequence numbers, such that the receiving server mistakenly inserts it into the wrong TCP stream.
The SO_REUSEADDR option overrides that behavior, allowing you to reuse the port immediately. Effectively, you're saying: "I understand the risks and would like to use the port anyway."
Yes, the OS automatically closes the previous socket when the old process ends. The reason you can't normally listen on the same port right away is that the socket, though closed, remains in the TIME_WAIT state for 2MSL (generally a few minutes). The OS automatically transitions the old socket out of this state when the timeout expires.
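For completeness, setting the option looks roughly like this. It is a minimal sketch: make_listener() and its parameters are made up for illustration, and it handles IPv4 only. The important detail is that setsockopt() has to be called before bind().

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): create a TCP listening socket
 * on the given port with SO_REUSEADDR set. Returns the fd, or -1 on error. */
int make_listener(uint16_t port, int backlog)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    /* Must happen before bind(): allow binding even while old connections
     * on this port are still lingering in TIME_WAIT. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on) < 0)
        goto fail;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0)
        goto fail;
    if (listen(fd, backlog) < 0)
        goto fail;
    return fd;

fail:
    close(fd);
    return -1;
}
```

A restarted server would then call something like make_listener(8080, 128) (port and backlog picked arbitrarily here) and get its listening socket back without waiting for the timeout.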

Resources