When send()/recv() fails, where does the socket stand? - C

Suppose I have a socket which I have created by the socket() system call. After that I did a connect() and started sending and receiving data.
Similarly, on the other side, a socket was created with the socket() system call, and then bind(), listen() and accept() were called. Then this side also started communicating.
Now let's suppose one of the send() calls (and accordingly a recv() on the other side) fails.
What I want to know is: where does that socket stand after the failure?
To communicate again, should I create the socket again and do connect() (bind(), listen() and accept() on the other side), or can I just try send() and recv() again? And additionally, what is the best thing to do in such a scenario?
NOTE: We do not know what is the reason for the send()/recv() to fail. It can be anything from a physical wire break to the other side refusing (maybe using iptables).

What to do depends entirely on why send or recv failed.
When an error is detected, check the value of errno to determine the reason. For example, if the error code is EAGAIN you would attempt the operation again, and if the error code is ECONNRESET you would need to reconnect.
See the man pages for send and recv for more details on what errors may be returned and what you should do about them.

Related

TCP: What happens when client connects, sends data and disconnects before accept

I'm testing some code in C and I've found strange behaviour with TCP socket calls.
I've defined one listening thread which accepts clients synchronously and, after accepting a client, processes it in a for loop until it disconnects. Thus only one client at a time is handled. So I call accept in a loop and then recv in an inner loop until recv returns zero.
I fire 5 client threads; each calls connect, send and finally close.
I get no error in any call. Everything seems to be fine.
However when I print received message on the server side it turns out that only the first client got through to the server, i.e. accept never fires on other clients.
So my questions are:
Shouldn't connect wait until server calls accept? Or is the kernel layer taking care of buffering under the hood?
If it's not the case then shouldn't the server be able to accept the socket anyway, even if it is in a disconnected state? I mean is it expected to lose all the incoming data?
Or should I assume that there's a bug in my code?
The TCP state machine performs a synchronized dance with the client's state machine. All of this happens at OS level (in the TCP/IP stack); the userspace process can only make the occasional system call to influence this machinery. Once the server calls listen(), this machinery is started, and new connections will be established.
Remember the second argument to listen(int fd, int backlog)? The whole 3-way handshake is completed (by the TCP stack) before accept() delivers the fd to the server in userland. So: the sockets are in the connected state, but the user process hasn't picked them up yet (by calling accept()).
Not calling accept() will cause the new connections to be queued up by the kernel. These connections are fully functional, but obviously the data buffers could fill up and the connection would get throttled.
Suggested reading: Comer & Stevens: Internetworking with TCP/IP 10.6-10.7 (containing the TCP state diagram)

Graceful Shutdown Server Socket in Linux

I want to be able to stop listening on a server socket in linux and ensure that all connections that are open from a client's point of view are correctly handled and not abruptly closed (ie: receive ECONNRESET).
ie:
sock = create_socket();
listen(sock, non_zero_backlog);
graceful_close(sock);
I thought calling close() and handling already accept'ed sockets would be enough, but there can be connections open in the kernel backlog which will be abruptly closed if you call close() on the server socket.
The only working way to do that (that I have found) is to:
prevent accept() from adding more clients
have a list of the open sockets somewhere and wait until they are all properly closed, which means:
using shutdown() to tell the client that you will no longer work on that socket
calling read() for a while to make sure that everything the client has sent in the meantime has been pulled
then using close() to free each client socket.
THEN, you can safely close() the listening socket.
You can (and should) use a timeout to make sure that idle connections will not last forever.
You are looking at a limitation of the TCP socket API. You can treat ECONNRESET as the socket version of EOF, or you can implement a higher-level protocol over TCP which informs the client of an impending disconnection.
However, if you attempt the latter alternative, be aware of the intractable Two Armies Problem which makes graceful shutdown impossible in the general case; this is part of the motivation for the TCP connection reset mechanism as it stands. Even if you could write graceful_close() in a way that worked most of the time, you'd probably still have to deal with ECONNRESET unless the server process can wait forever to receive a graceful_close_ack from the client.

Does connect() block for TCP socket?

Hi I am reading TLPI (The Linux Programming Interface), I have a question about connect().
As I understand it, connect() will return immediately if the number of pending connections on the listen() queue hasn't reached the backlog, and it will block otherwise (according to figure 56-2).
But for TCP socket, it will always block until accept() on server side is called (according to figure 61-5).
Am I correct?
Because I saw that in the example code (p.1265), it calls listen() to listen to a specific port and then calls connect() to that port BEFORE calling accept().
So connect() blocks forever in this case, doesn't it?
Thanks!!
There's hardly any "immediately" when it comes to networking: packets can be lost on the way, an operation that should in theory complete immediately might not do so in practice, and in any case there's the end-to-end transmission time.
However
connect() on a TCP socket is a blocking operation unless the socket descriptor is put into non-blocking mode.
The OS takes care of the TCP handshake; when the handshake is finished, connect() returns (that is, connect() does not block until the other end calls accept()).
A successful TCP handshake will be queued to the server application, and can be accept()'ed any time later.
connect() blocks until the TCP 3-way handshake finishes. The handshake on the listening side is handled by the TCP/IP stack in the kernel and completes without notifying the user process. Only after the handshake is completed (by which point the initiator may already have returned from its connect() call) can accept() in the user process pick up the new socket and return. No waiting accept() is needed to complete the handshake.
The reason is simple: if you had a single-threaded process listening for connections, and establishing a connection required a waiting accept(), you couldn't respond to TCP SYNs while processing another request. The TCP stack on the initiating side would retransmit, but on a moderately loaded server chances are high the retransmitted packet would again arrive while no accept() was pending and would be dropped again, resulting in ugly delays and connection timeouts.
connect is a blocking call by default, but you can make it non-blocking by passing the SOCK_NONBLOCK flag to socket().

shutdown(2) system call does not work for me; I am not sure what I am doing wrong

I am experimenting with shutdown(2) system call.
According to the manual, it does what I want.
When I invoke it in a TCP server in the following way:
shutdown(clntSocket, SHUT_RDWR)
then clients must be able to observe that TCP connection was closed.
I guess, this means that clients must be able to notice that no further data can be sent/received. This is the theory which I am not able to corroborate.
In this simple experiment I define a TCP server and a TCP client. The server receives 3 bytes from the client, then invokes shutdown(2). The client sends 3 bytes and subsequently it sends another 3 bytes. Both send operations succeed. Shouldn't the second send operation fail?
Thanks in advance for the help.
A send operation succeeding just means the data was queued for sending. It doesn't mean it was actually sent or received. After calling shutdown, you can call read if you want to confirm that the other end has completed its part of the shutdown process. Once read returns zero or an error, then you know the connection has been shutdown.
When the server calls shutdown(2) with SHUT_WR or SHUT_RDWR, a TCP packet with the FIN flag is sent. FIN means that the sender will not send any more data. It says nothing about the intent to receive data.
The client has no way to know whether the server has called shutdown() with SHUT_RD; it doesn't seem to affect the client in any way.

In case of a blocking recv call, if the peer system reboots, the call doesn't come out of recv. Why?

When my code is in a blocking recv call, if the other side reboots, then this side recv call doesn't get to know about it and just goes into a hung state.
How to avoid this?
By default, if the other side of the connection disappears without terminating the connection properly, the OS on your side has no way of knowing that no further data will be coming. That's why recv() will block forever in this situation.
If you want to have a timeout, then set the socket to non-blocking and use select() to wait for it to become readable. select() allows you to specify a timeout.
Alternatively, you can set the SO_KEEPALIVE socket option with setsockopt(). This will enable the sending of TCP "keepalives", that will allow your side to detect a stale connection. (Do note that with the default settings, it can take a long time to detect that the connection has gone).
