I am writing a client-server program. The server is select()ing on readfd1 waiting for the readiness of readfd1 to be read. If it is ready, server is collecting the data and printing. Everything is fine for a while, but after some time, socket recv() failed with errno set to ETIMEDOUT. Now I want to rewrite my program to thwart these error condtions. So I went through "Unix Network Programming" by Richard Stevens, which states 4 conditions for select() to unblock. Following are the 2 conditions that grab my attention
A. client sent FIN, here return value of `recv()` will be `0`
B. some socket error, here return value of `recv()` will be `-1`.
My question is, Does socket error closes the connection? If that is so, then why is above two conditions are separated. If not, does next recv() on socket work?
If recv() returns 0, the other end has actively and gracefully closed the connection.
If recv() returns -1, there has (possibly) been an error on the connection, and it is no longer usable.
This means you can tell the difference between the peer closing the connection, and an error happening on the connection. The common thing to do in both cases is to close() your end of the socket.
There is 2 more points to consider though:
In the case of recv() returning -1, you should inspect errno, as it might not indicate a real error.
errno can be EAGAIN/EWOULDBLOCK if you have placed the socket in non-blocking mode or it could be EINTR if the system call was interrupted by a signal. All other errno values means the connection is broken, and you should close it.
TCP can operate in half duplex. If the peer has closed only its writing end of the connection, recv() returns 0 at your end. The common thing to do is to consider the connection as finished, and close your end of the connection too, but you can continue to write to it and the other end can continue to read from it. Whether to close just the reading or writing end of a TCP connection is controlled by the shutdown() function.
A socket error doesn't have to mean that the connection is closed, consider for example what happens if somehow the network cable between you and your peer is cut then you would typically get that ETIMEDOUT error. Most errors are unrecoverable, so closing your end of the connection on error is almost always advisable.
The difference between the two states when select can unblock, is because either the other end closed their connection in a nice way (the first case), or that there are some actual error (the second case).
Related
I'm a newbie to socket programming, I know it's a bad habit to close socket using "control-c", but why socket on the receiving peer keeps receiving '' infinitely after I use "control-c" to close the sending process? shouldn't the socket on the sending peer be closed after "control-c" to exit the process? Thanks!
I know it's a bad habit to close socket using "control-c"
That closes the entire process, not just a socket.
why socket on the receiving peer keeps receiving '' infinitely after I use "control-c" to close the sending process?
At a guess, which is all that is possible without seeing the code you should have posted in your question, you are ignoring errors and end-of-stream when calling recv().
shouldn't the socket on the sending peer be closed after "control-c" to exit the process?
It is. The whole process is 'closed', including all its resources.
As regards the receiving socket, it is up to you to detect the conditions under which it should be close, and close it.
No code given, but here's an educated guess of what might be going on:
You have two separate bits of code running: Sending and receiving
You are in the process of transferring data when you use CTL+C to kill the sending socket.
You expect the receiving socket to stop, but it doesn't.
The issue could be one of "end of transmission" agreement. If the sending code fires off an End-of-File (EOF) (or abort or terminated) when you hit CTL+C, then the receiving socket should see that and quit receiving. However, you haven't stated what the sending code is doing the moment you hit CTL+C.
The receiving socket may just be waiting for more data; as far as the receiving code is concerned, it was to be told when transfer is done, and it is patiently waiting for more information.
There are far better socket programmers around than I am, but I think it's safe to say that, once you get down to that level, you should pay attention to the details of the transfer protocol. If CTL+C just terminates the server (sending) code, then the client has no idea if there is a real termination, an unexpected delay in the transmission, or the server process just had a brain-fart and will start sending again once things clear up.
If you have any means of monitoring the actual values going back an forth, take a look at what happens during a "normal" termination of data transfer and a CTL+C termination. This might help you zero in on the undesirable behavior.
I have a number of C programs running on a Linux host (RHEL 6.6). They have TCP/IP connections to other applications on the same host. There is very little traffic over each connection. Every once in awhile, a read() call on one of the sockets to a process on the same host returns 0. These sockets are normally kept up for the lifetime of the application, so they are not cleanly closed during normal operations. I would expect that if an error occurs such as the other end crashing, the read() will return -1 and set errno.
So, the question is - is there any reason other than the TCP/IP connection being closed cleanly (shutdown(fd); close(fd)) by the other end that would cause the read() call to return 0?
The man page for read() states that 0 is only returned for EOF, while the recv() man page states it returns "0 when the peer has performed an orderly shutdown". I'd assume that the return from read() and recv() would be equivalent, and EOF on a TCP/IP connection implies a clean shutdown.
So, the question is - is there any reason other than the TCP/IP
connection being closed cleanly (shutdown(fd); close(fd)) by the other
end that would cause the read() call to return 0
For starters the process dying would free up the file descriptor which would have the same effect: a clean connection close.
I would expect that if an error occurs such as the other end crashing,
the read() will return -1 and set errno.
It depends what you mean by "crashes". If the process for example dies but the OS is still fine, then as far as TCP is concerned everything is OK and it can just close the connection for the now-orphaned socket (same as above in other words).
Side note: there are of course ways in which your recv can return -1 because of a misbehaving peer (for example the peer can force a TCP reset).
From the manual of epoll_ctl:
EPOLLRDHUP (since Linux 2.6.17)
Stream socket peer closed connection, or shut down writing half of connection. (This flag is especially useful for writing simple code to detect peer shutdown when using Edge Triggered monitoring.)
From the manual of recv:
If no messages are available to be received and the peer has performed an orderly shutdown, recv() shall return 0.
It seems to me then that both of the above cover the same scenarios, and that as long as I catch EPOLLRDHUP events first, I should never receive a read() or recv() of length 0 (and thus don't need to bother checking for such). But is this guaranteed to be true?
If you get an event with EPOLLRDHUP=1 then just close the connection right away without reading. If you get an event with EPOLLRDHUP=0 and EPOLLIN=1 then go ahead and read, but you should be prepared to handle the possibility of recv() still returning 0, just in case. Perhaps a FIN arrives after you got EPOLLIN=1 but before you actually call recv().
I am using a server that is crashing following a call to recv() returning -1 and errno set to ECONNRESET. I originally found this condition using nmap (I'm not a cracker, was just testing if the port was open at the time.) However, nmap uses raw sockets so I'm not too happy submitting this as a test case to the developers. I would rather write a client program in C that can cause the ECONNRESET.
So far I have tried two things: connect() to the server from my client and then shutdown() the socket immediately after connecting. recv() on the server still returned 1 (I have inserted debugging code so I can see the return value.) I also tried calling send() with some string and then immediately calling shutdown(). No dice, the string was transmitted fine.
So how would I cause this condition? Non portable is fine, I am using Linux.
The problem is that you are calling shutdown. Call close instead.
Take a look at a TCP state diagram.
http://tangentsoft.net/wskfaq/articles/debugging-tcp.html
Basically, shutdown closes a socket "politely" by sending a FIN and waiting for the peer to finish (FIN -> ACK/FIN -> ACK -> closed), at which point you call close and all is good. If you call close without calling shutdown first, it's the "impolite" version which sends a RST -- the equivalent of hanging up in the middle of a phone call, without waiting for the other person to finish what they're saying.
Think of "shutdown" as "say goodbye", and "close" as "hang up". You always have to hang up, but you don't have to say goodbye first.
About nmap: It is perfectly acceptable to give developers a test case with nmap. That's one of the main purposes of nmap anyway.
Your instincts were correct to use shutdown(), however you were not using it correctly for this.
Presumably you are trying shutdown() with SHUT_WR or SHUT_RDWR. When you close the writing direction, as these do, your side of the connection notifies the peer with a FIN - indicating that no more data will be forthcoming from your side. This will cause recv() on the other side to indicate a clean end-of-file on the connection, which isn't what you want in this case.
Instead, you want to use SHUT_RD to shutdown the reading direction of the socket only, and hold it open for writing. This will not notify the peer immediately - but if the peer sends any data after this point, your side will respond with a RST, to inform the peer that some data was lost - it wasn't seen by your client application.
(So, to ensure that you get a connection reset, you need to make sure that the server will be trying to send something to you - you might need to send something first, then perform the reading shutdown).
In C, I understood that if we close a socket, it means the socket will be destroyed and can be re-used later.
How about shutdown? The description said it closes half of a duplex connection to that socket. But will that socket be destroyed like close system call?
This is explained in Beej's networking guide. shutdown is a flexible way to block communication in one or both directions. When the second parameter is SHUT_RDWR, it will block both sending and receiving (like close). However, close is the way to actually destroy a socket.
With shutdown, you will still be able to receive pending data the peer already sent (thanks to Joey Adams for noting this).
None of the existing answers tell people how shutdown and close works at the TCP protocol level, so it is worth to add this.
A standard TCP connection gets terminated by 4-way finalization:
Once a participant has no more data to send, it sends a FIN packet to the other
The other party returns an ACK for the FIN.
When the other party also finished data transfer, it sends another FIN packet
The initial participant returns an ACK and finalizes transfer.
However, there is another "emergent" way to close a TCP connection:
A participant sends an RST packet and abandons the connection
The other side receives an RST and then abandon the connection as well
In my test with Wireshark, with default socket options, shutdown sends a FIN packet to the other end but it is all it does. Until the other party send you the FIN packet you are still able to receive data. Once this happened, your Receive will get an 0 size result. So if you are the first one to shut down "send", you should close the socket once you finished receiving data.
On the other hand, if you call close whilst the connection is still active (the other side is still active and you may have unsent data in the system buffer as well), an RST packet will be sent to the other side. This is good for errors. For example, if you think the other party provided wrong data or it refused to provide data (DOS attack?), you can close the socket straight away.
My opinion of rules would be:
Consider shutdown before close when possible
If you finished receiving (0 size data received) before you decided to shutdown, close the connection after the last send (if any) finished.
If you want to close the connection normally, shutdown the connection (with SHUT_WR, and if you don't care about receiving data after this point, with SHUT_RD as well), and wait until you receive a 0 size data, and then close the socket.
In any case, if any other error occurred (timeout for example), simply close the socket.
Ideal implementations for SHUT_RD and SHUT_WR
The following haven't been tested, trust at your own risk. However, I believe this is a reasonable and practical way of doing things.
If the TCP stack receives a shutdown with SHUT_RD only, it shall mark this connection as no more data expected. Any pending and subsequent read requests (regardless whichever thread they are in) will then returned with zero sized result. However, the connection is still active and usable -- you can still receive OOB data, for example. Also, the OS will drop any data it receives for this connection. But that is all, no packages will be sent to the other side.
If the TCP stack receives a shutdown with SHUT_WR only, it shall mark this connection as no more data can be sent. All pending write requests will be finished, but subsequent write requests will fail. Furthermore, a FIN packet will be sent to another side to inform them we don't have more data to send.
There are some limitations with close() that can be avoided if one uses shutdown() instead.
close() will terminate both directions on a TCP connection. Sometimes you want to tell the other endpoint that you are finished with sending data, but still want to receive data.
close() decrements the descriptors reference count (maintained in file table entry and counts number of descriptors currently open that are referring to a file/socket) and does not close the socket/file if the descriptor is not 0. This means that if you are forking, the cleanup happens only after reference count drops to 0. With shutdown() one can initiate normal TCP close sequence ignoring the reference count.
Parameters are as follows:
int shutdown(int s, int how); // s is socket descriptor
int how can be:
SHUT_RD or 0
Further receives are disallowed
SHUT_WR or 1
Further sends are disallowed
SHUT_RDWR or 2
Further sends and receives are disallowed
This may be platform specific, I somehow doubt it, but anyway, the best explanation I've seen is here on this msdn page where they explain about shutdown, linger options, socket closure and general connection termination sequences.
In summary, use shutdown to send a shutdown sequence at the TCP level and use close to free up the resources used by the socket data structures in your process. If you haven't issued an explicit shutdown sequence by the time you call close then one is initiated for you.
I've also had success under linux using shutdown() from one pthread to force another pthread currently blocked in connect() to abort early.
Under other OSes (OSX at least), I found calling close() was enough to get connect() fail.
"shutdown() doesn't actually close the file descriptor—it just changes its usability. To free a socket descriptor, you need to use close()."1
Close
When you have finished using a socket, you can simply close its file descriptor with close; If there is still data waiting to be transmitted over the connection, normally close tries to complete this transmission. You can control this behavior using the SO_LINGER socket option to specify a timeout period; see Socket Options.
ShutDown
You can also shut down only reception or transmission on a connection by calling shutdown.
The shutdown function shuts down the connection of socket. Its argument how specifies what action to perform:
0
Stop receiving data for this socket. If further data arrives, reject it.
1
Stop trying to transmit data from this socket. Discard any data waiting to be sent. Stop looking for acknowledgement of data already sent; don’t retransmit it if it is lost.
2
Stop both reception and transmission.
The return value is 0 on success and -1 on failure.
in my test.
close will send fin packet and destroy fd immediately when socket is not shared with other processes
shutdown SHUT_RD, process can still recv data from the socket, but recv will return 0 if TCP buffer is empty.After peer send more data, recv will return data again.
shutdown SHUT_WR will send fin packet to indicate the Further sends are disallowed. the peer can recv data but it will recv 0 if its TCP buffer is empty
shutdown SHUT_RDWR (equal to use both SHUT_RD and SHUT_WR) will send rst packet if peer send more data.
linux: shutdown() causes listener thread select() to awake and produce error. shutdown(); close(); will lead to endless wait.
winsock: vice versa - shutdown() has no effect, while close() is successfully catched.