Purpose of SHUT_RDWR in shutdown() function in socket programming [duplicate] - c

This question already has answers here:
close vs shutdown socket?
(9 answers)
Closed 11 months ago.
When we call the shutdown() function on a socket with the SHUT_RDWR argument, reading and writing on the socket are both disabled, but the socket itself is not destroyed. I can't understand the purpose of SHUT_RDWR. What does it give us, and why would we need a socket that can no longer read or write? I would expect SHUT_RDWR to behave like close(), but it does not.

What SHUT_RDWR does is close the connection on the socket while keeping the handle for the data (the file descriptor) open.
For example, suppose a socket has been used for two-way communication and your end of the connection shuts it down with SHUT_RDWR. If the peer then tries to write, the write will fail because the connection has been shut down, and the peer's reads will report end-of-file.
However, your file descriptor is still open, which means that if there is any data left for you to read, you can still do so, and then close the file descriptor.
In other words, shutdown() is about closing off the connection, while close() is about releasing the file descriptor and discarding whatever data may still be sitting in the open file description's "buffer."
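A minimal sketch of that split, assuming sock is a connected TCP socket and consume() is a hypothetical handler for leftover data (error handling omitted):

#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Finish with a connected TCP socket: shut the connection down in both
   directions, read out anything that was already queued locally, then
   release the descriptor. consume() is a hypothetical handler. */
void finish(int sock, void (*consume)(const char *buf, ssize_t len))
{
    char buf[4096];
    ssize_t n;

    shutdown(sock, SHUT_RDWR);                  /* the connection is now done      */
    while ((n = recv(sock, buf, sizeof buf, 0)) > 0)
        consume(buf, n);                        /* data queued before the shutdown */
    close(sock);                                /* only now is the fd released     */
}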
Advanced Programming in the UNIX Environment also gives this reason:
Given that we can close a socket, why is shutdown needed? There are several reasons. First, close will deallocate the network endpoint only when the last active reference is closed. If we duplicate the socket (with dup, for example), the socket won't be deallocated until we close the last file descriptor referring to it. The shutdown function allows us to deactivate a socket independently of the number of active file descriptors referencing it.
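As a rough illustration of that dup() point (error handling omitted): close() on one descriptor leaves the connection alive, while shutdown() deactivates it for every descriptor at once.

#include <sys/socket.h>
#include <unistd.h>

void demo(int sock)                 /* sock: a connected TCP socket           */
{
    int copy = dup(sock);           /* two descriptors, one underlying socket */

    close(sock);                    /* connection stays up: 'copy' remains    */

    shutdown(copy, SHUT_RDWR);      /* tears the connection down now,         */
                                    /* regardless of remaining descriptors    */
    close(copy);                    /* releases the last descriptor           */
}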

Related

closesocket in parent and child? [duplicate]

This question already has an answer here:
Are TCP SOCKET handles inheritable?
(1 answer)
Closed 1 year ago.
As far as I understand, if you create a socket on a POSIX system and then spawn a child that inherits the socket's file descriptor, you should call close on that fd in each process once that process is done with it: the fd is reference counted internally, fork increments the count, and close decrements it again.
Is it the same on Windows? Should you call closesocket in each process that has a copy of the socket, or do you call it only once, in one process, when all processes are done with it?
Edit: the question marked as a duplicate of this one (Are TCP SOCKET handles inheritable?) is relevant, since it suggests that inheriting SOCKETs is error-prone but still possible. In that case, if one persists despite the possible errors, this question still deserves an answer: should one call closesocket in all processes, or just in the parent?
Second edit: I believe this question is answered in the documentation for WSADuplicateSocketA:
A process can call closesocket on a duplicated socket and the descriptor will become deallocated. The underlying socket, however, will remain open until closesocket is called by the last remaining descriptor.
You shouldn't think of sockets on Windows as file descriptors, so they don't get inherited the same way. See Are TCP SOCKET handles inheritable? for the full answer to your question.
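For the POSIX half of the question, a minimal sketch of the reference counting described above (error handling omitted; handle_client() is hypothetical):

#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

void serve_one(int listen_fd, void (*handle_client)(int))
{
    int conn = accept(listen_fd, NULL, NULL);

    if (fork() == 0) {              /* fork() duplicates both descriptors  */
        close(listen_fd);           /* child drops the listener            */
        handle_client(conn);
        close(conn);                /* child drops its reference           */
        _exit(0);
    }
    close(conn);                    /* parent drops its reference; the     */
                                    /* connection dies only after both     */
                                    /* processes have closed it            */
}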

How should I handle file descriptor 'dependencies' when using epoll?

I'm writing an HTTP/2 server in C, using epoll. Let's say a client asks for /index.html - I need to open a file descriptor pointing to that file and then send it back to the socket whenever I read a chunk of it. So I'd have an event loop that looks something like this:
while (true)
    events = epoll_wait()
    for event in events
        if event is on a socket
            handle socket i/o
        else if event is on a disk file
            read as much as possible, and send to associated socket
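Concretely, the dispatch might look like this in C, tagging each registered fd with a context pointer (struct ctx and the handler callbacks are hypothetical):

#include <sys/epoll.h>

enum kind { K_SOCKET, K_FILE };

struct ctx {                        /* hypothetical per-fd state           */
    enum kind kind;
    int       fd;
};

/* Register a ctx so epoll hands the pointer back with each event. */
void watch(int epfd, struct ctx *c)
{
    struct epoll_event ev = { .events = EPOLLIN, .data.ptr = c };
    epoll_ctl(epfd, EPOLL_CTL_ADD, c->fd, &ev);
}

void loop(int epfd, void (*on_socket)(struct ctx *), void (*on_file)(struct ctx *))
{
    struct epoll_event evs[64];

    for (;;) {
        int n = epoll_wait(epfd, evs, 64, -1);
        for (int i = 0; i < n; i++) {
            struct ctx *c = evs[i].data.ptr;
            if (c->kind == K_SOCKET)
                on_socket(c);       /* handle socket i/o                   */
            else
                on_file(c);         /* read a chunk, send to the socket    */
        }
    }
}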
However, this poses a problem. If the socket then closes (for whatever reason), the file descriptor for index.html gets closed too. But the index.html FD may already have been queued for reading (i.e., it is already in events, because the close happened between calls to epoll_wait), so when the for loop reaches that FD I'll be accessing a 'dangling' FD.
If this were a single-threaded program I'd try to hack around the issue by tracking file descriptor numbers, but unfortunately I'm running the same epoll loop on multiple threads, which means I can't predict which FD numbers will be in use at any given moment. It's entirely possible that by the time the invalid read on the file comes around, another thread will have claimed that FD number, so the call to read won't explicitly fail, but I'll probably get a use-after-free anyway by trying to send the data on a socket that doesn't exist anymore.
What's the best way of dealing with this issue? Maybe I should take an entirely different approach and not have file I/O on the same epoll loop at all.

Active close vs passive close in terms of socket API?

In TCP we say one side of the connection performs an "active close" and the other side performs a "passive close".
In terms of the Linux sockets API, how do you differentiate the active close and the passive close?
For example, suppose we have two connected Linux TCP sockets, A and P, that have exchanged information over the application-level protocol and they are both aware that it is time to close their sockets (neither expect to send or receive any more data to or from each other).
We want socket A to perform the active close, and for P to be the passive close.
There are a few things A and P could do. For example:
call shutdown(SHUT_WR)
call recv and expect to get 0 back
call close.
something else
What combination of these things and in what order should A do?... and what combination of these things and in what order should P do?
In terms of the Linux sockets API, how do you differentiate the active close and the passive close?
The 'active' close is simply whichever side of the connection sends a FIN or RST packet first, typically by calling close().
What combination of these things and in what order should A do?... and what combination of these things and in what order should P do?
In practice, most of this is application- and application-protocol specific. I will describe the minimum/typical requirement to answer your question, but your mileage may vary depending on what you are specifically trying to accomplish.
You may first call shutdown() on Socket A if you want to terminate communication in one direction or the other (or both) on Socket A. From your description, both programs already know they're done, perhaps due to application protocol messages, so this may not be necessary.
You must call close() on Socket A in order to close the socket and release the file descriptor.
On Socket P, you simply keep reading until recv() returns 0, and then you must call close() to close the socket and release the file descriptor.
For further reading, there are a number of good tutorials out there, and Beej's Guide to Network Programming is quite popular.
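A minimal sketch of both sides under those rules (error handling omitted):

#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

void active_close(int a)            /* side A: sends its FIN first          */
{
    char buf[256];

    shutdown(a, SHUT_WR);           /* optional half-close: "no more data"  */
    while (recv(a, buf, sizeof buf, 0) > 0)
        ;                           /* drain until the peer's FIN (recv==0) */
    close(a);                       /* release the descriptor               */
}

void passive_close(int p)           /* side P: reacts to A's FIN            */
{
    char buf[256];

    while (recv(p, buf, sizeof buf, 0) > 0)
        ;                           /* keep reading until recv() returns 0  */
    close(p);                       /* sends P's own FIN                    */
}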
Active open is when you issue connect(2) explicitly to make a connection to a remote site. The call blocks until the connection is established on the other side (unless you set O_NONBLOCK with fcntl(2) before calling connect(2)).
Passive open is when you have a socket listen(2)ing for connections and have not yet issued an accept(2) system call. accept(2) normally blocks until a connection is completely open and then gives you a socket descriptor to communicate over it, or returns a descriptor immediately if the handshake had already finished by the time you called it (this is a passive open). The limit on the number of passively opened connections the kernel will accept on your behalf, while you prepare to make the accept(2) system call, is the listen(2) backlog.
Active close is what happens when you explicitly call the shutdown(2) or close(2) system calls. As with passive open, there is nothing you can do to cause a passive close; it happens behind the scenes, a product of the other side's actions. You detect a passive close when the socket signals an end-of-file condition (that is, read(2) always returns 0 bytes), meaning the other end has done a shutdown(2) (or close(2)) and the connection is half (or fully) closed. When you explicitly shutdown(2) or close(2) your side, it's an active close.
NOTE
If the other end does an explicit close(2) and you continue writing to the socket, you'll get an error because the data can no longer be delivered (in this case we can talk about a passive close(2), one that has occurred without any explicit action on our side). But the other end can instead do a half-close by calling shutdown(2). This makes TCP send only a FIN segment and keeps the socket descriptor usable, so the thread can still receive any data that is pending or in transit. Only when it has received and acknowledged the other end's FIN segment will it signal you that no more data remains in transit.
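The open side of that terminology, as a bare-bones sketch (error handling omitted):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

int active_open(const struct sockaddr_in *peer)   /* client side */
{
    int s = socket(AF_INET, SOCK_STREAM, 0);
    connect(s, (const struct sockaddr *)peer, sizeof *peer);  /* blocks unless O_NONBLOCK */
    return s;
}

int passive_open(uint16_t port)                   /* server side */
{
    struct sockaddr_in addr;
    int s = socket(AF_INET, SOCK_STREAM, 0);

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    bind(s, (struct sockaddr *)&addr, sizeof addr);
    listen(s, 128);                 /* backlog: pending opens the kernel holds */
    return accept(s, NULL, NULL);   /* blocks until a handshake completes      */
}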

accept() returns same socket descriptor numbers [duplicate]

This question already has an answer here:
C : "same file descriptors of all client connections" (client server programming)
(1 answer)
Closed 8 years ago.
The listener socket, passed as the argument to accept() to obtain new client sockets, lives in a shared memory area and is shared by all the forked server processes, but accept() returns the same socket descriptor number in each of the different forked processes.
Does fork() also create a separate area for socket descriptors, which each forked process manages on its own?
Is that why they produce duplicate socket descriptor numbers?
I intended to use select() to detect changes on all the socket descriptors, but because they all produce the same descriptor numbers, I couldn't make it work.
Yes, socket descriptor (as well as file descriptor) values are managed on a per-process basis.
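A quick way to see this: after fork(), each process hands out its own lowest free descriptor number independently, so the numbers coincide.

#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    int fd = open("/dev/null", O_RDONLY);   /* lowest free fd in *this* process */

    /* Parent and child typically print the same number: descriptor values
       are only meaningful within their own process's table.               */
    printf("%s: fd = %d\n", pid == 0 ? "child" : "parent", fd);
    return 0;
}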

Blocking recv doesn't exit when closing socket from another thread?

In Linux, if we call a blocking recv from one thread and close on the same socket from another thread, recv doesn't exit.
Why?
The "why" is simply that that's how it works, by design.
Within the kernel, the recv() call has called fget() on the struct file corresponding to the file descriptor, and this will prevent it from being deallocated until the corresponding fput().
You will simply have to change your design, which is inherently racy anyway: for this situation to arise, there must be no locking protecting the file descriptor in userspace, which means the close() could equally well have happened just before the recv() call, with the file descriptor even reused for something else in between.
If you want to wake up another thread that's blocking on a file descriptor, you should have it block on select() instead, with a pipe included in the file descriptor set that can be written to by the main thread.
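That is the classic self-pipe pattern; a sketch (error handling omitted, the pipe created with pipe() at startup):

#include <sys/select.h>
#include <unistd.h>

/* Block until either the socket is readable or the main thread has
   written a wake-up byte to the pipe. Returns 0 if it is safe to
   recv() on the socket, -1 if we were woken up to shut down.       */
int wait_readable(int sock, int wake_rd)
{
    fd_set rfds;
    int maxfd = sock > wake_rd ? sock : wake_rd;

    FD_ZERO(&rfds);
    FD_SET(sock, &rfds);
    FD_SET(wake_rd, &rfds);
    select(maxfd + 1, &rfds, NULL, NULL, NULL);

    return FD_ISSET(wake_rd, &rfds) ? -1 : 0;
}

/* Main thread, to wake the worker:  write(wake_wr, "x", 1);  */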
Check that all file descriptors for the socket have been closed. If any remain open at the "remote end" (assuming this is the one you attempt to close), the "peer has not performed an orderly shutdown".
If this still doesn't work, call shutdown(sock, SHUT_RDWR) on the remote end; this will shut the socket down regardless of reference counts.
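Unlike close(), shutdown() does wake a thread blocked in recv(), so a common arrangement is:

#include <sys/socket.h>
#include <unistd.h>

/* Called from another thread while the worker is blocked in recv():
   the worker's recv() returns 0 as though the peer had disconnected. */
void wake_receiver(int sock)
{
    shutdown(sock, SHUT_RDWR);
}

/* Worker thread: the only place that actually close()s the socket. */
void worker(int sock)
{
    char buf[4096];

    while (recv(sock, buf, sizeof buf, 0) > 0)
        ;                           /* process data ...                  */
    close(sock);                    /* recv() returned 0 (or an error)   */
}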
