Does connect() block for TCP socket? - c

Hi I am reading TLPI (The Linux Programming Interface), I have a question about connect().
As I understand it, connect() will return immediately if the number of pending connections hasn't reached the listen() backlog,
and it will block otherwise (according to figure 56-2).
But for TCP socket, it will always block until accept() on server side is called (according to figure 61-5).
Am I correct?
Because I saw that in the example code (p.1265), it calls listen() to listen to a specific port and then calls connect() to that port BEFORE calling accept().
So connect() blocks forever in this case, doesn't it?
Thanks!!

There's hardly any "immediately" in networking: packets can be lost on the way, an operation that should complete immediately in theory might not do so in practice, and in any case there's the end-to-end transmission time.
However
connect() on a TCP socket is a blocking operation unless the socket descriptor is put into non-blocking mode.
The OS takes care of the TCP handshake; when the handshake is finished, connect() returns. (That is,
connect() does not block until the other end calls accept().)
A successfully completed TCP handshake is queued for the server application, and the connection can be accept()'ed any time later.

connect() blocks until the TCP 3-way handshake finishes. The handshake on the listening side is handled by the TCP/IP stack in the kernel and completes without notifying the user process. Only after the handshake is complete (and the initiator may already have returned from its connect() call) can accept() in the user process pick up the new socket and return. No waiting accept() is needed to complete the handshake.
The reason is simple: if you had a single-threaded process listening for connections and a waiting accept() were required to establish them, you couldn't respond to TCP SYNs while processing another request. The TCP stack on the initiating side would retransmit, but on a moderately loaded server the chances are high that the retransmitted packet would still arrive while no accept() was pending and would be dropped again, resulting in ugly delays and connection timeouts.

connect() is a blocking call by default, but you can make it non-blocking by passing the SOCK_NONBLOCK flag to socket().

Related

TCP: What happens when client connects, sends data and disconnects before accept

I'm testing some code in C and I've found strange behaviour with TCP socket calls.
I've defined one listening thread which accepts clients synchronously; after accepting a client it processes it in a loop until the client disconnects, so only one client is handled at a time. That is, I call accept in a loop and then recv in an inner loop until recv returns 0.
I fire 5 client threads; each calls connect, send, and finally close.
I get no error in any call. Everything seems to be fine.
However, when I print the received messages on the server side, it turns out that only the first client got through to the server, i.e. accept never fires for the other clients.
So my questions are:
Shouldn't connect wait until the server calls accept? Or does the kernel layer take care of buffering under the hood?
If that's not the case, shouldn't the server be able to accept the socket anyway, even if it is in a disconnected state? I mean, is it expected to lose all the incoming data?
Or should I assume that there's a bug in my code?
The TCP state machine performs a synchronized dance with the client's state machine. All of this happens at OS level (in the TCP/IP stack); the userspace process can only make system calls to influence this machinery now and then. Once the server calls listen(), this machinery is started, and new connections will be established.
Remember the second argument to listen(int fd, int backlog)? The whole 3-way handshake is completed (by the TCP stack) before accept() delivers the fd to the server in userland. So: the sockets are in the connected state, but the user process hasn't picked them up yet (by calling accept()).
Not calling accept() will cause the new connections to be queued up by the kernel. These connections are fully functional, but obviously the data buffers could fill up and the connection would get throttled.
Suggested reading: Comer& Stevens: Internetworking with TCP/IP 10.6-10.7 (containing the TCP state diagram)

When send()/recv() fails, where does the socket stand?

Suppose I have a socket which I have created by the socket() system call. After that I did a connect() and started sending and receiving data.
Similarly, on the other side, a socket was created with the socket() system call, and then bind(), listen() and accept() were called. Then this side also started communicating.
Now lets suppose one of the send() (and accordingly recv() on the other side) fails.
What I want to know is: where does the socket stand after the failure?
To communicate again, should I create the socket again and do connect() (bind(), listen(), and accept() on the other side), or can I just try send() and recv() again? And what is the best thing to do in such a scenario?
NOTE: We do not know what is the reason for the send()/recv() to fail. It can be anything from a physical wire break to the other side refusing (maybe using iptables).
What to do depends entirely on why send or recv failed.
When an error is detected, check the value of errno to determine the reason. For example, if the error code is EAGAIN you would attempt the operation again, and if the error code is ECONNRESET you would need to reconnect.
See the man pages for send and recv for more details on what errors may be returned and what you should do about them.

About listen(), accept() in network socket programming(3-way handshaking)

In network socket programming, I know what listen() and accept() do.
But, what I want to know is, in tcp, 3-way, where does the three-way handshaking occur.
Does listen() perform the 3-way handshake, or is it accept()?
I mean the SYN (client) / SYN-ACK (server) / ACK (client) packets.
Once the application has called listen(), the TCP stack will perform the 3-way handshake for any incoming connections. These connections are queued in the kernel, and accept() then retrieves the next connection from the queue and returns it.
There's a backlog argument to listen, and it specifies how large this queue should be (although I think some implementations ignore this, and use a limit built into the stack). When the queue is full, the stack will no longer perform the handshake for incoming connections; the clients should retry, and their connections will succeed when the queue has room for them.
It's done this way so that the client receives the SYN/ACK as quickly as possible in the normal case (when the backlog queue has room), so it doesn't have to retransmit the SYN.
listen() listens for requests that come to the server.
When a request comes in (assuming TCP; with UDP you don't use listen or accept, since it isn't a connection-oriented protocol like TCP), the TCP stack performs the 3-way handshake. If the server is currently handling another request, the new connection is placed in a queue. The queue has a size you can specify via the backlog argument; the maximum number of pending requests is OS-dependent. Each time accept() is called it takes a completed connection from that queue and returns a new socket to be used for that connection, along with the (address, port) of the peer.

Socket programming - What's the difference between listen() and accept()?

I've been reading this tutorial to learn about socket programming. It seems that the listen() and accept() system calls both do the same thing: block and wait for a client to connect to the socket that was created with the socket() system call. Why do you need two separate steps for this? Why not just use one system call?
By the way, I have googled this question and found similar questions, but none of the answers were satisfactory. For example, one of them said that accept() creates the socket, which makes no sense, since I know that the socket is created by socket().
The listen() function basically sets a flag in the internal socket structure marking the socket as a passive listening socket, one that you can call accept() on. It opens the bound port so the socket can then start receiving connections from clients.
The accept() function asks a listening socket to accept the next incoming connection and return a socket descriptor for that connection. So, in a sense, accept() does create a socket, just not the one you use to listen() for incoming connections on.
It is all part of the historic setup. listen() prepares the socket for the next accept() call. listen() also lets you set up the backlog: the number of connections that will be accepted by the system and then made to wait until your program can actually accept() them. Everything that arrives after the backlog is full will be rejected by the system right away. listen() never blocks, while accept() will block (unless the socket is in non-blocking mode) until the next connection comes along. Obviously, this didn't have to be two separate functions; it is conceivable that a single accept() function could do everything listen() does.
The above two answers clearly state the difference between accept and listen. To answer your other question - why we need two separate functions?
One use case: if you only want to test whether a port is still available and accessible, you can do so by just listening on the port and then closing it without accepting any connections.
For example, https://github.com/coolaj86/golang-test-port uses the listen call to test a port's availability.
listen() takes a backlog parameter which specifies the maximum number of queued connections and should be at least 0. Increase its value if the server receives many connection requests simultaneously.
accept() waits for incoming connections. When a client connects, it returns a new socket object representing the connection.
Another imperative thing to note is that accept() creates a new socket object which will be used to communicate with the client. It is different from the listening socket that server uses to accept new connections.
Also note that TCP port numbers are 16 bits wide, so a single IP address offers at most 65535 distinct ports for connections.

Graceful Shutdown Server Socket in Linux

I want to be able to stop listening on a server socket in linux and ensure that all connections that are open from a client's point of view are correctly handled and not abruptly closed (ie: receive ECONNRESET).
ie:
sock = create_socket();
listen(sock, non_zero_backlog);
graceful_close(sock);
I thought calling close() and handling the already-accept()'ed sockets would be enough, but there can be connections open in the kernel backlog which will be abruptly closed if you call close() on the server socket.
The only working way to do that (that I have found) is to:
1. Prevent accept() from adding more clients.
2. Keep a list of the open sockets somewhere and wait until they are all properly closed, which means:
   - using shutdown() to tell the client that you will no longer work on that socket;
   - calling read() for a while to make sure that everything the client has sent in the meantime has been pulled;
   - then using close() to free each client socket.
THEN, you can safely close() the listening socket.
You can (and should) use a timeout to make sure that idle connections will not last forever.
You are looking at a limitation of the TCP socket API. You can either treat ECONNRESET as the socket version of EOF, or implement a higher-level protocol over TCP which informs the client of an impending disconnection.
However, if you attempt the latter alternative, be aware of the intractable Two Armies Problem which makes graceful shutdown impossible in the general case; this is part of the motivation for the TCP connection reset mechanism as it stands. Even if you could write graceful_close() in a way that worked most of the time, you'd probably still have to deal with ECONNRESET unless the server process can wait forever to receive a graceful_close_ack from the client.
