TCP send/receive packet timeout in Linux

I'm using a Raspberry Pi B+ and building a TCP server/client connection in C.
I have a few questions about the client side.
How long does Linux queue packets for the client? Once a packet has been received by Linux, what happens if the client is not ready to process it, or if the select/epoll loop sleeps for 1 minute? If there is a timeout, is there a way to adjust it from code or a script?
What does Linux do internally when it receives a packet? (i.e., Ethernet port -> kernel -> RAM -> application?)

The Raspberry Pi (with Linux), like any other known TCP/IP implementation (Linux or not), works roughly like this:
The kernel has a receive buffer in which it stores all the data from the other side that has not yet been read by the user process. The kernel has normally acknowledged all of this data to the other side (the acknowledgement states the last byte received and stored in that buffer). The sender also has a buffer, where it stores all sent data that has not yet been acknowledged by the receiver (this data must be resent in case of a timeout), plus data that does not yet fit in the window admitted by the receiver. If this buffer fills, the sender is blocked, or a partial write is reported to the user process (depending on socket options).
The receive buffer lets the kernel keep accepting data while the process is not reading. If the user process does not read it, the data remains there until the process makes a read() system call.
The amount of free space left in the receive buffer (known as the window size) is sent to the other end with each acknowledgement, so the sender knows the maximum amount of data it is authorized to send. When the buffer is full, the window size drops to zero and the receiver announces that it cannot receive more data. This allows a slow receiver to stop a fast sender from filling the network with data that cannot be delivered.
From then on (the zero-window situation), the sender periodically sends a segment with no data at all (or with just one byte of data, depending on the implementation) to check whether the window has opened enough to let it send more data. The acknowledgement of that probe allows communication to resume.
Everything is stopped now, but no timeout happens. Both TCP endpoints continue talking this way until some window is available (meaning the receiver has read() part of the buffer).
This situation can be maintained for days without any problem: the reading process is busy and cannot read the data, and the writing process is blocked in the write() call until the kernel on the sending side has buffer space to accommodate the data to be written.
When the reading process reads the data:
An ACK for the last byte received is sent, announcing a new window size larger than zero (by the amount the reader freed when reading).
The sender receives this acknowledgement and sends that amount of data from its buffer. If this frees enough room for the data the writer has asked to write, the writer is woken up and allowed to continue sending.
Again, timeouts normally only occur if data is lost in transit.
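The stall-and-resume behaviour described above can be observed on the loopback interface. The following is only a sketch, assuming Linux and Python's socket module (which wraps the same system calls as C); the helper names connected_pair, fill_until_stalled, drain and demo are made up for this illustration. The sender writes without blocking until the kernel refuses more data, nothing times out while it waits, and as soon as the receiver reads, the sender becomes writable again.

```python
import select
import socket

CHUNK = b"x" * 65536

def connected_pair():
    """Return (sender, receiver): two ends of a TCP connection over loopback."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))          # port 0: let the kernel pick a free port
    listener.listen(1)
    sender = socket.create_connection(listener.getsockname())
    receiver, _ = listener.accept()
    listener.close()
    return sender, receiver

def fill_until_stalled(sender, quiet=0.2):
    """Send until the socket stays unwritable for `quiet` seconds; return bytes queued."""
    sender.setblocking(False)
    queued = 0
    while select.select([], [sender], [], quiet)[1]:
        try:
            queued += sender.send(CHUNK)      # may be a partial write
        except BlockingIOError:
            pass                              # buffers full right now; re-check with select
    return queued

def drain(receiver, nbytes):
    """Read nbytes on the receiving side, which reopens the advertised window."""
    receiver.settimeout(10)
    remaining = nbytes
    while remaining > 0:
        data = receiver.recv(min(remaining, len(CHUNK)))
        if not data:
            break
        remaining -= len(data)
    return nbytes - remaining

def demo():
    sender, receiver = connected_pair()
    try:
        queued = fill_until_stalled(sender)
        stalled = not select.select([], [sender], [], 0)[1]
        drained = drain(receiver, queued)
        writable_again = bool(select.select([], [sender], [], 5)[1])
        return queued, stalled, drained, writable_again
    finally:
        sender.close()
        receiver.close()
```

Calling demo() returns how many bytes the two kernels were willing to hold between them; the exact number depends on the system's buffer auto-tuning, so it is not shown here.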
But...
If you are behind a NAT device, the connection's state can be lost if the connection sits idle (the NAT device maintains a cache of the local address/port pairs that have connections to the outside). On the next data transfer that comes from the remote side, the NAT device may or may not send a RST, because the packet refers to a connection it no longer knows about (the cache entry expired).
If the packet comes from the internal device instead, the connection can be re-cached and continue. What happens depends on who sends a packet first.
Nothing specifies that an implementation must provide a timeout for data to be sent, but some implementations do, aborting the connection with an error if some data stays unacknowledged for a long time. TCP itself specifies no timeout in this case, so it is the process's responsibility to cope with it.
TCP is specified in RFC 793 (since superseded by RFC 9293), and all implementations must obey it if they want communication to succeed. You can read it if you like; I think you'll get a better explanation than the one I give you here.
So, to answer your first question: the kernel will store the data in its buffer for as long as your process takes to read it. By default, you just call write() on a socket, and the kernel keeps trying until you (the user) decide to stop the process and abort the operation. In that case the kernel will probably try to close the connection or reset it. The resources are tied to the life of the process, so as long as the process is alive and holding the connection, the kernel will wait for it.
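As for adjusting any of this from code, here is a minimal sketch, assuming Linux and Python's socket module (the same options are set with setsockopt() in C). The function name tune_socket and the default values are illustrative only. SO_RCVBUF requests a receive buffer size, SO_KEEPALIVE makes the kernel probe an idle connection (which also helps keep a NAT entry alive), and the Linux-specific TCP_USER_TIMEOUT limits how long sent data may stay unacknowledged before the kernel gives up on the connection.

```python
import socket

def tune_socket(sock, rcvbuf=65536, keepidle_s=60, user_timeout_ms=30000):
    """Apply buffer and timeout options; return the receive buffer size the kernel reports."""
    # Request a receive buffer size; the kernel may round or clamp the value.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf)
    # Send keepalive probes on an otherwise idle connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The next two options are not available on every platform, hence the guards.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds of idleness before the first keepalive probe.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, keepidle_s)
    if hasattr(socket, "TCP_USER_TIMEOUT"):
        # Milliseconds that sent data may remain unacknowledged.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_USER_TIMEOUT, user_timeout_ms)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
```

The value returned can differ from the one requested, so it is worth reading it back rather than assuming the request was honoured as-is.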

Related

Can an ethernet buffer fill up and not allow another process to recv() ethernet packets?

Say you have a process that receives a large file from a server.
If you do not perform a recv() call, does the data stay in a buffer of your Ethernet controller forever?
If another process needs to receive data and the buffer is full because of the first process, does it need to wait until the first process performs a recv() or the buffer times out?
If you have multiple processes sending and receiving data, do they have to wait until the buffer is empty? Or can it be multiplexed and tracked at the driver level or in some part of the socket library?

If you do not perform a recv() call does it stay on a buffer of your ethernet controller forever?
No, data never stays in the Ethernet controller's buffer for very long; the kernel reads the data out of the Ethernet controller's buffer and into your socket's buffer (in the computer's regular RAM) as quickly as it can. If your socket's buffer is full, the incoming data is discarded (with TCP, the advertised receive window normally makes the sender pause before that point).
If another process needs to receive data and the buffer is full from another process does it need to wait until the other process performs a recv() or the buffer times out?
Each socket has its own separate buffer in the computer's main RAM, and each process has its own socket(s), so processes do not have to wait for each other's buffers to empty.
If you have multiple process sending and receiving data does it have to wait until the buffer is empty?
See the answer to question 2, as it answers this question also.

This is a bit of a perfectly-spherical-chicken-in-a-vacuum type of answer, but your question is very broad and has a lot of what-ifs depending on the NIC, the OS, and many other things.
But let's assume you are on a modernish full-blown OS with a modernish Ethernet controller.
No. That is all handled by the kernel and the protocol stack. The kernel can't let the buffer on the network controller fill up while it's waiting for you; otherwise it would block other processes from accessing the network. So it buffers the data until you are ready. For some protocols there are mechanisms by which one device can tell the other not to send any more data (e.g., the TCP receive window size: once the sender has sent that amount of data, it stops until the receiver acknowledges it somehow).
It's basically the same answer as above: the OS handles the details. From your point of view, your recv() will not block any other process's ability to recv().
This is more interesting. Modern NICs are queue based: you have some number n of transmit/receive queues, and in most cases filters can be attached to them. This allows the NIC to do a lot of work that would normally have to be done by the OS (that's called offloading). But back to the point: with these NICs, you can have multiple I/O streams without multiplexing. Generally, though, especially on consumer-grade NICs, the number of queues will be pretty low, usually 4, so there will be some multiplexing involved.

Can one send be broken up into multiple recvs?

I'm learning about C socket programming and I came across this piece of code in an online tutorial
Server.c:
//some server code up here
recv(sock_fd, buf, 2048, 0);
//some server code below
Client.c:
//some client code up here
send(cl_sock_fd, buf, 2048, 0);
//some client code below
Will the server receive all 2048 bytes in a single recv call, or can the send be broken up into multiple recv calls?

TCP is a streaming protocol with no message or packet boundaries. A single send might need multiple recv calls, or multiple send calls could be combined into a single recv call.
You need to call recv in a loop until all data have been received.
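A minimal sketch of such a loop, written in Python for brevity (the C version has the same shape: keep calling recv() and advance through the buffer by the returned count). The helper name recv_exact is made up for this example.

```python
import socket

def recv_exact(sock, nbytes):
    """Call recv() repeatedly until exactly nbytes have arrived."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)          # may return fewer bytes than requested
        if not chunk:
            # An empty result means the peer closed before sending everything.
            raise ConnectionError(f"peer closed with {remaining} bytes still expected")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

With this helper, the server side of the question would call recv_exact(sock_fd, 2048) instead of a single recv, and it would not matter how the 2048 bytes were split up on the way.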

Technically, the data is ultimately handled by the operating system, which programs the physical network interface to send it across a wire, over the air, or however else is applicable. Since TCP/IP doesn't define particulars such as how many packets, and of what size, should carry your data, the operating system is free to decide, which means your 2048 bytes of data may be sent in fragments over a period of time.
Practically, this means that by calling send you may merely be causing your 2048 bytes of data to be buffered for sending, much like an e-mail in a queue, except that your 2048 bytes aren't even a single piece of anything to the system that sends it -- it's just 2048 more bytes to chop into packets the network will accept, marked with a destination address and port, among other things. The job of TCP is only to make sure they're the same bytes when they arrive, in the same order relative to each other and to other data sent through the connection.
The important thing at the receiving end is that, again, the arriving data is merely queued, and no information is retained about how it was partitioned when it was handed to send. Everything that was ever sent through the connection is now either part of a consumable stream or has already been consumed and removed from the stream.
For a TCP connection a fitting analogy would be the connection holding an open water keg, which also has a spout (tap) at the bottom. The sender can pour water into the keg (as much as it can contain, anyway) and the receiver can open the spout to drain the water from the keg into, say, a cup (which is an analogy to a buffer in an application that reads from a TCP socket). Both sender and receiver can be doing their thing at the same time, or either may be doing so alone. The sender will have to wait (the send call will block) if the keg is full, and the receiver will have to wait (the recv call will block) if the keg is empty.
Another, shorter analogy is that sender and receiver each sit at their own end of an opaque pipe, with the former pushing stuff in one end and the latter pulling it out of the other end.

How does packetbuf work in ContikiOS if there's an incoming packet during a pending send?

I have trouble understanding how to write asynchronous sending/receiving in Contiki. Suppose I am using the xmac layer, or any layer that is based on packetbuf. I am sending a message, or a list of packets. I start sending a message using void(*send)(mac_callback_t sent_callback, void *ptr). This takes the message that is in the global buffer packetbuf and tries to send it. Meanwhile, while the send is pending (for example waiting for the other device to wake up or acknowledge the transmission), the device receives a packet from a third device.
Will this packet overwrite the packet waiting to be sent that is in the packetbuf? How should I handle this?
I thought that maybe you can't try to send a packet and listen for incoming packets at the same time, but then there is an obvious deadlock: two devices sending messages to each other at the same time.
I am porting a higher-level routing layer to Contiki. This is the second OS I am porting it to, but the previous OS didn't use a single buffer for both incoming and outgoing packets.

The packetbuf is a space for short-term data and metadata storage. It's not meant to be used by code that blocks longer than a few timer ticks. If you can't send the packet immediately from your send() function, do not block there! You need to schedule a timer callback in the future and return MAC_TX_DEFERRED. To store packet data in between invocations of send(), use the queuebuf module.
The fact that there is a single packetbuf for both reception and transmission is not a problem, since the radio is a half-duplex communication medium anyway: it cannot send and receive data at the same time. Similarly, a packet that is received is first stored in the radio chip's memory; it does not overwrite the packetbuf. Contiki interrupt handlers likewise never write to packetbuf directly. They simply wake up the rx handler process, which takes the packet from the radio chip and puts it in the packetbuf. Since one process cannot unexpectedly interrupt another, this operation is safe: a process wanting to send a packet cannot interrupt the process reading another packet.
To summarize, the recommendations are:
Do not block in Contiki process context (this is a generic rule when programming this OS, not specific to this question).
Do not expect the contents of packetbuf to be preserved across yielding execution in Contiki process context. Serialize to a queuebuf if you need this.
Do not access the packetbuf from interrupt context.

Determine if peer has closed reading end of socket

I have a socket programming situation where the client shuts down the writing end of the socket to let the server know input is finished (via receiving EOF), but keeps the reading end open to read back a result (one line of text). It would be useful for the server to know that the client has successfully read the result and closed the socket (or at least shut down the reading end). Is there a good way to check/wait for such status?

No. All you can know is whether your sends succeeded, and some of them will succeed even after the peer read shutdown, because of TCP buffering.
This is poor design. If the server needs to know that the client received the data, the client needs to acknowledge it, which means it can't shutdown its write end. The client should:
send an in-band termination message, as data.
read and acknowledge all further responses until end of stream occurs.
close the socket.
The server should detect the in-band termination message and:
stop reading requests from the socket
send all outstanding responses and read the acknowledgements
close the socket.
OR, if the objective is only to ensure that client and server end at the same time, each end should shutdown its socket for output and then read input until end of stream occurs, then close the socket. That way the final closes will occur more or less simultaneously on both ends.
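The second variant (shut down output, read until end of stream, then close) is short enough to sketch. This is only an illustration in Python; the function name graceful_close is made up, and whatever arrives after the shutdown is simply counted and discarded here, where a real protocol would process it.

```python
import socket

def graceful_close(sock):
    """Shut down our sending side, read until the peer does the same, then close.

    Returns the number of bytes that arrived after our shutdown.
    """
    sock.shutdown(socket.SHUT_WR)     # the peer will see end of stream on its reads
    discarded = 0
    while True:
        data = sock.recv(4096)
        if not data:                  # empty result: the peer has shut down or closed
            break
        discarded += len(data)
    sock.close()
    return discarded
```

If both ends run this, neither close happens until each side has seen the other's end of stream, which is what makes the two closes roughly simultaneous.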

getsockopt with TCP_INFO seems the most obvious choice, but it's not cross-platform.
Here's an example for Linux:
import pprint
import socket
import struct

# Field names of the Linux struct tcp_info, in declaration order.
TCP_INFO_FIELDS = """
    state ca_state retransmits probes backoff options snd_rcv_wscale
    rto ato snd_mss rcv_mss unacked sacked lost retrans fackets
    last_data_sent last_ack_sent last_data_recv last_ack_recv
    pmtu rcv_ssthresh rtt rttvar snd_ssthresh snd_cwnd advmss reordering
    rcv_rtt rcv_space
    total_retrans
    pacing_rate max_pacing_rate bytes_acked bytes_received segs_out segs_in
    notsent_bytes min_rtt data_segs_in data_segs_out""".split()

# 7 bytes, 24 u32, 4 u64, 6 u32; native alignment pads this to 160 bytes.
# "Q" replaces the original "L" so the 64-bit counters also unpack on 32-bit systems.
TCP_INFO_FORMAT = "7B24I4Q6I"

def tcp_info(s):
    size = struct.calcsize(TCP_INFO_FORMAT)
    raw = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, size)
    rv = dict(zip(TCP_INFO_FIELDS, struct.unpack(TCP_INFO_FORMAT, raw)))
    wscale = rv.pop("snd_rcv_wscale")
    # Bit-field layout is up to the compiler; this assumes the usual
    # little-endian GCC layout, where the first field (snd_wscale) is the low nibble.
    rv["snd_wscale"] = wscale & 0xf
    rv["rcv_wscale"] = wscale >> 4
    return rv

for i in range(100):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect(("localhost", 7878))
    s.recv(10)
    pprint.pprint(tcp_info(s))
    s.close()
I doubt a true cross-platform alternative exists.
Fundamentally there are quite a few states:
you wrote data to socket, but it was not sent yet
data was sent, but not received
data was sent and lost (relies on timer)
data was received, but not acknowledged yet
acknowledgement not received yet
acknowledgement lost (relies on timer)
data was received by remote host but not read out by application
data was read out by application, but socket still alive
data was read out, and app crashed
data was read out, and app closed the socket
data was read out, and app called shutdown(WR) (almost same as closed)
FIN was not sent by remote yet
FIN was sent by remote but not received yet
FIN was sent and got lost
FIN received by your end
Obviously your OS can distinguish quite a few of these states, but not all of them. I can't think of an API that would be this verbose...
Some systems allow you to query remaining send buffer space. Perhaps if you did, and socket was already shut down, you'd get a neat error?
The good news is that just because a socket is shut down doesn't mean you can't interrogate it. I can get all of TCP_INFO after shutdown, with state=7 (closed). In some cases it reports state=8 (close wait).
http://lxr.free-electrons.com/source/net/ipv4/tcp.c#L1961 has all the gory details of Linux TCP state machine.

TL;DR:
Don't rely on the socket state for this; it can cut you in many error cases. You need to bake the acknowledgement/receipt facility into your communications protocol. First character on each line used for status/ack works really well for text-based protocols.
On many, but not all, Unix-like/POSIXy systems, one can use the TIOCOUTQ (also SIOCOUTQ) ioctl to determine how much data is left in the outgoing buffer.
For TCP sockets, even if the other end has shut down its write side (and therefore will send no more data to this end), all transmissions are acknowledged. The data in the outgoing buffer is only removed when the acknowledgement from the recipient kernel is received. Thus, when there is no more data in the outgoing buffer, we know that the kernel at the other end has received the data.
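A sketch of that query, assuming Linux, where Python exposes the request code as termios.TIOCOUTQ (SIOCOUTQ has the same value there). The function name unacked_bytes is made up for this example.

```python
import fcntl
import socket
import struct
import termios

def unacked_bytes(sock):
    """Return how many bytes are still in the socket's outgoing queue (Linux)."""
    # The ioctl fills in a C int; passing bytes makes fcntl.ioctl return the result buffer.
    raw = fcntl.ioctl(sock.fileno(), termios.TIOCOUTQ, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]
```

A sender could poll this after its final write and treat a value of 0 as "the peer's kernel has everything".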
Unfortunately, this does not mean that the application has received and processed the data. This same limitation applies to all methods that rely on socket state; this is also the reason why fundamentally, the acknowledgement of receipt/acceptance of the final status line must come from the other application, and cannot be automatically detected.
This, in turn, means that neither end can shut down its sending side before the very final receipt/acknowledgement message. You cannot rely on TCP's -- or any other protocol's -- automatic socket state management. You must bake the critical receipts/acknowledgements into the stream protocol itself.
In OP's case, the stream protocol seems to be simple line-based text. This is quite useful and easy to parse. One robust way to "extend" such a protocol is to reserve the first character of each line for the status code (or alternatively, reserve certain one-character lines as acknowledgements).
For large in-flight binary protocols (i.e., protocols where the sender and receiver are not really in sync), it is useful to label each data frame with an increasing (cyclic) integer, and have the other end respond, occasionally, with an update to let the sender know which frames have been completely processed, and which ones received, and whether additional frames should arrive soon/not-very-soon. This is very useful for network-based appliances that consume a lot of data, with the data provider wishing to be kept updated on the progress and desired data rate (think 3D printers, CNC machines, and so on, where the contents of the data changes the maximum acceptable data rate dynamically).
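For the simple line-based case, the idea of reserving the first character of each line can be sketched as follows. Everything protocol-specific here is hypothetical: the status character "R" for a result line, the one-character acknowledgement line "A", and the function names are invented for the illustration and are not part of OP's protocol.

```python
import socket

RESULT = b"R"     # hypothetical status character marking a result line
ACK = b"A\n"      # hypothetical one-character acknowledgement line

def read_line(sock):
    """Read up to and including a newline (one byte at a time, for clarity)."""
    line = bytearray()
    while not line.endswith(b"\n"):
        byte = sock.recv(1)
        if not byte:
            raise ConnectionError("peer closed before end of line")
        line += byte
    return bytes(line)

def send_result_and_wait(sock, text, timeout=5.0):
    """Server side: send the result line, then wait for the application-level ack."""
    sock.settimeout(timeout)
    sock.sendall(RESULT + text.encode() + b"\n")
    return read_line(sock) == ACK

def receive_result_and_ack(sock):
    """Client side: read the result line and confirm that it was processed."""
    line = read_line(sock)
    if not line.startswith(RESULT):
        raise ValueError("unexpected status character")
    sock.sendall(ACK)
    return line[1:-1].decode()
```

The server learns that the client application has the result only when send_result_and_wait returns True; a timeout or a closed connection leaves that unknown, which is the limitation described above.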

Okay so I recall pulling my hair out trying to solve this very problem back in the late 90's. I finally found an obscure doc that stated that a read call to a disconnected socket will return a 0. I use this fact to this day.
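A small sketch of that check, assuming Linux (MSG_DONTWAIT is not available everywhere) and using a made-up helper name, peer_has_closed. It peeks instead of reading, so pending data is not consumed. Note that a zero-length read only shows that the peer closed or shut down its sending side; it says nothing about whether the peer is still reading.

```python
import socket

def peer_has_closed(sock):
    """Return True if a read would report end of stream, without consuming data."""
    try:
        data = sock.recv(1, socket.MSG_PEEK | socket.MSG_DONTWAIT)
    except BlockingIOError:
        return False      # nothing to read yet, but the stream is still open
    return data == b""    # an empty result is the "read returns 0" case
```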

You're probably better off using ZeroMQ. That will send a whole message, or no message at all. If you set its send buffer length to 1 (the shortest it will go) you can test to see if the send buffer is full. If not, the message was successfully transferred, probably. ZeroMQ is also really nice if you have an unreliable or intermittent network connection as part of your system.
That's still not entirely satisfactory. You're probably even better off implementing your own send acknowledge mechanism on top of ZeroMQ. That way you have absolute proof that a message was received. You don't have proof that a message was not received (something can go wrong between emitting and receiving the ack, and you cannot solve the Two Generals Problem). But that's the best that can be achieved. What you'll have done then is implement a Communicating Sequential Processes architecture on top of ZeroMQ's Actor Model, which is itself implemented on top of TCP streams. Ultimately it's a bit slower, but your application has more certainty of knowing what's gone on.

What will be the socket behavior if program is stopped at break point?

Here is the scenario:
I have a select-based socket server on Linux which processes a single packet at a time. Let's say several packets are arriving at high speed and I hit a breakpoint while my process is processing the current packet. What will happen to the packets that keep being sent to my server process while it is stopped at the breakpoint? Will they be dropped, or will the OS buffer them and deliver them to my process when it resumes?
I have some idea, but I want to confirm it with the gurus here, and I will probably learn more about socket behavior.
Any help would be appreciated.

The incoming packets will be queued up by the OS kernel until its buffer gets full. Any more packets will simply be dropped, but depending on the type of connection the kernel may signal the other end to stop sending (TCP aka SOCK_STREAM should, UDP aka SOCK_DGRAM probably won't). The sender should be prepared to handle this scenario.
How big the buffer is depends on the system; you may be able to query the size and/or change it (how this is done is usually OS dependent).
It doesn't matter whether your process is halted for debugging, just slow, busy waiting for other events, or being swapped in; if it does not read the data from the socket, the data will be queued.
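The UDP case is easy to try out. The sketch below assumes Linux and the loopback interface; the function name udp_overflow_demo and the sizes are arbitrary choices for the illustration. It sends far more datagrams than a deliberately small receive buffer can hold while the receiver is not reading (as if it were stopped at a breakpoint), then counts how many are still there afterwards.

```python
import socket

def udp_overflow_demo(datagrams=2000, size=1024, rcvbuf=4096):
    """Send datagrams to a socket that is not reading; return (sent, received)."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Ask for a small receive buffer so that it overflows quickly.
    receiver.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf)
    receiver.bind(("127.0.0.1", 0))
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    payload = b"x" * size
    for _ in range(datagrams):
        # The sender gets no error when the receiver's buffer is already full.
        sender.sendto(payload, receiver.getsockname())
    # Only now does the "stopped" receiver start reading.
    receiver.setblocking(False)
    received = 0
    try:
        while True:
            receiver.recv(size)
            received += 1
    except BlockingIOError:
        pass                      # queue is empty: everything else was dropped
    sender.close()
    receiver.close()
    return datagrams, received
```

Only the datagrams that fit in the buffer survive; the rest are dropped silently. With a TCP socket the sender would instead be slowed down by the receive window.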
