Recovering transmission from a lost TCP connection - C

I am working on a client/server application, written in C for Linux, in which I replicate data to multiple slave replicas over TCP. I would like to know how to deal with an unexpected temporary shutdown of a replica (it may be a crash of the Unix process or a hardware power-off).
When I issue the write() syscall, a successful return means the data was copied into the socket's send buffer, but it doesn't mean the receiving end got the data. If the destination is powered off and then powered up again, the data must be resent to the replica (after establishing a new TCP connection) from the point where it was lost.
Let's say I am working with large amounts of data and I don't keep the data that I have already sent (i.e. data for which the write() syscall returned success); I keep only the pending data.
When the replica recovers from the unexpected shutdown and connects again, how do I get, from the kernel, the data that has been written to the socket but wasn't yet ACKed by the destination host?
Or, in other words: how do I recover from the loss of a TCP connection and resume transmission between client and server from the point where it stopped?

You need to add another level of abstraction on top of TCP. After every piece of data is sent (TCP guarantees that it will get there intact and in order), have the process at the other end send its own kind of ACK in your own higher-level protocol, whatever that is (be it "ACK\0", "GOT\n", or anything else). On the other side (the originator), read for this reply. If it comes through without error, everything is fine. If you get an error, check its type: ECONNRESET means that the remote end is dead. From this you can respond accordingly: wait until you can reconnect, then send the data all over again.
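A minimal sketch of that pattern, assuming a fixed 4-byte "ACK" token (the token and helper name are illustrative; a real implementation would also loop until the whole token has been read):

#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Send one piece of data, then block until the peer's application-level
 * acknowledgement arrives. */
static int send_with_ack(int sock, const void *buf, size_t len)
{
    if (send(sock, buf, len, 0) != (ssize_t)len)
        return -1;                    /* send failed or was short */

    char ack[4];
    ssize_t n = recv(sock, ack, sizeof ack, 0);
    if (n == (ssize_t)sizeof ack && memcmp(ack, "ACK", 4) == 0)
        return 0;                     /* peer confirmed receipt */

    if (n < 0 && errno == ECONNRESET)
        return -2;                    /* remote end is dead: reconnect,
                                         then resend this piece of data */
    return -1;                        /* short read, wrong token, ... */
}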

There is no way to do what you want through the standard API.
A solution could be to have your client periodically send back a running total of bytes received and verified as written to disk, and to keep a buffer of sent-but-not-acknowledged data on the server. Then, when the client reconnects, it sends its last good count, and the server knows where to start retransmitting.
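A server-side sketch of that handshake, assuming the client's first message after reconnecting is its count sent as a 64-bit big-endian integer (the framing is my choice, not part of the answer):

#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Read the client's "last good count" right after accept(). */
static int read_resume_count(int client, uint64_t *count)
{
    unsigned char raw[8];
    size_t got = 0;
    while (got < sizeof raw) {            /* recv() may return short reads */
        ssize_t n = recv(client, raw + got, sizeof raw - got, 0);
        if (n <= 0)
            return -1;
        got += (size_t)n;
    }
    *count = 0;
    for (int i = 0; i < 8; i++)
        *count = (*count << 8) | raw[i];  /* decode big-endian */
    return 0;
}

The server would then retransmit everything in its retained buffer beyond that count.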

TCP takes care of the sequence numbers it needs internally; you can't make much use of those at the application level.
You need some sequence control at the application level.
In your case, you could assign a number to each block of data you send. The destination needs to keep persistent track of the last block number it has received. On startup after an unexpected shutdown, the destination communicates back the last block number it processed, and you start sending from there.
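A sketch of what that could look like; the header layout and the file used for persistence are assumptions made for illustration:

#include <stdint.h>
#include <stdio.h>

struct block_hdr {
    uint64_t seq;   /* block number (converted to network order on the wire) */
    uint32_t len;   /* payload length in bytes */
};

/* Destination side: once a block's payload is safely on disk, persist its
 * number so the value survives an unexpected shutdown. */
static int persist_last_seq(uint64_t seq)
{
    FILE *f = fopen("last_block.seq", "w");
    if (!f)
        return -1;
    fprintf(f, "%llu\n", (unsigned long long)seq);
    fflush(f);      /* real durability would also need fsync(fileno(f)) */
    fclose(f);
    return 0;
}

On restart, the destination reads this file and reports the number back, and the source resumes with block seq + 1.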
how do I get, from the kernel, the data that has been written to the socket but wasn't yet ACKed by the destination host?
Even if you could, it would not be enough. The destination host might very well have ACKed the data, but the ACK could have been lost or never sent, even though the destination application received and processed the data fine. If you relied on the TCP sequence numbers in this case, you would end up with duplicated data.
Another case: TCP sent back an ACK for the data, but the destination application crashed or shut down just as it read that data, right before writing it to disk. Then you end up with lost data.

Related

Minimising client processing - C socket programming

I am working on a client/server model based on Berkeley sockets and have almost finished, but I'm stuck on how to know that all of the data has been received whilst minimising the processing executed on the client side.
The client I am working with has very little memory and battery and is to be deployed in remote conditions. This means that wherever possible I am trying to avoid processing (and therefore battery drain) on the client side. The following conditions on the client are outside of my control:
The client sends its data 1056 bytes at a time until it has run out of data to send (I have no idea where the number 1056 came from, but if you think you know, I would be very interested)
The client is very unpredictable in when it will send the data (it is attached to a wild animal and sends data determined by connection strength and battery life)
The client has an unknown amount of data to send at any given time
The data is transmitted through a GSM-enabled phone tag (not sure that this is relevant, but I'm assuming that extra information could only help)
(I am emulating the data I am expecting to receive from the client through localhost; if it seems to work, I will ask the company where I am interning to invest in a static IP address to allow "real" TCP transfers, and if it doesn't, I won't. I don't think this is relevant, but again, I would rather provide too much information than too little.)
At the moment I am using a while loop and incrementing the number of bytes received in order to recv() each of the 1056-byte sections. My problem is that the server needs to receive an unknown number of these. To me, the most obvious solutions are to send the number of sections in an initial header from the client, or to mark the last section being sent in some way. However, both of these approaches would require processing on the client side. I was wondering whether there is a way to check from the server side that the client has closed its socket, or whether closing the connection from the server after a pre-determined period without data from the client would be feasible. If neither is possible, I would love to hear any other suggestions.
TLDR: What condition can I use here to minimise client-side processing?
while (!(/* client has run out of data to send */)) {
    receive1056Section();
}
Also, I know that it is bad practice to make a Stack Overflow account and immediately ask a question; I didn't know what else to do, I'm sorry. Please don't hesitate to be mean if I've missed something very obvious.
Here is a suggestion for how to do the interaction:
The client:
Client connects to the server via TCP.
Client sends chunks of data until all data has been sent. Flush the send buffer after each chunk.
When it is done, the client issues a shutdown on the socket, sleeps for a couple of seconds, and then closes the connection (sketched below).
The client then sleeps until the next transmission. If the transmission was unsuccessful, the sleep time should be shorter, to prevent unsent data from overflowing the available memory.
If the client is unable to connect for an extended period of time, you will have to discard data that doesn't fit in memory.
I am assuming that sleep reduces power consumption.
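A sketch of the end-of-transmission steps above (the two-second grace period mirrors the suggestion; the helper name is illustrative):

#include <unistd.h>
#include <sys/socket.h>

static void finish_transmission(int sock)
{
    shutdown(sock, SHUT_WR);   /* tell the server: no more data coming */
    sleep(2);                  /* brief grace period for in-flight data */
    close(sock);
}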
The server:
The server program can be single-threaded unless you need massive scalability. It listens for incoming connections on the agreed port.
Whenever a client connects, a new socket is created.
Use select() to see which sockets have data (don't forget to include the listening socket!), and non-blocking reads to read from the sockets.
When you get the appropriate result (no more data to read and the other side has shut down its side of the connection), you can close that socket.
This should work fine up to a couple of thousand simultaneous connections.
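A minimal sketch of that server loop; for brevity it uses a blocking recv() after select() rather than true non-blocking reads, and handle_data() is a placeholder:

#include <sys/types.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

void serve(int listen_fd)
{
    fd_set all;
    FD_ZERO(&all);
    FD_SET(listen_fd, &all);
    int maxfd = listen_fd;

    for (;;) {
        fd_set ready = all;             /* select() modifies its argument */
        if (select(maxfd + 1, &ready, NULL, NULL, NULL) < 0)
            continue;

        for (int fd = 0; fd <= maxfd; fd++) {
            if (!FD_ISSET(fd, &ready))
                continue;
            if (fd == listen_fd) {      /* new client connecting */
                int c = accept(listen_fd, NULL, NULL);
                if (c >= 0) {
                    FD_SET(c, &all);
                    if (c > maxfd)
                        maxfd = c;
                }
            } else {
                char buf[1056];
                ssize_t n = recv(fd, buf, sizeof buf, 0);
                if (n > 0) {
                    /* handle_data(fd, buf, n); */
                } else {                /* 0: client shut down; <0: error */
                    close(fd);
                    FD_CLR(fd, &all);
                }
            }
        }
    }
}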
Example that handles many of the difficulties of implementing a server

Reliable check if a TCP packet has been delivered [duplicate]

When I send()/write() a message over a TCP stream, how can I find out whether those bytes were successfully delivered?
The receiver acknowledges receiving the bytes via TCP, so the sender's TCP stack should know.
But when I send() some bytes, send() returns immediately, even if the packet could not (yet) be delivered. I tested this on Linux 2.6.30, using strace on netcat, after pulling my network cable out before sending some bytes.
I am developing an application where it is very important to know whether a message was delivered, but reimplementing TCP features ("ack for message #123") feels awkward; there must be a better way.
The sending TCP does know when the data gets acknowledged by the other end, but the only reason it does this is so that it knows when it can discard the data (because someone else is now responsible for getting it to the application at the other side).
It doesn't typically provide this information to the sending application, because (despite appearances) it wouldn't actually mean much to the sending application. The acknowledgement doesn't mean that the receiving application has got the data and done something sensible with it - all it means is that the sending TCP no longer has to worry about it. The data could still be in transit - within an intermediate proxy server, for example, or within the receiving TCP stack.
"Data successfully received" is really an application-level concept - what it means varies depending on the application (for example, for many applications it would only make sense to consider the data "received" once it has been synced to disk on the receiving side). So that means you have to implement it yourself, because as the application developer, you're really the only one in a position to know how to do it sensibly for your application.
Having the receiver send back an ACK is the best way, even if it "feels awkward". Remember that IP might break your data into multiple packets and reassemble them, and that this can happen multiple times along a transmission if routers on the way have different MTUs, so your concept of "a packet" and TCP's may disagree.
Far better to send your "packet", whether it's a string, a serialized object, or binary data, have the receiver do whatever checks it needs to confirm the data is there, and then send back an acknowledgement.
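One common way to define "your packet" on top of the byte stream is a length prefix; a sketch, where the 4-byte big-endian prefix is a conventional choice rather than anything this answer prescribes:

#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>

/* Frame one application "packet" so the receiver can reassemble it no
 * matter how TCP splits the stream. A real implementation must loop on
 * partial send()s. */
static int send_frame(int sock, const void *payload, uint32_t len)
{
    uint32_t be_len = htonl(len);
    if (send(sock, &be_len, sizeof be_len, 0) != (ssize_t)sizeof be_len)
        return -1;
    if (send(sock, payload, len, 0) != (ssize_t)len)
        return -1;
    return 0;       /* now wait for the receiver's acknowledgement */
}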
The TCP protocol tries very hard to make sure your data arrives. If there is a network problem, it will retransmit the data a few times. That means anything you send is buffered, and there is no timely way to be sure it has arrived (if the network is down, a timeout will occur, but on the order of two minutes later).
If you need fast feedback, use the UDP protocol. It avoids TCP's overhead, but then you must handle all delivery problems yourself.
Even if the data got as far as the remote TCP layer, there's no guarantee it didn't sit in the application's buffer while the app crashed before it could process it. Use an acknowledgement; that's what everything else does (e.g. SMTP).
The application layer has no visibility into notifications at lower layers (such as the transport layer) unless they are specifically provided; this is by design. If you want to know what TCP is doing at a per-packet level, you need to work at the layer where TCP operates; that means handling TCP headers and ACK data.
Any protocol you end up using to carry your payload can, however, be used to pass messages back and forth by way of that payload. So if you feel awkward using the bits of a TCP header to do this, simply set it up in your application. For instance:
A: Send 450 Bytes
B: Recv 450 Bytes
B: Send 'B received 450 Bytes'
A: Recv 'B received 450 Bytes'
A: Continue
This sounds like SCTP could be something to look at; I think it should support what you want. The alternative seems to be to switch to UDP, and if you're switching protocols anyway…

How does a non-blocking TCP socket notify the application of packets that fail to get sent?

I'm working with non-blocking TCP sockets in C on a Linux system. I've read that in non-blocking mode, send() returns "bytes sent" immediately if there is no error. I'm guessing this return value does not actually mean that the data has been delivered to the destination, but rather that the data has been passed to kernel memory for it to handle and send further.
If that is the case, how would my application know which packets were really sent out by the kernel to the other end, assuming the network connection had some problems and the kernel decided to give up only after several retries, spanning a few minutes?
I'm asking because I would want my application to resend those failed packets again at a later time.
If that is the case, how would my application know which packets were really sent out by the kernel to the other end, assuming the network connection had some problems and the kernel decided to give up only after several retries, spanning a few minutes?
Your application won't know, unless it is able to recontact the receiving application and ask it what data it had previously received.
Keep in mind that even with blocking I/O your application doesn't block until the data is received by the remote application -- it only blocks until there is some room in the kernel's outgoing-data buffer to hold the bytes you asked the TCP stack to send(). So even with blocking I/O you would face the same issue.
Also keep in mind that the byte arrays you pass to send() do not have a guaranteed 1-to-1 correspondence to the TCP packets that the TCP stack sends out. The TCP stack is free to pack your bytes into TCP packets any way it likes (e.g. the data from multiple send() calls can end up in a single TCP packet, or the data from a single send() call can end up in multiple TCP packets, or any other combination you can think of). Depending on network conditions, TCP stacks can and do pack things various different ways, their only promise is that the bytes will be received in FIFO order (if they get received at all).
Anyway, the answer to your question is: you can't know, unless you later ask the receiving program about what it got (or didn't get).
TCP takes care of retrying internally; the application doesn't need any special handling for it. If you wish to confirm that a packet reached the other end of the TCP stack, you can set the send socket buffer (setsockopt(SOL_SOCKET, SO_SNDBUF)) to zero. In that case the kernel uses your application's buffer to send the data, and the buffer is only released after TCP receives the acknowledgement for that data. This way you can confirm that the data was pushed to the receiving end of the TCP stack. It doesn't confirm that the application has received it; you need an application-layer acknowledgement in your protocol to confirm that the data reached the receiving application.

How to catch disconnect event?

I'm a real noob in C. I'm trying to develop my own lock server in C (just for practice), and I have a question... Let's imagine that we have a server written in C, and a remote host connected to this server via a socket. When the connection is initiated, my server allocates some memory and keeps a pointer to it. Is it possible to free that memory when the remote host disconnects? How can I catch the disconnect event?
Thank you
In a real-world I/O scenario, you cannot truly detect the disconnection the moment it happens. Instead you must either:
Receive a packet that indicates the other side intends to disconnect, or
Attempt to transmit a packet, which will fail to be delivered due to changes in connectivity during the "silent" period between communications.
This means that systems which "must" ensure connectivity typically send and receive periodic "dummy" messages, to detect the loss of the connection sooner than it would be detected by "regular" traffic alone.
Depending on your application, the overhead of the keep-alive messages may not be worth the effort.
The "connection" you have on your side of the network is really just a bunch of data structures that allow you to transmit and receive. The lower "IP" layer of "TCP/IP" is connectionless; that means you will not know whether your simulated "connection" is still available until you attempt to use it (or until you receive a packet telling you explicitly that the other end will not process any more data).
The read(2) system call will return zero when the other end of the socket closes the connection.
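A minimal sketch of that, assuming the per-connection state is a single heap allocation made at connect time (the names are illustrative):

#include <stdlib.h>
#include <unistd.h>

void connection_loop(int fd, void *per_connection_state)
{
    char buf[4096];
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            /* process n bytes */
        } else {
            break;   /* 0: peer closed cleanly; -1: error, treat as gone */
        }
    }
    close(fd);
    free(per_connection_state);   /* release the memory from the question */
}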

Is there a way to tell the OS to drop any buffered outgoing TCP data?

I've got an amusing/annoying situation in my TCP-based client software, and it goes like this:
my client process is running on a laptop, and it is connected via TCP to my server process (which runs on another machine across the LAN)
irresponsible user pulls the Ethernet cable out of his laptop while the client is transmitting TCP data
client process continues calling send() with some additional TCP data, filling up the OS's SO_SNDBUF buffer, until...
the client process is notified (via Mac OS X's SCDynamicStoreCallback feature) that the Ethernet interface is down, and responds by calling close() on its TCP socket
two to five seconds pass...
user plugs the Ethernet cable back in
the client process is notified that the interface is back up, and reconnects automatically to the server
That all works pretty well... except that there is often also an unwanted step 8, which is this:
8. The TCP socket that was close()'d in step 4 recovers(!) and sends the remainder of the data that was in the kernel's outbound-data buffer for that socket. This happens because the OS tries to deliver all of the outbound TCP data before freeing the socket. That's usually a good thing, but in this case I'd prefer that it didn't happen.
So, the question is: is there a way to tell the TCP layer to drop the data in its SO_SNDBUF? If so, I could make that call just before close()-ing the dead socket in step 4, and I wouldn't have to worry about zombie data from the old socket arriving at the server after the old socket was abandoned.
This (data received from two different TCP connections not being ordered with respect to each other) is a fundamental property of TCP/IP. You shouldn't try to work around it by clearing the send buffer; that is fragile. Instead, you should fix the application to handle this eventuality at the application layer.
For example, if you receive a new connection on the server side from a client that you believe is already connected, you should probably drop the existing connection.
Additionally, step 4 of the process is a bit dubious. Really, you should just wait until TCP reports an error (or an application-level timeout occurs on the connection); as you've noticed, TCP will recover if the physical disconnection is only a brief one.
If you want to discard any data that is awaiting transmission when you close the socket at step 4 then simply set SO_LINGER to 0 before closing.
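A sketch of that call; SO_LINGER with l_onoff set and a zero timeout makes close() discard any unsent data and reset the connection instead of draining the buffer:

#include <unistd.h>
#include <sys/socket.h>

static void abortive_close(int sock)
{
    struct linger lg = { .l_onoff = 1, .l_linger = 0 };
    setsockopt(sock, SOL_SOCKET, SO_LINGER, &lg, sizeof lg);
    close(sock);   /* sends RST; no zombie data will arrive later */
}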
