Why doesn't a TCP keep-alive packet trigger an I/O event? Is it because it has no payload, or because its sequence number is one less than the connection's sequence number?

I want my application layer to be notified when my server receives a keep-alive packet. I am wondering why a keep-alive packet doesn't trigger an I/O event. Is it because the TCP keep-alive packet carries no data, or because its sequence number is one less than the connection's sequence number?
I ran a test in which my client sent keep-alive packets. My server uses epoll, but it never got triggered.
I am also wondering: if I pad the keep-alive packet with one byte of data/payload, will my application be notified, i.e. will epoll report an I/O event?

You should not be surprised by that. For example, you are not handed RST packets either; you only see their effect as an error on the socket.
Those are transport-level messaging details. At the application level, TCP gives you a stream of bytes, independent of such low-level details. If you want application-level heartbeats, you should implement them in your application-level protocol.
Your latest edit seems to stem from some confusion. You can't add data to keep-alive packets, for two reasons:
First, they are generated by the TCP stack in the kernel, and the application has no control over them (beyond the timeouts).
More importantly, if by some (dark) magic you managed to interfere with the TCP stack (say, by patching your kernel :) and started putting data into them, they would stop being keep-alive packets and become ordinary data segments carrying data. Then, of course, your receiver would be notified of the data, which would become part of the message stream.
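To illustrate the "beyond the timeouts" point: on Linux an application can enable keep-alive and tune its timers per socket, but that is roughly the extent of its control. A minimal sketch, assuming fd is an already-connected TCP socket:

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Enable keep-alive on a connected TCP socket and tune its timers.
 * Returns 0 on success, -1 on the first failing setsockopt(). */
static int enable_keepalive(int fd, int idle_s, int intvl_s, int cnt)
{
    int on = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) < 0)
        return -1;
    /* Linux-specific knobs: idle time before the first probe, interval
     * between probes, and number of unanswered probes before the kernel
     * declares the connection dead. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,  &idle_s,  sizeof idle_s)  < 0 ||
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof intvl_s) < 0 ||
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,   &cnt,     sizeof cnt)     < 0)
        return -1;
    return 0;
}

Even with this configured, epoll reports nothing while the probes are being answered; only when they ultimately fail does the socket become ready with an error (e.g. ETIMEDOUT on the next read).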

Related

Problem receiving multicast traffic from several groups on one socket

I am working on an application in C that listens to several multicast groups on one single socket. I disable the socket option IP_MULTICAST_ALL.
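For context, a minimal sketch of that kind of setup, assuming IPv4, a single UDP socket bound to the shared port, and hypothetical group addresses:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* One UDP socket that receives only the groups explicitly joined on it. */
static int open_multicast_socket(const char **groups, size_t ngroups, uint16_t port)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    int on = 1, off = 0;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on);
    /* Without this, the socket would also get traffic for groups joined
     * by other sockets on the same host. */
    setsockopt(fd, IPPROTO_IP, IP_MULTICAST_ALL, &off, sizeof off);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    bind(fd, (struct sockaddr *)&addr, sizeof addr);

    for (size_t i = 0; i < ngroups; i++) {
        struct ip_mreq mreq;
        memset(&mreq, 0, sizeof mreq);
        inet_pton(AF_INET, groups[i], &mreq.imr_multiaddr);
        mreq.imr_interface.s_addr = htonl(INADDR_ANY);
        setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof mreq);
    }
    return fd;
}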
The socket receives traffic from 20 different multicast groups, and this traffic arrives at the socket in bursts. One of those channels publishes only one message per second, and no problem has been observed there.
I also have a reliable protocol (RUDP) on top of these multicast feeds. If a listener misses a message, the protocol tries to recover it by talking to the source, and the retransmission is then performed over the same channel as usual.
The problem appears when message bursts arrive at the socket and the RUDP protocol forces the retransmission of those messages. The messages arrive without problems, but once the bursty groups stop transmitting new data because they have nothing more to send, sometimes (it is quite easy to reproduce) the socket does not read the pending incoming messages from those groups as long as periodic messages keep arriving from a different group (the one with tiny, periodic traffic).
The situation up to this point: there are many incoming messages, sent earlier, still pending to be read by the application (no more data is being sent on those groups), while periodic messages keep arriving from the other group, which sends a few messages at regular intervals.
What I see is that the application reads one message from the periodic group and then a batch of messages from the other (bursty) groups. The socket is configured as non-blocking, and I get EAGAIN every time such a batch has been read; there is no more data to read until the socket receives a new message from the periodic group, at which point that message is read along with another batch of the pending messages from the other groups (the application reads from one single socket). I made sure the other groups produce no more data, because I stopped the processes that send to them; all the pending messages from those groups had already been sent.
The most surprising fact is that if I prevent the process that writes to the periodic group from sending more messages, the listening socket magically receives all the pending traffic from the groups that published a burst of messages earlier. It is as if the traffic from the periodic group somehow stalls the processing of the traffic from the groups that publish no new data, even though the buffers are full of it.
At first I thought it was related to IGMP or to the polling mechanism (my application can do either busy waiting or blocking waiting). The blocking variant uses the same non-blocking socket, but when errno is set to EAGAIN the application waits in a poll for new messages. I see the same behavior in both cases.
I don't think it is IGMP, because IGMP snooping is off in the switches and because I can reproduce the same behavior using the loopback interface of a single machine for all the communication between these processes.
I can also reproduce this behavior using kernel-bypass technologies (not using the kernel API for networking), so it does not seem related to the TCP/IP stack. With kernel bypass the model is the same: one message interface that receives all the traffic from all the groups. In that scenario all the processes use this mechanism to communicate; it is not a mix of kernel sockets and kernel bypass. The model is homogeneous.
How can it be that I only receive batches of messages (but not all of them) while I am receiving live traffic from several groups, yet I receive all the pending traffic as soon as I stop the periodic traffic arriving from the other multicast group? That periodic group sends only one message per second, and the bursty groups publish nothing more because all their messages have already been published.
Does anyone have an idea of what I should check next?
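For reference, a sketch of the read pattern described above: a non-blocking socket drained until EAGAIN, with a poll()-based blocking wait; handle_message() is a hypothetical application callback.

#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

static void handle_message(const char *msg, size_t len);   /* hypothetical */

/* Drain the socket until the kernel reports EAGAIN, then block in poll()
 * until more data arrives. */
static void read_loop(int fd)
{
    char buf[65536];
    for (;;) {
        ssize_t n = recv(fd, buf, sizeof buf, 0);
        if (n >= 0) {
            handle_message(buf, (size_t)n);
            continue;                          /* keep draining the receive queue */
        }
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            break;                             /* real error */
        struct pollfd p = { .fd = fd, .events = POLLIN };
        poll(&p, 1, -1);                       /* wait for the next datagram */
    }
}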

Synchronizing between UDP and TCP

I'm currently implementing a daemon that acts as two servers. One server receives logs via UDP from a collection of producers. The second server broadcasts every log received from a producer to the consumers currently connected via TCP.
These are two separate sockets. My current (pretty basic) implementation is to select() on both sockets and handle each read event accordingly, so my code is basically (note: this is pseudo code):
for (;;) {
    FD_ZERO(&readfds)
    FD_SET(consumers_server, &readfds)    /* TCP listening socket */
    FD_SET(producers_server, &readfds)    /* UDP socket */
    select(maxfd + 1, &readfds, NULL, NULL, NULL)
    if FD_ISSET(consumers_server, &readfds):
        add the new client to the consumers array
    if FD_ISSET(producers_server, &readfds):
        read the log and broadcast it to every consumer in the array
}
This works just fine; the problem occurs when this code is put under stress. When multiple producers are sending logs (UDP), the real bottleneck is the consumers, which are TCP. Sending a log to the consumers can block, which I can't afford.
I've tried using non-blocking sockets and select()ing on the consumers' write fds; the problem is that this means saving the unsent logs in a buffer until they can be sent. That results in very inelegant, bulky code, and the system is also low on resources (mainly RAM).
I'm running on a Linux distro.
An alternative approach to synchronizing these UDP and TCP connections would be welcome.
This is doomed to failure. Sooner or later you will be unable to send to the TCP consumer. Whether that manifests itself as blocking or EAGAIN/EWOULDBLOCK isn't really relevant to the underlying problem, which is that the producer is overrunning the consumer. You have to decide what to do about that. You can have a certain amount of internal buffering but at some point you will have to stop reading from the UDP producers. At that point, UDP datagrams will be dropped and your system will lose data, and of course it is liable to lose data anyway by virtue of using UDP.
Don't do this. Use TCP for the producers, or else just accept the data loss and use blocking mode. Non-blocking mode only moves the problem slightly and complicates your code.
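If you do keep a certain amount of internal buffering before giving up, one way to bound the memory cost is a fixed-size per-consumer queue that drops the oldest data when it overflows. A rough sketch (the names and the 64 KiB cap are illustrative, not from the answer above):

#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#define QUEUE_CAP (64 * 1024)            /* illustrative per-consumer cap */

struct consumer {
    int    fd;                           /* non-blocking TCP socket */
    char   queue[QUEUE_CAP];             /* bytes the kernel hasn't accepted yet */
    size_t len;
};

/* Try to hand a log to the kernel; queue whatever it refuses.  When the cap
 * is exceeded, the oldest queued bytes are dropped -- one way of deciding
 * what to do about a consumer that can't keep up. */
static void push_log(struct consumer *c, const char *log, size_t n)
{
    if (n > QUEUE_CAP)
        return;                          /* single log larger than the cap: drop it */
    if (c->len == 0) {                   /* nothing queued: try a direct send */
        ssize_t sent = send(c->fd, log, n, MSG_NOSIGNAL);
        if (sent == (ssize_t)n)
            return;
        if (sent < 0) {
            if (errno != EAGAIN && errno != EWOULDBLOCK)
                return;                  /* broken consumer: close it elsewhere */
            sent = 0;
        }
        log += sent;
        n   -= (size_t)sent;
    }
    if (c->len + n > QUEUE_CAP) {        /* over the cap: drop the oldest data */
        size_t drop = c->len + n - QUEUE_CAP;
        memmove(c->queue, c->queue + drop, c->len - drop);
        c->len -= drop;
    }
    memcpy(c->queue + c->len, log, n);
    c->len += n;
    /* Flush c->queue with the same logic when select() reports fd writable. */
}

The trade-off remains the one described above: once the cap is hit, data is lost; the cap only makes that decision explicit.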

Reliable check if a TCP packet has been delivered [duplicate]

When I send()/write() a message over a TCP stream, how can I find out whether those bytes were successfully delivered?
The receiver acknowledges receiving the bytes via TCP, so the sender's TCP stack should know.
But when I send() some bytes, send() returns immediately, even if the packet could not (yet) be delivered. I tested that on Linux 2.6.30 by running netcat under strace and pulling out my network cable before sending some bytes.
I am developing an application where it is very important to know whether a message was delivered, but re-implementing TCP features ("ack for message #123") feels awkward; there must be a better way.
The sending TCP does know when the data gets acknowledged by the other end, but the only reason it does this is so that it knows when it can discard the data (because someone else is now responsible for getting it to the application at the other side).
It doesn't typically provide this information to the sending application, because (despite appearances) it wouldn't actually mean much to the sending application. The acknowledgement doesn't mean that the receiving application has got the data and done something sensible with it - all it means is that the sending TCP no longer has to worry about it. The data could still be in transit - within an intermediate proxy server, for example, or within the receiving TCP stack.
"Data successfully received" is really an application-level concept - what it means varies depending on the application (for example, for many applications it would only make sense to consider the data "received" once it has been synced to disk on the receiving side). So that means you have to implement it yourself, because as the application developer, you're really the only one in a position to know how to do it sensibly for your application.
Having the receiver send back an ack is the best way, even if it "feels awkward". Remember that IP might break your data into multiple packets and reassemble them, and this can happen more than once along the path if routers in the way have different MTUs, so your notion of "a packet" and TCP's may disagree.
It is far better to send your "packet", whether it's a string, a serialized object, or binary data, have the receiver do whatever checks it needs to make sure it is all there, and then send back an acknowledgement.
The TCP protocol tries very hard to make sure your data arrives. If there is a network problem, it will retransmit the data a few times. That means anything you send is buffered and there is no timely way to make sure it has arrived (there will be a timeout 2 minutes later if the network is down).
If you need a fast feedback, use the UDP protocol. It doesn't use any of the TCP overhead but you must handle all problems yourself.
Even if it got as far as the receiving TCP layer, there's no guarantee that it didn't sit in the application's buffer and the app then crashed before it could process it. Use an acknowledgement; that's what everything else does (e.g. SMTP).
The application layer has no control over notifications at lower layers (such as the transport layer) unless they are specifically provided; this is by design. If you want to know what TCP is doing at the per-packet level, you need to work at the layer TCP operates at; this means handling TCP headers and ACK data.
Any protocol you end up using to carry your payload can, however, also be used to pass messages back and forth by way of that payload. So if you feel awkward using the bits of a TCP header for this, simply set it up in your application. For instance:
A: Send 450 Bytes
B: Recv 450 Bytes
B: Send 'B received 450 Bytes'
A: Recv 'B received 450 Bytes'
A: Continue
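A minimal sketch of that exchange over an ordinary TCP connection, using a 4-byte length prefix and a one-byte ack; the helper names are illustrative, not a standard API.

#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Read/write exactly n bytes, looping over short transfers. */
static int readn(int fd, void *buf, size_t n)
{
    char *p = buf;
    while (n > 0) {
        ssize_t r = read(fd, p, n);
        if (r <= 0) return -1;           /* error or peer closed */
        p += r; n -= (size_t)r;
    }
    return 0;
}
static int writen(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    while (n > 0) {
        ssize_t w = write(fd, p, n);
        if (w <= 0) return -1;
        p += w; n -= (size_t)w;
    }
    return 0;
}

/* A: send a length-prefixed message and wait for B's one-byte ack. */
static int send_with_ack(int fd, const void *msg, uint32_t len)
{
    uint32_t hdr = htonl(len);
    char ack;
    if (writen(fd, &hdr, sizeof hdr) < 0 || writen(fd, msg, len) < 0)
        return -1;
    return (readn(fd, &ack, 1) == 0 && ack == 1) ? 0 : -1;
}

/* B: receive one message and acknowledge it only after handling it. */
static int recv_and_ack(int fd, char *buf, uint32_t bufsize)
{
    uint32_t hdr, len;
    char ack = 1;
    if (readn(fd, &hdr, sizeof hdr) < 0) return -1;
    len = ntohl(hdr);
    if (len > bufsize || readn(fd, buf, len) < 0) return -1;
    /* ...process/persist buf here, then: */
    return writen(fd, &ack, 1) == 0 ? 0 : -1;
}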
This sounds like something SCTP could help with; I think it supports what you want. The alternative seems to be to switch to UDP, and if you're switching protocols anyway…

How does a non-blocking TCP socket notify the application of packets that fail to get sent?

I'm working with non-blocking C TCP sockets on a Linux system. I've read that in non-blocking mode the send() call returns the number of bytes sent immediately if there is no error. I'm guessing this return value does not actually mean that the data has been delivered to the destination, but rather that it has been handed to kernel memory, which takes care of sending it further.
If that is the case, how would my application know which packets have really been sent out by the kernel to the other end, assuming the network connection has problems and the kernel decides to give up only after retrying for a few minutes?
I'm asking because I would want my application to resend the failed data at a later time.
If that is the case, how would my application know which packets have really been sent out by the kernel to the other end, assuming the network connection has problems and the kernel decides to give up only after retrying for a few minutes?
Your application won't know, unless it can later contact the receiving application and ask it what data it had previously received.
Keep in mind that even with blocking I/O your application doesn't block until the data is received by the remote application -- it only blocks until there is some room in the kernel's outgoing-data buffer to hold the bytes you asked the TCP stack to send(). So even with blocking I/O you would face the same issue.
Also keep in mind that the byte arrays you pass to send() do not have a guaranteed 1-to-1 correspondence to the TCP packets that the TCP stack sends out. The TCP stack is free to pack your bytes into TCP packets any way it likes (e.g. the data from multiple send() calls can end up in a single TCP packet, or the data from a single send() call can end up in multiple TCP packets, or any other combination you can think of). Depending on network conditions, TCP stacks can and do pack things various different ways, their only promise is that the bytes will be received in FIFO order (if they get received at all).
Anyway, the answer to your question is: you can't know, unless you later ask the receiving program about what it got (or didn't get).
TCP takes care of retransmission internally; the application doesn't need any special handling for it. If you wish to confirm that the data was received by the other end of the TCP stack, you can set the send socket buffer to zero (setsockopt(SOL_SOCKET, SO_SNDBUF)). In that case the kernel uses your application buffer to send the data, and it is released only after TCP receives the acknowledgement for that data. This way you can confirm that the data was pushed to the receiving end of the TCP stack. It does not confirm that the application has received the data; you need an application-layer acknowledgement in your protocol to confirm that the data reached the receiving application.

Recovering transmission from a lost TCP connection

I am working on a client-server application, written in C for Linux, in which I replicate data to multiple slave replicas over TCP, and I would like to know how to deal with an unexpected temporary shutdown of a replica (a crash of the Unix process, or a hardware power-off).
When I issue the write() syscall, a successful return means the data was copied to the socket buffer, but not that the receiving end got the data. If the destination is powered off and then powered up again, the data must be resent (after establishing a new TCP connection) to the replica from the point where it lost it.
Let's say I am working with large amounts of data and I don't keep the data I have already sent (i.e. for which the write() syscall returned success); I keep only the data still pending to be sent.
When the replica recovers from the unexpected shutdown and connects again, how do I get, from the kernel, the data that was written to the socket but was not yet acknowledged by the destination host?
Or, in other words, how do I recover from the loss of a TCP connection and resume the transmission between client and server from the point where it stopped?
You need to add another level of abstraction on top of TCP. After every piece of data is sent (TCP guarantees that it gets there intact and in order), have the process at the other end send its own kind of ACK in your own higher-level protocol (whatever that is: "ACK\0", "GOT\n" or anything else). On the originating side, read for this data. If it comes through without error, everything is fine. If you get an error, check its type: ECONNRESET means the remote end is dead. From this you can respond accordingly: wait until you can reconnect, and repeat the data send all over again.
There is no way to do what you want through the standard API.
A solution could be to have your client periodically send back a running total of bytes received and verified as written to disk, and to keep a buffer of sent-but-not-acknowledged data on the server. When the client reconnects, it sends its last good count, and the server knows where to start retransmitting.
TCP takes care of the sequence numbers it needs internally; you can't make much use of those at the application level.
You need some sequence control at the application level.
In your case, you could assign a number to each block of data you send. The destination needs to keep persistent track of the last block number it has received. On startup after an unexpected shutdown, the destination communicates back the last block number it processed, and you resume sending from there.
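A sketch of that scheme, assuming a fixed 8-byte header (32-bit block number plus 32-bit length) and exactly-n I/O helpers like the readn()/writen() shown in the earlier acknowledgement sketch; all names here are illustrative:

#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>

/* Exactly-n I/O helpers, implemented as in the earlier acknowledgement sketch. */
int readn(int fd, void *buf, size_t n);
int writen(int fd, const void *buf, size_t n);

/* Wire format: 4-byte block number, 4-byte length, then `len` payload bytes
 * (integers in network byte order). */
struct block_hdr {
    uint32_t block_no;
    uint32_t len;
};

/* Master side: send one numbered block. */
static int send_block(int fd, uint32_t block_no, const void *data, uint32_t len)
{
    struct block_hdr hdr = { htonl(block_no), htonl(len) };
    if (writen(fd, &hdr, sizeof hdr) < 0)
        return -1;
    return writen(fd, data, len);
}

/* Replica side, on (re)connect: report the last block it has durably stored,
 * so the master can resume from the block after it. */
static int send_resume_point(int fd, uint32_t last_stored_block)
{
    uint32_t wire = htonl(last_stored_block);
    return writen(fd, &wire, sizeof wire);
}

/* Master side, on (re)connect: read the replica's resume point. */
static int handle_reconnect(int fd, uint32_t *resume_from)
{
    uint32_t wire;
    if (readn(fd, &wire, sizeof wire) < 0)
        return -1;
    *resume_from = ntohl(wire) + 1;
    return 0;
}

As the next answer points out, the replica must update its stored block number only after the block has actually been persisted; otherwise you can still end up with lost or duplicated blocks.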
how do I get, from the kernel, the data that has been written to the socket but wasn't yet acknowledged by the destination host?
Even if you could, it would not be enough. The destination host might very well have ACKed the data, but the ACK could be lost or never sent, while the destination application received and processed that data just fine. So if you relied on the TCP sequence numbers here, you would end up with duplicated data.
Another case: TCP sent back an ACK for the data, but the destination application crashed or was shut down just after it read that data and right before it wrote it to disk. Then you end up with lost data.
