Who takes care of TCP message order - C

I am listening on a TCP socket in Linux with recv or recvfrom.
Who takes care that I get the TCP packets in the right order?
Is it the kernel that takes care of this, so that if packet 1 arrives after packet 2 the kernel will drop both / hold packet 2 until packet 1 arrives?
Or do I need to take care of the ordering of TCP packets in user space?

On Linux-based systems, in any normal scenario, this is handled by the kernel.
You can find the source code in net/ipv4/tcp_input.c; here's an abridged version of tcp_data_queue():
/* Queue data for delivery to the user.
 * Packets in sequence go to the receive queue.
 * Out of sequence packets to the out_of_order_queue.
 */
if (TCP_SKB_CB(skb)->seq == tp->rcv_nxt) {
        /* packet is in order */
}

if (!after(TCP_SKB_CB(skb)->end_seq, tp->rcv_nxt)) {
        /* already received */
}

if (!before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt + tcp_receive_window(tp)))
        goto out_of_window;

if (before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt)) {
        /* Partial packet, seq < rcv_next < end_seq */
}

/* append to out-of-order queue */
tcp_data_queue_ofo(sk, skb);
along with the actual implementation that does the reordering, tcp_data_queue_ofo(), which keeps the out-of-order queue in an RB tree.

To quote Wikipedia,
At the lower levels of the protocol stack, due to network congestion, traffic load balancing, or unpredictable network behaviour, IP packets may be lost, duplicated, or delivered out of order. TCP detects these problems, requests re-transmission of lost data, rearranges out-of-order data, [...]
It's an inherent property of the protocol that you will receive the data in the correct order (or not at all).
Note that TCP is a stream protocol, so you can't even detect packet boundaries. A call to recv/recvfrom may return a portion of a packet, and it may return bytes that came from more than one packet.
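As an illustration of the stream property, here is a minimal sketch (my own example, not from the original answer) of the usual pattern for reading an exact number of bytes from a TCP socket: you loop on recv() until everything has been collected, because a single call may return less than you asked for.

#include <sys/types.h>
#include <sys/socket.h>
#include <stddef.h>

/* Read exactly `len` bytes from a connected TCP socket.
 * Returns 0 on success, -1 on error or if the peer closed the connection early. */
static int recv_exact(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n <= 0)          /* -1 on error, 0 on orderly shutdown */
            return -1;
        p   += n;            /* recv() may return fewer bytes than requested */
        len -= n;
    }
    return 0;
}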

Related

Does the kernel do de-fragmentation?

When I receive a packet with recv on Linux, does the kernel do the de-fragmentation so that I get defragmented data? Or should I take care of it in user space?
When receiving UDP data via a socket of type SOCK_DGRAM, you'll only receive the complete datagram (assuming your input buffer is large enough to receive it).
Any IP fragmentation is handled transparently to userspace.
If you're using raw sockets, then you need to handle defragmentation yourself.
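To make that concrete, here is a minimal sketch (the function name and the assumption of an already-bound SOCK_DGRAM socket fd are mine): each recvfrom() call returns one complete, already reassembled datagram, provided the buffer is large enough.

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <stdio.h>

void read_one_datagram(int fd)   /* fd: a bound SOCK_DGRAM (UDP) socket */
{
    char buf[65535];             /* large enough for any UDP payload */
    struct sockaddr_in src;
    socklen_t srclen = sizeof(src);

    /* One call == one whole datagram, even if it was fragmented on the wire.
     * If buf were too small, the excess bytes would be silently discarded. */
    ssize_t n = recvfrom(fd, buf, sizeof(buf), 0,
                         (struct sockaddr *)&src, &srclen);
    if (n >= 0)
        printf("received a %zd-byte datagram\n", n);
}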
IP fragmentation takes place at the IP level of the protocol stack. The TCP and UDP layers already receive reassembled packets, so TCP receives full segments (even though TCP tries never to send segments that would result in fragmented packets) and UDP receives full datagrams, preserving datagram boundaries.
Only if you use raw sockets will you receive raw packets (meaning fragments of IP packets). TCP already gives you a reliable data stream (preserving data sequence, and handling duplicates and retransmission of missing/faulty packets), and UDP gives you the datagram as it was sent, or nothing at all.
Of course, all these modules are in the kernel, so it's the kernel that reassembles packets. It's worth noting that, once a packet is fragmented into pieces, it is not reassembled until it reaches its destination, so the IP layer at the destination is responsible for reassembling the fragments.

Can LoadRunner Receive Data By UDP Packets?

We want to receive packets from UDP sockets. The UDP packets have variable length and we don't know how long they really are until we receive them (parts of them, to be exact; the length is written in the sixth byte).
We tried the function lrs_set_receive_option with MarkerEnd, only to find it was no help for this issue. The reason we want to receive packet by packet is that we need to respond to some packets by sending back user-defined UDP packets.
Does anybody know how to achieve that?
UPDATE
The LR version seems to be v10 or v11.
We need to respond to an incoming UDP packet by sending back a UDP packet immediately.
The UDP packet may look like this:
| orc code | packet length | Real DATA |
The issue is that we can't get LoadRunner to return data for each packet; sometimes it returns many packets in one buffer, and sometimes it waits until a timeout even though there is already an incoming packet in the socket buffer. In the C programming world, calling recvfrom() on a UDP socket returns exactly one UDP packet per call, which is what we really want.
If you need raw socket support to intercept at the packet level then you are likely going to have to jump to a DLL virtual user in Visual Studio with the raw socket support.
As to your question on UDP support: yes, a Winsock user supports both core transport types, UDP and TCP (TCP being the more common variant, as it is connection-oriented). However, packet examination happens at layer 3 of the OSI model, for the carrier protocol IP. The ACK should come before you receive the data flow for your use in the script. At the TCP and UDP level you are looking at assembled data flows in data.ws.
Now, you are likely receiving a warning about a receive-buffer size mismatch against the recorded size, which is what is pushing you down this path. There is an easy way to address this: if you construct your send buffer using the lrs_set_send_buffer() function, then whatever comes back will be taken as correct, ignoring the previously recorded buffer size, so you don't have to wait for a match or a timeout before continuing.

For how long do the recv() functions buffer in UDP?

My program contains a thread that waits for UDP messages and, when a message is received, runs some functions before it goes back to listening. I am worried about missing a message, so my question is something along the lines of: for how long is it possible to read a message after it has been sent? For example, if a message was sent while the thread was running those functions, could it still be read if the functions are short enough? I am looking for guidelines here, but an answer in microseconds would also be appreciated.
When your computer receives a UDP packet (and there is at least one program listening on the UDP port specified in that packet), the network stack will add that packet's data to a fixed-size buffer that is associated with that socket and kept in the kernel's memory space. The packet's data will stay in that buffer until your program calls recv() to retrieve it.
The gotcha is that if your computer receives the UDP packet and there isn't enough free space left inside the buffer to fit the new UDP packet's data, the computer will simply throw the UDP packet away -- it's allowed to do that, since UDP doesn't make any guarantees that a packet will arrive.
So the amount of time your program has to call recv() before packets start getting thrown away will depend on the size of the socket's in-kernel packet buffer, the size of the packets, and the rate at which the packets are being received.
Note that you can ask the kernel to make its receive-buffer size larger by calling something like this:
int bufSize = 64*1024;  // Dear kernel: I'd like the receive buffer to be 64kB please!
setsockopt(mySock, SOL_SOCKET, SO_RCVBUF, &bufSize, sizeof(bufSize));  // SO_RCVBUF takes an int
… and that might help you avoid dropped packets. If that's not sufficient, you'll need to either make sure your program goes back to recv() quickly, or possibly do your network I/O in a separate thread that doesn't get held off by processing.
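If you want to check what the kernel actually granted, you can read the value back with getsockopt(). A minimal sketch (the helper name print_rcvbuf and the already-created socket are my own illustration); note that Linux returns roughly double the requested value because the kernel reserves extra room for bookkeeping overhead (see socket(7)):

#include <stdio.h>
#include <sys/socket.h>

/* Print the receive-buffer size the kernel actually granted for this socket. */
void print_rcvbuf(int sock)
{
    int effective = 0;
    socklen_t optlen = sizeof(effective);
    if (getsockopt(sock, SOL_SOCKET, SO_RCVBUF, &effective, &optlen) == 0)
        printf("SO_RCVBUF is now %d bytes\n", effective);  /* ~2x the requested value on Linux */
}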

C udp recvfrom client side

I'm writing a UDP server/client application in which the server sends data and the client receives it. When a packet is lost, the client should send a NACK to the server. I set the socket to O_NONBLOCK so that I can notice when the client does not receive a packet:
if ((bytes = recvfrom(....)) != -1) {
    do something
} else {
    send nack
}
My problem is that if the server has not started sending packets yet, the client behaves as if a packet were lost and starts sending NACKs to the server (recvfrom fails when no data is available). I would like some advice on how I can distinguish between the two cases: the server has not started sending packets yet, versus the server is sending but a packet was really lost.
You are using UDP. For this protocol it's perfectly OK to throw away packets if there is a need to do so, so it's not reliable in the sense of "what is sent will arrive". What you have to do in your client is check whether all the packets you need have arrived, and if not, politely ask your server to resend the ones you did not receive. Implementing this is not that easy.
If you have to use UDP to transfer a largish chunk of data, then design a small application-level protocol that would handle possible packet loss and re-ordering (that's part of what TCP does for you). I would go with something like this (a header sketch follows the list):
Datagrams smaller than the MTU (allowing for the IP and UDP headers), say 1024 bytes, to avoid IP fragmentation.
Fixed-length header for each datagram that includes data length and a sequence number, so you can stitch data back together, and detect missed, duplicate, and re-ordered parts.
Acknowledgements from the receiving side of what has been successfully received and put together.
Timeout and retransmission on the sending side when these acks don't come within appropriate time.
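A minimal sketch of such a fixed-length header (field names and sizes are my own illustration, not part of the answer above):

#include <stdint.h>

#define CHUNK_PAYLOAD_MAX 1016   /* keeps the whole datagram at 1024 bytes */

/* Application-level header prepended to every datagram.
 * Multi-byte fields should go through htonl()/htons() before sending. */
struct chunk_header {
    uint32_t seq;    /* sequence number: reorder, and detect missing/duplicate parts */
    uint16_t len;    /* number of payload bytes that follow this header */
    uint16_t flags;  /* e.g. a "last chunk" marker, reserved bits */
};

/* One datagram on the wire: header followed by up to CHUNK_PAYLOAD_MAX payload bytes. */
struct chunk {
    struct chunk_header hdr;
    uint8_t payload[CHUNK_PAYLOAD_MAX];
};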
You have a loop calling either select() or poll() to determine whether data has arrived; if so, you then call recvfrom() to read the data.
You can set a timeout for receiving data as follows:
ssize_t
recv_timeout(int fd, void *buf, size_t len, int flags)
{
    ssize_t ret;
    struct timeval tv;
    fd_set rset;

    /* initialize the descriptor set and add our socket to it */
    FD_ZERO(&rset);
    FD_SET(fd, &rset);

    /* timeout, e.g. 60 seconds (config.idletimeout comes from the surrounding program) */
    tv.tv_sec = config.idletimeout;
    tv.tv_usec = 0;

    /* wait until fd becomes readable or the timeout expires */
    ret = select(fd + 1, &rset, NULL, NULL, &tv);
    if (ret == 0) {
        log_message(LOG_INFO, "Idle Timeout (after select)");
        return 0;
    } else if (ret < 0) {
        log_message(LOG_ERR,
                    "recv_timeout: select() error \"%s\". Closing connection (fd:%d)",
                    strerror(errno), fd);
        return -1;
    }

    ret = recvfrom(fd, buf, len, flags, NULL, NULL);
    return ret;
}
Normally, when select() reports that data is ready, a subsequent read() should return up to the maximum number of bytes you've specified, possibly including zero bytes (this is actually a valid thing to happen!), but it should never block after readiness has been reported. However, the Linux select(2) man page warns:
Under Linux, select() may report a socket file descriptor as "ready for reading", while nevertheless a subsequent read blocks. This could for example happen when data has arrived but upon examination has wrong checksum and is discarded. There may be other circumstances in which a file descriptor is spuriously reported as ready. Thus it may be safer to use O_NONBLOCK on sockets that should not block.
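For completeness, here is a minimal sketch of putting a socket into non-blocking mode with fcntl() (the helper name is mine); with O_NONBLOCK set, a spuriously reported readiness makes recvfrom() fail with EAGAIN/EWOULDBLOCK instead of blocking:

#include <fcntl.h>

/* Switch an existing socket (or any fd) to non-blocking mode. Returns 0 on success. */
int make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}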
Look up the sliding window protocol.
The idea is that you divide your payload into packets that fit in a physical udp packet, then number them. You can visualize the buffers as a ring of slots, numbered sequentially in some fashion, e.g. clockwise.
Then you start sending from 12 o'clock, moving on to 1, 2, 3, ... In the process, you may (or may not) receive ACK packets from the server that contain the slot number of a packet you sent.
If you receive a ACK, then you can remove that packet from the ring, and place the next unsent packet there which is not already in the ring.
If you receive a NAK for a packet you sent, it means that packet was received by the server with data corruption, and you then resend it from the ring slot reported in the NAK.
This protocol class allows transmission over channels with data or packet loss (like RS232, UDP, etc). If your underlying data transmission protocol does not provide checksums, then you need to add a checksum for each ring packet you send, so the server can check its integrity, and report back to you.
ACK and NAK packets from the server can also be lost. To handle this, you need to associate a timer with each ring slot, and if you receive neither an ACK nor a NAK for a slot before the timer reaches a timeout limit you set, then you retransmit the packet and reset the timer.
Finally, to detect fatal connection loss (i.e. server went down), you can establish a maximum timeout value for all your packets in the ring. To evaluate this, you just count how many consecutive timeouts you have for single slots. If this value exceeds the maximum you have set, then you can consider the connection lost.
Obviously, this protocol class requires dataset assembly on both sides based on packet numbers, since packets may not be sent or received in sequence. The 'ring' helps with this, since packets are removed only after successful transmission, and on the receiving side, only when the previous packet number has already been removed and appended to the growing dataset. However, this is only one strategy, there are others.
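A minimal sketch of what one ring slot might look like in C (names and field choices are my own illustration of the scheme described above, not a complete implementation):

#include <stdint.h>
#include <stdbool.h>
#include <time.h>

#define RING_SLOTS   16      /* size of the "clock face" */
#define SLOT_PAYLOAD 1024    /* payload kept small enough to avoid IP fragmentation */

/* One slot of the transmission ring. */
struct ring_slot {
    uint32_t seq;                    /* packet number carried in the header */
    uint16_t len;                    /* payload bytes actually used */
    uint8_t  payload[SLOT_PAYLOAD];
    bool     in_use;                 /* slot holds a not-yet-acknowledged packet */
    time_t   sent_at;                /* time of last (re)transmission, for timeouts */
    int      timeouts;               /* consecutive timeouts, to detect a dead link */
};

struct ring {
    struct ring_slot slots[RING_SLOTS];
    uint32_t next_seq;               /* next packet number to assign */
};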
Hope this helps.

Can a Non Blocking UDP write return with fewer bytes than requested?

I have an application that sends data point-to-point from a sender to a receiver over a link that can operate in simplex (one-way transmission) or duplex (two-way) mode. In simplex mode the application sends data using UDP, and in duplex it uses TCP. Since a write on a TCP socket may block, we are using non-blocking I/O (ioctl with FIONBIO; O_NONBLOCK and fcntl are not supported on this distribution) and the select() system call to determine when data can be written. Non-blocking I/O is used so that we can abort a send early after a timeout should network conditions deteriorate. I'd like to use the same basic code to do the sending but switch between TCP/UDP at a higher level of abstraction. This works great for TCP.
However, I am concerned about how non-blocking I/O works for a UDP socket. I may be reading the man pages incorrectly, but since write() may return indicating fewer bytes sent than requested, does that mean the client will receive fewer bytes in its datagram? To send a given buffer of data, multiple writes may then be needed, which may well happen since I am using non-blocking I/O. I am concerned that this will translate into multiple UDP datagrams received by the client.
I am fairly new to socket programming, so please forgive me if I have some misconceptions here. Thank you.
Assuming a correct (not broken) UDP implementation, then each send/sendmsg/sendto will correspond to exactly one whole datagram sent and each recv/recvmsg/recvfrom will correspond to exactly one whole datagram received.
If a UDP message cannot be transmitted in its entirety, you should receive an EMSGSIZE error. A sent message might still fail due to size at some point in the network, in which case it will simply not arrive. But it will not be delivered in pieces (unless the IP stack is severely buggy).
A good rule of thumb is to keep your UDP payload size to at most 1400 bytes. That is very approximate and leaves a lot of room for various forms of tunneling so as to avoid fragmentation.
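As an illustration of the all-or-nothing behaviour, here is a minimal sketch (the helper name is mine; fd is assumed to be a connected, possibly non-blocking UDP socket): a datagram either goes out whole or the call fails, so there is no partial-write loop as there is with TCP.

#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Send one datagram on a connected UDP socket. A datagram is never sent partially. */
int send_datagram(int fd, const void *buf, size_t len)
{
    ssize_t n = send(fd, buf, len, 0);
    if (n < 0) {
        if (errno == EMSGSIZE)          /* datagram too large to be sent at all */
            fprintf(stderr, "datagram of %zu bytes is too big\n", len);
        else if (errno == EAGAIN || errno == EWOULDBLOCK)
            fprintf(stderr, "socket buffers are full, try again later\n");
        return -1;
    }
    /* For SOCK_DGRAM, a successful send() always covers the whole datagram. */
    return 0;
}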
