Reordering TCP packets based on SEQ field

I processed a number of pcap files, extracted the individual packets, and arranged them in a per-flow data structure (flow table) as a list of packets. Since packets may arrive out of order, I now need to re-order them. My main criterion is the sequence number field from the TCP header; I guess this is the only way to re-order TCP packets?
When extracting packets from the pcap files I already read the SEQ value and stored it in my per-packet structure, along with the next (expected) sequence value, computed as pkt->seq + TCP_segment_size.
Now, what would be the right approach to do this? I could sort each flow's list of packets by SEQ value, but that would be very slow.
Also: do I need to obtain the ISN (initial sequence number) before I start re-ordering? (Basically, the first packet with the SYN bit set carries the ISN in the seq field of its TCP header.) The problem is that this packet may be missing (for example, if the capture started after the handshake completed).
How is re-ordering normally done?
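One way to avoid a separate sort pass is to keep each flow's list sorted while packets are added, comparing sequence numbers with wraparound-safe arithmetic. The following is only a minimal sketch under stated assumptions: struct pkt, its fields, and the functions seq_before and flow_insert are hypothetical names, not taken from the question. Because only the relative order of two sequence numbers is compared, the sketch does not need the ISN, as long as the packets of a flow are less than 2^31 bytes apart.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical per-packet record; the field names are assumptions. */
struct pkt {
    uint32_t seq;       /* sequence number from the TCP header */
    uint32_t next_seq;  /* seq + TCP payload length */
    struct pkt *next;   /* next packet in the same flow */
};

/* Wraparound-safe "a comes before b". Valid while the two values are
 * less than 2^31 apart. Relies on the usual two's-complement
 * conversion from uint32_t to int32_t. */
int seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* Insert p into a list that is kept sorted by seq and return the
 * (possibly new) head. Packets with an equal seq keep arrival order. */
struct pkt *flow_insert(struct pkt *head, struct pkt *p)
{
    struct pkt **link = &head;

    /* Skip every node whose seq is not after p->seq. */
    while (*link != NULL && !seq_before(p->seq, (*link)->seq))
        link = &(*link)->next;

    p->next = *link;
    *link = p;
    return head;
}
```

The sketch walks from the head for simplicity. Since most packets in a capture are already in order, a variant that keeps a tail pointer and searches backwards from the end would probably make the common case O(1).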

Related

How to count bytes using send() in C, including protocol size?

I'm trying to count the total bytes sent by my program, but I can't get an accurate value.
All my functions call a single function that sends data to my server using send().
In this function, I take the return value of send() and add it to a global counter. This is working fine.
But when I compare against the 'iftop' utility (sudo iftop -f 'port 33755'), iftop reports more data than my app does, and my guess is that this is because of TCP headers/protocol data. I really don't know how to calculate this. I'm sending data using send() with variable data lengths, so I'm not sure whether it is possible to detect or calculate the TCP packet size from there. I know that each TCP packet carries a TCP header, but I'm not sure how many packets are sent.
May I assume that every call to send(), if the data length is less than 1518 (TCP packet size limit?), results in only one TCP packet, so that I need to add the TCP header length? Even if I send one byte? If so, how many extra bytes does the TCP structure add?
For information: I'm using GCC on Linux as the compiler.
Thanks!
How to count bytes using send() in C, including protocol size?
There is no reliable way to do so from within your program. You can compute a minimum total number of bytes required to transmit data of the total payload size you count, subject to a few assumptions, but you would need to monitor from the kernel side to determine the exact number of bytes.
May I assume that every call to send(), if data length is less than
1518 (TCP packet size limit?), then it's only one TCP packet and I
need to sum TCP Header length?
No, that would not be a safe assumption. The main problem is that the kernel does not necessarily map the data from each send() call onto its own sequence of packets: it may combine data from multiple send() calls into a smaller number of packets. In addition, it may use a smaller or a larger MTU than Ethernet's default of 1500 bytes, depending on various factors. Finally, the packet headers must fit within the chosen MTU, so the payload carried by one packet is smaller than the MTU.
I suspect you're making this too hard. If this is a task that has been assigned to you -- a homework problem, for example -- then my first guess would be that it is intended that you count only the total payload size, not the protocol overhead. Alternatively, if you do need to account for the overhead, then my guess would be that you are supposed to estimate, based on measured or assumed characteristics of the network. If you've set this problem for yourself, then I can only say that people generally make one of the two computations I just described, not the one you asked about.
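The "minimum total number of bytes" estimate mentioned above can be sketched roughly as follows. Everything in it is an assumption rather than something stated in the question: the function name min_wire_bytes is hypothetical, the link is taken to be Ethernet II over IPv4, and the headers are taken to carry no options. It also ignores the handshake, pure ACKs, and retransmissions, so it gives a lower bound, not what iftop will report.

```c
#include <stdint.h>

/* Assumed per-packet overheads; none of these come from the question. */
#define ETH_OVERHEAD 14u  /* Ethernet II header; no VLAN tag, FCS not counted */
#define IPV4_HDR     20u  /* IPv4 header without options */
#define TCP_HDR      20u  /* TCP header without options */

/* Lower bound on the bytes needed on the wire to carry payload_bytes of
 * application data, assuming every segment is filled up to the MSS.
 * Returns 0 if the MTU leaves no room for payload. */
uint64_t min_wire_bytes(uint64_t payload_bytes, uint32_t mtu)
{
    if (mtu <= IPV4_HDR + TCP_HDR)
        return 0;

    uint32_t mss = mtu - IPV4_HDR - TCP_HDR;              /* payload per segment */
    uint64_t segments = (payload_bytes + mss - 1) / mss;  /* ceiling division */

    return payload_bytes + segments * (ETH_OVERHEAD + IPV4_HDR + TCP_HDR);
}
```

With an MTU of 1500 the assumed MSS is 1460 bytes, so a single byte of payload costs at least 55 bytes under these assumptions. If TCP timestamps are enabled, each header is larger and the estimate would have to be adjusted.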

Accessing TCP header fields (without raw socket API)

I am writing an application that needs to get access to TCP header fields, for example, a sequence number or a TCP timestamp field.
Is it possible to get sequence numbers (or other header fields) by operating at the socket API without listening on a raw socket? (I want to avoid filtering out all the packets).
I am looking at TCP_INFO, but it exposes only limited information.
For example, after calling a recvmsg() and getting a data buffer, is it possible to know the sequence number of the segment that delivered the last byte in that received data buffer?
Thanks
You can try using libpcap to capture packets. This library lets you specify a packet filter using the same syntax as Wireshark capture filters, so you can limit the captured packets to a single connection. One downside is that you would still have to receive the packets in the normal way too, which complicates things a bit and adds performance overhead.
Update: you can also open a raw socket and attach a Berkeley Packet Filter to it using the socket option SO_ATTACH_FILTER. More details are here: https://www.kernel.org/doc/Documentation/networking/filter.txt . However, you would then have to implement the TCP part of the IP stack in your own code too.
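To illustrate limiting the capture to one connection, here is a small sketch that builds a filter expression for a single TCP connection. The helper name build_conn_filter is hypothetical, and the addresses in the usage note are made up. The resulting string is meant to be handed to libpcap's pcap_compile() and pcap_setfilter(), which are not called here so that the example has no dependencies.

```c
#include <stdio.h>
#include <stddef.h>

/* Write a capture-filter expression that matches one TCP connection in
 * both directions. Returns 0 on success, -1 if buf is too small. */
int build_conn_filter(char *buf, size_t len,
                      const char *ip_a, unsigned port_a,
                      const char *ip_b, unsigned port_b)
{
    int n = snprintf(buf, len,
                     "tcp and host %s and host %s and port %u and port %u",
                     ip_a, ip_b, port_a, port_b);

    /* snprintf returns the length it wanted to write; treat truncation
     * as an error. */
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For example, build_conn_filter(buf, sizeof buf, "10.0.0.1", 33755, "10.0.0.2", 80) produces "tcp and host 10.0.0.1 and host 10.0.0.2 and port 33755 and port 80". The expression is slightly loose, because it does not tie each port to a particular host, which is usually acceptable for isolating one connection.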

How to identify initial packet in TCP 3-way handshake?

Is it true that the Acknowledgment Number (please note I'm not talking about the ACK flag here) is set to 0 when a client initiates the 3-way TCP handshake by sending its initial packet?
I have a TCP trace file and I used the pcap library in C to print out specific information for every packet. What I noticed was that the Acknowledgment Number of the very first packet in a connection is always set to 0. Could I use that as a criterion for identifying the first packet of a TCP session?
If not that, what other criteria can I use to identify a given packet as being the first one sent by the web client? Simply looking at the SYN flag won't work since when the server responds to the initial request by the client, it will also set its SYN flag.
The sequence number will have a random value, but it is completely normal behavior for the acknowledgment number field (which you ask about) to contain 32 bits of zeroes.
This isn't to say that it can't contain anything else. You rightly distinguish the ACK flag from the acknowledgment number. The actual meaning of the flag is to signal that the value in the acknowledgment number field is valid. Since the flag is cleared on the initial SYN, no such claim is made there. As such, the field can contain absolutely anything, though again, normal behavior is zero.
As to your question about distinguishing the initial SYN from the response: the response to a properly formed initial SYN by a normal IP stack will be a SYN-ACK. So, while the SYN flag will be set, the ACK flag will also be set. To distinguish one from the other, the better practice is to look at the TCP flags (code bits) field rather than the sequence numbers, unless you are trying to do some kind of anomaly detection.
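The flag test described above might look like the sketch below. The function names are hypothetical. The flag values are the standard TCP bit positions, and the flags byte is assumed to have been read from offset 13 of the TCP header.

```c
#include <stdint.h>

#define TCP_FLAG_SYN 0x02u
#define TCP_FLAG_ACK 0x10u

/* flags is byte 13 of the TCP header.
 * True for the client's initial SYN: SYN set, ACK clear. */
int is_initial_syn(uint8_t flags)
{
    return (flags & (TCP_FLAG_SYN | TCP_FLAG_ACK)) == TCP_FLAG_SYN;
}

/* True for the server's reply: SYN and ACK both set. */
int is_syn_ack(uint8_t flags)
{
    return (flags & (TCP_FLAG_SYN | TCP_FLAG_ACK)) ==
           (TCP_FLAG_SYN | TCP_FLAG_ACK);
}
```

Masking with only SYN and ACK means other bits (for example the ECN bits that some stacks set on a SYN) do not affect the result.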

What is happening when a TCP sequence number arrives that is not what is expected?

I am writing a program that uses libpcap to capture packets and reassemble a TCP stream. My program simply monitors the traffic and so I have no control over the reception and transmittal of packets. My program disregards all non TCP/IP traffic.
I calculate the next expected sequence number from the ISN and then the successive SEQ numbers. I have it set up so that every TCP connection is uniquely identified by a tuple made up of the source IP, source port, dest IP, and dest port. Everything goes swimmingly until I receive a packet that has a sequence number different than what I am expecting. I have uploaded screen shots to help illustrate what I am describing here.
My questions are:
1. Where is the data that was in the "lost" packet?
2. How does the SEQ number order recover from this situation?
3. What can I do to handle these occurrences?
Please remember, however, that I am not writing a program that adheres to TCP. I am writing a program that passively monitors network traffic for TCP streams and attempts to save the raw data to disk, and I am confused as to why the situation described above happens and how I can program to handle it.
Thank you
Where is the data that was in the "lost" packet?
Either it got dropped somewhere along the way, or it took a different route (a detour) and will arrive later.
How does the SEQ number order recover from this situation?
The receiver notices the segment is out of sequence and doesn't send it to the application, thereby fulfilling its contract: in-order reliable byte stream. Now, what actually happens to get the missing piece is quite intricate and varies from stack to stack. In a nutshell the stack waits for the missing piece to arrive.
The receiver can throw away out-of-sequence segments, or it can queue them in a reassembly queue.
The receiver can wait for the missing segment to arrive, or it can immediately resend the ACK it already sent before. Duplicate ACKs alert the peer that something is wrong (look up Fast Retransmit).
When sending acknowledgments, TCP can inform the peer that some segments arrived successfully but out of sequence (SACK).
What can I do to handle these occurrences?
You can't do anything since you're only monitoring. You could probably get more insight into what is really happening if you also captured the response traffic.
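A passive monitor cannot repair the stream, but it can at least tell which case a captured segment falls into relative to the sequence number it expected next. This is a rough sketch with hypothetical names (seg_class, classify_segment); it looks only at where the segment starts and ignores partial overlaps.

```c
#include <stdint.h>

enum seg_class {
    SEG_IN_ORDER,  /* starts exactly at the expected sequence number */
    SEG_OLD,       /* starts before it: retransmission or duplicate */
    SEG_GAP        /* starts after it: something is missing in between */
};

/* Compare a segment's sequence number with the expected one using
 * wraparound-safe arithmetic (valid for distances below 2^31). */
enum seg_class classify_segment(uint32_t expected, uint32_t seq)
{
    int32_t diff = (int32_t)(seq - expected);

    if (diff == 0)
        return SEG_IN_ORDER;
    return diff < 0 ? SEG_OLD : SEG_GAP;
}
```

A SEG_GAP result does not say whether the missing data was lost, reordered, or simply not captured, so a monitor would probably hold such segments for a while before giving up on the hole.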
Depending on the window size of the current TCP connection, if the new packet fits within the receive window (a multi-packet buffer), it is placed in the receive queue (and reordered for in-order delivery to protocol clients).
If the sequence number is larger than the maximum for the current window, the packet gets rejected.
See also section 4.4.2 (INPUT PACKET HANDLER) in RFC 675.
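The window check described in this answer can be sketched as a single unsigned comparison. The function name seq_in_window and its parameter names are assumptions, and the sketch tests only the first byte of the segment; a real stack also takes the segment length into account.

```c
#include <stdint.h>

/* True if seq lies in [rcv_nxt, rcv_nxt + rcv_wnd), where rcv_nxt is the
 * next expected sequence number and rcv_wnd the receive window size.
 * Unsigned subtraction makes the test correct across wraparound. */
int seq_in_window(uint32_t rcv_nxt, uint32_t rcv_wnd, uint32_t seq)
{
    return (uint32_t)(seq - rcv_nxt) < rcv_wnd;
}
```

A segment that starts before rcv_nxt yields a very large unsigned difference, so it is reported as outside the window just like one that starts beyond the window's end.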

Opening win-socket (tcp) in kernel mode specifying sequence number

I'm writing a Windows driver (in C, of course, and running in kernel mode) and I'd like to open a TCP socket from the outside, specifying the sequence number that the first SYN packet should have.
I tried modifying the packet by filtering it with the Windows Filtering Platform, but of course it doesn't work, because the local stack thinks the correct number is the original one and the recipient's stack thinks the correct one is the modified one.
I'm looking for something like:
OpenSocket(..., UINT32 seqNum, UINT16 winSize)
or anything equivalent.
Is there a way to do that?
Thanks,
Marco
Seems like a strange thing to be doing, but if your filter can modify both incoming and outgoing packets then it can fix the sequence number in both directions.
Just figure out the offset from the original sequence number. Then you can add it to the sequence numbers of outgoing packets and subtract it from the acknowledgment numbers of incoming packets.
Each side of the conversation gets exactly what they expect, even though they disagree on what is expected.
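The offset arithmetic can be sketched as below. All names (seq_xlate, seq_xlate_init, xlate_out_seq, xlate_in_ack) are hypothetical, and this is not Windows Filtering Platform code: it shows only the number translation. A real filter would also have to recompute the TCP checksum and adjust any SACK blocks in incoming packets, neither of which is shown.

```c
#include <stdint.h>

/* Per-connection translation state. */
struct seq_xlate {
    uint32_t offset;  /* wanted ISN minus the ISN the local stack chose */
};

/* Record the offset when the outgoing SYN is seen. */
struct seq_xlate seq_xlate_init(uint32_t orig_isn, uint32_t wanted_isn)
{
    struct seq_xlate x = { wanted_isn - orig_isn };
    return x;
}

/* Outgoing packet: shift the local stack's sequence number. */
uint32_t xlate_out_seq(const struct seq_xlate *x, uint32_t seq)
{
    return seq + x->offset;
}

/* Incoming packet: shift the peer's acknowledgment number back. */
uint32_t xlate_in_ack(const struct seq_xlate *x, uint32_t ack)
{
    return ack - x->offset;
}
```

Because uint32_t arithmetic wraps modulo 2^32, the same two functions keep working when either sequence space wraps around.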

Resources