libpcap format - packet header - incl_len / orig_len - c

The libpcap packet header structure has 2 length fields:
typedef struct pcaprec_hdr_s {
    guint32 ts_sec;   /* timestamp seconds */
    guint32 ts_usec;  /* timestamp microseconds */
    guint32 incl_len; /* number of octets of packet saved in file */
    guint32 orig_len; /* actual length of packet */
} pcaprec_hdr_t;
incl_len: the number of bytes of packet data actually captured and saved in the file. This value should never become larger than orig_len or the snaplen value of the global header.
orig_len: the length of the packet as it appeared on the network when it was captured. If incl_len and orig_len differ, the actually saved packet size was limited by snaplen.
Can anyone tell me the difference between the two length fields? We are saving the packet in its entirety, so how can the two differ?

Reading through the documentation at the Wireshark wiki ( http://wiki.wireshark.org/Development/LibpcapFileFormat ) and studying an example pcap file, it looks like incl_len and orig_len are usually the same quantity. The only time they will differ is if the length of the packet exceeded the size of snaplen, which is specified in the global header for the file.
I'm just guessing here, but I imagine that snaplen specifies the size of the static buffer used for capturing. In the event that a packet was too large for the capture buffer, this is the format's method for signaling that fact. snaplen is documented to "usually" be 65535, which is large enough for most packets. But the documentation stipulates that the size might be limited by the user.

If you're saving the entire packet, the 2 shouldn't differ.
However, if, for example, you run tcpdump or TShark or dumpcap or a capture-from-the-command-line Wireshark and specify a small value with the "-s n" flag, or specify a small value in the "Limit each packet to [n] bytes" option in the Wireshark GUI, then libpcap/WinPcap will be passed that value and will only supply the first n bytes of each packet to the program, and the entire packet won't be saved.
A limited "snapshot length" means you don't see all the packet data, so some analysis might not be possible. In exchange, less memory is needed in the OS to buffer packets (so fewer packets might be dropped), less CPU bandwidth is needed to copy packet data to the application, less disk bandwidth is needed to save packets to disk if the application is saving them (which might also reduce the number of packets dropped), and less disk space is needed for the saved packets.
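As a rough illustration of the two fields, here is a minimal C sketch that interprets one 16-byte record header and reports how many bytes the snapshot length cut off. The names pcap_record_header, parse_record_header and truncated_bytes are made up for this example, uint32_t stands in for guint32, and the sketch assumes the file was written in the host's byte order.

```c
#include <stdint.h>
#include <string.h>

/* Per-record header, mirroring pcaprec_hdr_t with standard C types. */
typedef struct {
    uint32_t ts_sec;   /* timestamp seconds */
    uint32_t ts_usec;  /* timestamp microseconds */
    uint32_t incl_len; /* octets of packet data saved in the file */
    uint32_t orig_len; /* length of the packet on the network */
} pcap_record_header;

/* Copy 16 raw header bytes into the struct.
 * Assumption: the capture file uses the host's byte order, so no
 * byte swapping is needed. */
static pcap_record_header parse_record_header(const unsigned char raw[16])
{
    pcap_record_header h;
    memcpy(&h, raw, sizeof h);
    return h;
}

/* Bytes missing from the saved packet; 0 if it was captured in full. */
static uint32_t truncated_bytes(const pcap_record_header *h)
{
    return h->orig_len > h->incl_len ? h->orig_len - h->incl_len : 0;
}
```

For a 1514-byte frame captured with "-s 96", incl_len would be 96 and orig_len 1514, so truncated_bytes would report 1418.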

Related

How to count bytes using send() in C, including protocol size?

I'm trying to count the total bytes sent by my program, but I can't get an accurate value.
All my functions call a single function that sends data to my server using send().
In this function, I take the return value of send() and add it to a global counter. This is working fine.
But when I compare it to the 'iftop' utility (sudo iftop -f 'port 33755'), I see more data in iftop than in my app, and my guess is that this is because of TCP headers/protocol data. I really don't know how to calculate this. I'm sending data using send() with variable data lengths, so I'm not sure whether it is possible to detect or calculate the TCP packet size from there. I know that each TCP packet carries a TCP header, but I'm not sure how many packets are sent.
May I assume that every call to send(), if the data length is less than 1518 (TCP packet size limit?), produces only one TCP packet, so that I need to add the TCP header length? Even if I send one byte? If so, how many extra bytes does the TCP structure add?
For information: I'm using GCC on Linux as the compiler.
Thanks!
There is no reliable way to do so from within your program. You can compute a minimum total number of bytes required to transmit data of the total payload size you count, subject to a few assumptions, but you would need to monitor from the kernel side to determine the exact number of bytes.
May I assume that every call to send(), if data length is less than 1518 (TCP packet size limit?), then it's only one TCP packet and I need to sum TCP Header length?
No, that would not be a safe assumption. The main problem is that the kernel does not necessarily match the data transferred by each send() call to its own sequence of packets. It may combine data from multiple send()s into a smaller number of packets. Additionally, however, it may use either a smaller MTU or a larger one than Ethernet's default of 1500 bytes, depending on various factors, and, furthermore, you need to fit packet headers into the chosen MTU, so the payload carried by one packet is smaller than that.
I suspect you're making this too hard. If this is a task that has been assigned to you -- a homework problem, for example -- then my first guess would be that it is intended that you count only the total payload size, not the protocol overhead. Alternatively, if you do need to account for the overhead, then my guess would be that you are supposed to estimate, based on measured or assumed characteristics of the network. If you've set this problem for yourself, then I can only say that people generally make one of the two computations I just described, not the one you asked about.
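If an estimate is all that is needed, the lower bound mentioned above could be sketched as follows. Everything here is an assumption made for illustration: Ethernet framing, IPv4 and TCP headers without options, a maximum segment size of 1460 bytes, and the kernel packing data into full segments. The name min_wire_bytes is made up, and the result ignores the handshake, ACKs and retransmissions.

```c
#include <stddef.h>

/* Assumed per-segment overhead: Ethernet header (14) + IPv4 header
 * without options (20) + TCP header without options (20). Real traffic
 * often carries TCP options, so this is a lower bound, not a measurement. */
enum { ETH_HDR = 14, IPV4_HDR = 20, TCP_HDR = 20, ASSUMED_MSS = 1460 };

/* Minimum number of bytes on the wire needed to carry `payload` bytes,
 * assuming the data is packed into as few full segments as possible. */
static size_t min_wire_bytes(size_t payload)
{
    size_t segments = (payload + ASSUMED_MSS - 1) / ASSUMED_MSS;
    return payload + segments * (ETH_HDR + IPV4_HDR + TCP_HDR);
}
```

Under these assumptions a single byte of payload costs at least 55 bytes on the wire, and 1460 bytes cost at least 1514.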

Packet reassembly at Network Layer libpcap

Environment
As I understand it, the network layer is responsible for reassembling fragmented datagrams, and it then supplies the reassembled data to the transport layer above.
I have collected packet traces using libpcap and I want to reassemble fragmented packets at layer 3 on my own.
This link says that I need the fragment flag, fragment offset, identification number and a buffer for reassembly of the fragments.
Question
When the first fragment arrives, how do I know what size of buffer to allocate for complete reassembly of the datagram?
Thanks.
The IP header only gives you the size of the fragment. So you need to reserve a buffer the size of the largest possible IP packet, i.e. 65535 bytes. Only once you get the last fragment can you determine the length of the complete packet.
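To make the last point concrete, here is a small C sketch of how the full payload length could be derived once the last fragment (the one with the "more fragments" flag cleared) arrives. The frag_info struct and reassembled_payload_len are hypothetical names; the fields are assumed to have been extracted from the IPv4 header and converted to host byte order already.

```c
#include <stdint.h>

/* Fields taken from one IPv4 fragment's header, in host byte order. */
typedef struct {
    uint8_t  ihl;            /* header length in 32-bit words */
    uint16_t total_length;   /* header + data of this fragment, in bytes */
    uint16_t frag_offset;    /* 13-bit fragment offset, in units of 8 bytes */
    int      more_fragments; /* MF flag: nonzero if more fragments follow */
} frag_info;

/* Returns the payload length of the reassembled datagram if `f` is the
 * last fragment, or -1 if the length cannot be known from this fragment. */
static long reassembled_payload_len(const frag_info *f)
{
    if (f->more_fragments)
        return -1;
    /* Offset of this fragment's data plus the amount of data it carries. */
    return (long)f->frag_offset * 8 + (f->total_length - f->ihl * 4);
}
```

For example, a last fragment with offset 185 (1480 bytes), a 20-byte header and a total length of 520 gives 1480 + 500 = 1980 bytes of payload.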

Isn't recv() in C socket programming blocking?

In Receiver, I have
recvfd=accept(sockfd,&other_side,&len);
while(1)
{
    recv(recvfd,buf,MAX_BYTES-1,0);
    buf[MAX_BYTES]='\0';
    printf("\n Number %d contents :%s\n",counter,buf);
    counter++;
}
In Sender , I have
send(sockfd,mesg,(size_t)length,0);
send(sockfd,mesg,(size_t)length,0);
send(sockfd,mesg,(size_t)length,0);
MAX_BYTES is 1024 and the length of mesg is 15. Currently, it calls recv only once. I want recv to be called three times, once for each corresponding send. How do I achieve that?
In short: yes, it is blocking. But not in the way you think.
recv() blocks until any data is readable. But you don't know the size in advance.
In your scenario, you could do the following:
1. Call select() and put the socket you want to read from into the read FD set.
2. When select() returns a positive number, your socket has data ready to be read.
3. Then check whether you can receive length bytes from the socket:
   - call recv(recvfd, buf, MAX_BYTES-1, MSG_PEEK); see man recv(2) for the MSG_PEEK parameter, or look at MSDN, which documents it as well
   - now you know how much data is available
   - if there's less than length available, return and do nothing
   - if there's at least length available, read length bytes and return (if there's more than length available, we'll continue with step 2, since a new READ event will be signalled by select())
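The steps above might look roughly like this in C. This is only a sketch: try_read_message is a made-up name, and it polls with a zero timeout instead of sleeping in select(), which is a choice made for this example (pass NULL as the timeout to block until data arrives).

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Returns 1 if a complete message of `length` bytes was read into buf,
 * 0 if fewer than `length` bytes are available so far,
 * -1 on error or if the peer closed the connection. */
static int try_read_message(int fd, char *buf, size_t length)
{
    fd_set readfds;
    struct timeval timeout = { 0, 0 };   /* poll without waiting */

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    int ready = select(fd + 1, &readfds, NULL, NULL, &timeout);
    if (ready < 0)
        return -1;
    if (ready == 0)
        return 0;                        /* nothing queued yet */

    /* Look at the queued data without consuming it. */
    ssize_t avail = recv(fd, buf, length, MSG_PEEK);
    if (avail <= 0)
        return -1;
    if ((size_t)avail < length)
        return 0;                        /* message still incomplete */

    /* Enough data is queued: consume exactly one message. */
    return recv(fd, buf, length, 0) == (ssize_t)length ? 1 : -1;
}
```

With the sender from the question, three successive successful calls with length 15 would each return one of the three messages, even if they arrived in a single TCP segment.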
To send discrete messages over a byte stream protocol, you have to encode messages into some kind of framing language. The network can chop up the protocol into arbitrarily sized packets, and so the receives do not correlate with your messages in any way. The receiver has to implement a state machine which recognizes frames.
A simple framing protocol is to have some length field (say two octets: 16 bits, for a maximum frame length of 65535 bytes). The length field is followed by exactly that many bytes.
You must not even assume that the length field itself is received all at once. You might ask for two bytes, but recv could return just one. This is unlikely for the very first message received from the socket, but you should not rely on that: somewhere in the middle of the stream, the first byte of the 16-bit length field can easily land in the last position of one network segment, with the second byte arriving in the next.
An easy way to deal with this is to use a buffered I/O library instead of raw operating system file handles. In a POSIX environment, you can take an open socket handle, and use the fdopen function to associate it with a FILE * stream. Then you can use functions like getc and fread to simplify the input handling (somewhat).
If in-band framing is not acceptable, then you have to use a protocol which supports framing, namely datagram type sockets. The main disadvantage of this is that the principal datagram-based protocol used over IP is UDP, and UDP is unreliable. This brings in a lot of complexity in your application to deal with out of order and missing frames. The size of the frames is also restricted by the maximum IP datagram size which is about 64 kilobytes, including all the protocol headers.
Large UDP datagrams get fragmented, which, if there is unreliability in the network, adds up to greater unreliability: if any IP fragment is lost, the entire packet is lost. All of it must be retransmitted; there is no way to just get a repetition of the fragment that was lost. The TCP protocol performs "path MTU discovery" to adjust its segment size so that IP fragmentation is avoided, and TCP has selective retransmission to recover missing segments.
I bet you've created a TCP socket using SOCK_STREAM, which would cause the three messages to be read into your buffer during the first recv call. If you want to read the messages one-by-one, create a UDP socket using SOCK_DGRAM, or develop some type of message format which allows you to parse your messages when they arrive in a stream (assuming your messages will not always be fixed length).
First send the length of the message in a field of fixed size, so that the receiver knows how many bytes encode the length itself, then call recv() in a loop until length bytes have been received.
Note (as already mentioned in other answers) that the size and number of chunks received do not necessarily match what was sent. Only the sum of all bytes received will be the same as the sum of all bytes sent.
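A minimal sketch of such a receive loop, assuming a 4-byte length prefix in network byte order (the prefix size and the names recv_all and recv_message are choices made for this example). It does not retry on EINTR.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Keep calling recv() until exactly n bytes have arrived.
 * Returns 0 on success, -1 on error or if the peer closed early. */
static int recv_all(int fd, void *buf, size_t n)
{
    unsigned char *p = buf;
    while (n > 0) {
        ssize_t got = recv(fd, p, n, 0);
        if (got <= 0)
            return -1;
        p += got;               /* append after what we already have */
        n -= (size_t)got;
    }
    return 0;
}

/* Receive one message: a 4-byte length, then that many payload bytes.
 * Returns the payload length, or -1 on error or if it exceeds cap. */
static long recv_message(int fd, void *buf, size_t cap)
{
    uint32_t netlen;
    if (recv_all(fd, &netlen, sizeof netlen) != 0)
        return -1;
    uint32_t len = ntohl(netlen);
    if (len > cap || recv_all(fd, buf, len) != 0)
        return -1;
    return (long)len;
}
```

The sender would write htonl(length) followed by the message bytes; three such sends then produce three successful recv_message calls, no matter how TCP splits or merges them.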
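Read the man pages for recv and send. Especially read the sections on what those functions RETURN.
By default, recv blocks only until some data is available and then returns whatever has arrived, up to the size of the buffer; it also returns when the socket is closed. It waits for the entire buffer to be filled only if you pass MSG_WAITALL.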
If you want to read length bytes and return, then you must only pass to recv a buffer of size length.
You can use select to determine whether there are any bytes waiting to be read. You can then
find out how many bytes are waiting (for example with recv and MSG_PEEK), and
read only those bytes.
This can keep recv from blocking.
Edit:
After re-reading the docs, the following may be true: your three "messages" may be being read all-at-once since length + length + length < MAX_BYTES - 1.
Another possibility, if recv is never returning, is that you may need to flush your socket from the sender-side. The data may be waiting in a buffer to actually be sent to the receiver.

Optimal SNAPLEN for PCAP live capture

When using pcap_open_live to sniff from an interface, I have seen a lot of examples using various numbers as SNAPLEN value, ranging from BUFSIZ (<stdio.h>) to "magic numbers".
Wouldn't it make more sense to set as SNAPLEN the MTU of the interface we are capturing from ?
In this manner, we could fit more packets at once in PCAP buffer. Is it safe to assume that the MRU is equal to the MTU ?
Otherwise, is there a non-exotic way to set the SNAPLEN value ?
Thanks
The MTU is the largest payload size that could be handed to the link layer; it does not include any link-layer headers, so, for example, on Ethernet it would be 1500, not 1514 or 1518, and wouldn't be large enough to capture a full-sized Ethernet packet.
In addition, it doesn't include any metadata headers such as the radiotap header for 802.11 radio information.
And if the adapter is doing any form of fragmentation/segmentation/reassembly offloading, the packets handed to the adapter or received from the adapter might not yet be fragmented or segmented, or might have been reassembled, and, as such, might be much larger than the MTU.
As for fitting more packets in the PCAP buffer, that only applies to the memory-mapped TPACKET_V1 and TPACKET_V2 capture mechanisms in Linux, which have fixed-size packet slots; other capture mechanisms do not reserve a maximum-sized slot for every packet, so a shorter snapshot length won't matter. For TPACKET_V1 and TPACKET_V2, a smaller snapshot length could make a difference, although, at least for Ethernet, libpcap 1.2.1 attempts, as best it can, to choose an appropriate buffer slot size for Ethernet. (TPACKET_V3 doesn't appear to have the fixed-size per-packet slots, in which case it wouldn't have this problem, but it only appeared in officially-released kernels recently, and no support for it exists yet in libpcap.)
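The arithmetic behind the first point, as a tiny C sketch (the helper name ethernet_frame_len is made up). It covers only plain Ethernet without VLAN tags, and, per the offloading caveat above, captured packets can still be larger than this, so a generous snapshot length such as 65535 remains the simpler choice.

```c
enum { ETHERNET_HEADER_LEN = 14, ETHERNET_FCS_LEN = 4 };

/* Largest plain Ethernet frame for a given MTU. The FCS is usually not
 * delivered to capturing applications, hence the flag. */
static unsigned ethernet_frame_len(unsigned mtu, int with_fcs)
{
    return mtu + ETHERNET_HEADER_LEN + (with_fcs ? ETHERNET_FCS_LEN : 0);
}
```

For the default MTU of 1500 this gives 1514 without the FCS and 1518 with it, which is why a SNAPLEN of 1500 would truncate full-sized frames.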

Handling different sized packets using sockets in C

Which is the best approach to send packets that can be of different sizes using TCP sockets in C?
I wonder because we're trying to write a multiplayer game that needs a protocol with many kinds of packets of different sizes. According to the recv documentation I can find out how many bytes have been read, but how should I manage to dispatch packets only when they are exactly full?
Suppose that I have packets with a 5 bytes header that contains also the length of the payload.. should I use circular buffers or something like that to dispatch packets when ready and keep new partials?
Create a static variable which represents the packet header, this variable will be five bytes long. Create an associated integer which counts how many of those five bytes have yet been read. Create a second integer which counts how many bytes of the "content" have been read. Zero both those integers. Create an associated char * pointer which eventually will point to the received packet content.
As data arrives (e.g., select indicates so), read the five bytes of header. You may receive these bytes gradually, thus you need the first integer count variable. Account for the header bytes you have received here.
When you are done receiving the header, sanity check it. Are the size values possible to satisfy (e.g. not greater than 2^30)? If so, malloc a buffer of that size or that size plus the header. (If you want the header contiguous, allocate sufficient space, then memcpy it into your new buffer.)
Now, as data arrives, place it in your allocated buffer. Account for the received bytes in the second integer you created. When you have received all the bytes the header called for, then repeat all the above.
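The procedure described above could be sketched as a small state machine that is fed whatever bytes recv happens to return. The header layout (1 type byte followed by a 4-byte big-endian payload length), the 1 MiB sanity limit and all names here are assumptions made for the example, not part of the question.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HEADER_LEN 5            /* assumed: 1 type byte + 4-byte length */
#define MAX_PAYLOAD (1u << 20)  /* sanity limit chosen for this sketch */

typedef struct {
    unsigned char header[HEADER_LEN];
    size_t header_got;          /* header bytes received so far */
    unsigned char *payload;     /* allocated once the header is complete */
    size_t payload_len;
    size_t payload_got;         /* payload bytes received so far */
} packet_reader;

typedef void (*packet_cb)(unsigned char type, const unsigned char *payload,
                          size_t len, void *ctx);

/* Example callback: counts packets and remembers the last payload length. */
typedef struct { int count; size_t last_len; } packet_stats;
static void record_packet(unsigned char type, const unsigned char *payload,
                          size_t len, void *ctx)
{
    (void)type; (void)payload;
    packet_stats *s = ctx;
    s->count++;
    s->last_len = len;
}

/* Feed bytes as they arrive; calls cb once per complete packet.
 * Returns 0 on success, -1 on a bad header or allocation failure. */
static int reader_feed(packet_reader *r, const unsigned char *data, size_t n,
                       packet_cb cb, void *ctx)
{
    while (n > 0) {
        if (r->header_got < HEADER_LEN) {
            size_t take = HEADER_LEN - r->header_got;
            if (take > n) take = n;
            memcpy(r->header + r->header_got, data, take);
            r->header_got += take; data += take; n -= take;
            if (r->header_got < HEADER_LEN)
                break;                       /* header still incomplete */
            r->payload_len = ((size_t)r->header[1] << 24) |
                             ((size_t)r->header[2] << 16) |
                             ((size_t)r->header[3] << 8) | r->header[4];
            if (r->payload_len > MAX_PAYLOAD)
                return -1;                   /* sanity check failed */
            r->payload = malloc(r->payload_len ? r->payload_len : 1);
            if (!r->payload)
                return -1;
            r->payload_got = 0;
        }
        size_t take = r->payload_len - r->payload_got;
        if (take > n) take = n;
        memcpy(r->payload + r->payload_got, data, take);
        r->payload_got += take; data += take; n -= take;
        if (r->payload_got == r->payload_len) {
            cb(r->header[0], r->payload, r->payload_len, ctx);
            free(r->payload);
            r->payload = NULL;
            r->header_got = 0;               /* start on the next packet */
        }
    }
    return 0;
}
```

A zero-initialized packet_reader is a valid starting state; after each recv, pass the received bytes and their count to reader_feed.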
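You can design a custom header for your packet transmission which specifies the packet length, indexing info (if packet fragmentation is implemented) and some hashing if you need it.
Some rough pseudocode follows:
/* Read the fixed-size header in full; MSG_WAITALL asks recv to wait for all of it. */
if (recv(socket, headerBuf, headerSize, MSG_WAITALL) != headerSize)
    return -1;                          /* error or connection closed */
nPacketSize = headerBuf[16];            /* sample: wherever the header stores the length */
nByteRead = 0;
while (nByteRead < nPacketSize)
{
    nByteToRead = nPacketSize - nByteRead;
    /* Append after the bytes already received instead of overwriting them. */
    nCurRead = recv(socket, someBuf + nByteRead, nByteToRead, 0);
    if (nCurRead <= 0)
        return -1;                      /* error or connection closed */
    nByteRead += nCurRead;
}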
TCP is a stream based protocol, not a datagram one. That means that there isn't necessarily a one to one correspondence between the number and size of buffers passed to send and what comes out of receive. That means that you need to implement your own "protocol" on top of TCP.
Examples of such protocols include HTTP, where the end of the message headers is marked by two consecutive carriage return, line feed pairs (\r\n\r\n). The only concern with such delineation is to make sure that the pattern can't occur in the delimited part of the message, or else to make sure it is escaped. Other protocols create a header which contains the information necessary to correctly identify and read the next piece of information. I can't think of an application that does this off the top of my head. You could even go for a hybrid approach that combines the two.
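For the delimiter-based style, the receiver keeps appending received bytes to a buffer and checks whether the terminator has shown up yet. A minimal sketch of that check (find_header_end is a hypothetical helper name):

```c
#include <stddef.h>
#include <string.h>

/* Returns the offset just past the first "\r\n\r\n" in buf,
 * or 0 if the terminator has not been received yet. */
static size_t find_header_end(const char *buf, size_t len)
{
    for (size_t i = 0; i + 4 <= len; i++)
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return i + 4;
    return 0;
}
```

A return value of 0 means "call recv again and append"; any other value tells the caller where the delimited part ends and the following data begins.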
