Parse ethernet, IP and TCP headers - c

I would like to understand how it's possible to iterate over a packet collected with pcap.
#include <pcap.h>
#include <netinet/if_ether.h>
#include <netinet/ip.h>
#include <netinet/tcp.h>
void analyse(struct pcap_pkthdr *header, const unsigned char *packet, int verbose) {
/** Ethernet header has a fixed value, IP header and TCP header don't **/
size_t ip_size = sizeof( struct ip );
size_t tcp_size = sizeof( struct tcphdr );
/* Assign each pointer its correct value */
const struct ether_header *ethernet = ( const struct ether_header* ) packet;
const struct ip *ip = ( const struct ip* ) ( packet + ETH_HLEN );
const struct tcphdr *tcp = ( const struct tcphdr* ) ( packet + ETH_HLEN + ip_size );
const unsigned char *payload = ( packet + ETH_HLEN + ip_size + tcp_size );
}
Can I be sure that ethernet, ip, tcp and payload respectively point to:
First bit of the Data link Layer (Ethernet header)
First bit of the Network Layer (IP header)
First bit of the Transport Layer (TCP header)
First bit of the payload
Thanks,

No, you cannot. I assume you are talking about the first version of the PCAP standard. There is another one called pcap-ng (next generation).
https://www.ietf.org/staging/draft-tuexen-opsawg-pcapng-02.html
File header
At the start of a PCAP file there is a fixed-size header. The last field in that header is the data link type.
https://tools.ietf.org/id/draft-gharris-opsawg-pcap-00.html
Hopefully the link type is LINKTYPE_ETHERNET, whose value is 0x1. If it is not, the Ethernet-based parsing described below does not apply, and you can basically throw away the entire file.
https://www.tcpdump.org/linktypes.html
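As a rough sketch of that check, the function below reads the link type from the 24-byte header of a classic pcap file held in memory. The names pcap_file_linktype and swap32 are made up for this illustration and are not part of libpcap. It assumes the classic layout (magic number first, link type in the last 4 bytes) and accepts both the microsecond and nanosecond magic numbers in either byte order.

```c
#include <stdint.h>
#include <string.h>

#define PCAP_FILE_HDR_LEN 24        /* size of the classic pcap file header */
#define LINKTYPE_ETHERNET_VALUE 1   /* LINKTYPE_ETHERNET */

/* Hypothetical helper: reverse the bytes of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Read the link type from the first 24 bytes of a classic pcap file.
 * Returns 0 and stores the link type in *linktype on success,
 * or -1 if the buffer is too short or the magic number is unknown.
 */
int pcap_file_linktype(const unsigned char *buf, size_t len, uint32_t *linktype)
{
    uint32_t magic, network;

    if (len < PCAP_FILE_HDR_LEN)
        return -1;

    memcpy(&magic, buf, sizeof magic);           /* bytes 0..3 */
    memcpy(&network, buf + 20, sizeof network);  /* bytes 20..23, the last field */

    if (magic == 0xa1b2c3d4u || magic == 0xa1b23c4du) {
        *linktype = network;             /* written in this host's byte order */
    } else if (magic == 0xd4c3b2a1u || magic == 0x4d3cb2a1u) {
        *linktype = swap32(network);     /* written in the opposite byte order */
    } else {
        return -1;                       /* not a classic pcap file */
    }
    return 0;
}
```

A caller would compare the result against LINKTYPE_ETHERNET_VALUE before treating the packet data as Ethernet frames. When reading through libpcap itself, pcap_datalink() reports the same information.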
Ethernet header
So now we know that every packet in the file will be of type Ethernet. You skip the Ethernet header by adding sizeof(struct ethhdr).
After the Ethernet header, the IP header might not be the next layer. You have to process the protocols as they show up in the chain.
For example, VLAN headers are placed right after the Ethernet header, sometimes several in a chain, so you have to skip all of them. A VLAN header is indicated by ntohs(ethernet->h_proto) == ETH_P_8021Q.
VLAN is only one such protocol. There are others, and that is where writing a fully generic parser gets really complex.
In the vast majority of cases you can assume an Ethernet header followed by zero or more VLAN headers. Skip each VLAN header in sequence until the protocol field, converted with ntohs(), equals ETH_P_IP, or until an unknown protocol is found, in which case you bail out.
IP Headers
If it is IPv4, skip the standard IP header and then the IP options. The IP header has a variable size, although most of the time it is the 20-byte minimum. This is due to the options part of the IP header. What you see in struct iphdr is only the fixed part.
You have to account for the options too, so add (ip->ihl & 0xF) * 4 bytes rather than sizeof(struct iphdr). The two are equal when ip->ihl is 5, which is almost always the case.
TCP header
The TCP header also contains options but in the fixed part of the header you have a count of how many 32-bit blocks the entire TCP header has. Just skip the entire header by adding tcp->th_off*4.
This answer comes five years late.
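To make the last two steps concrete, here is a minimal sketch that computes where the TCP payload starts, given a pointer to the beginning of the IPv4 header (that is, after the Ethernet and any VLAN headers have been skipped). The function name tcp_payload_offset is invented for this example. It reads raw bytes instead of struct fields so that it does not depend on a particular header's struct layout, and it does not handle IP fragments.

```c
#include <stddef.h>

/*
 * Given a pointer to the start of an IPv4 header and the number of bytes
 * available from there, return the offset of the TCP payload relative to
 * that pointer, or -1 if the packet is not IPv4/TCP or is truncated.
 */
long tcp_payload_offset(const unsigned char *ip, size_t len)
{
    size_t ip_hlen, tcp_hlen;

    if (len < 20 || (ip[0] >> 4) != 4)       /* minimum IPv4 header, version 4 */
        return -1;

    ip_hlen = (size_t)(ip[0] & 0x0F) * 4;    /* IHL counts 32-bit words */
    if (ip_hlen < 20 || len < ip_hlen + 20)  /* need a minimal TCP header too */
        return -1;

    if (ip[9] != 6)                          /* protocol field: 6 means TCP */
        return -1;

    /* The TCP data offset is the high nibble of byte 12 of the TCP header. */
    tcp_hlen = (size_t)(ip[ip_hlen + 12] >> 4) * 4;
    if (tcp_hlen < 20 || len < ip_hlen + tcp_hlen)
        return -1;

    return (long)(ip_hlen + tcp_hlen);
}
```

With no IP or TCP options the result is 40, which matches what the sizeof-based code in the question would compute. The two only differ when options are present.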

Related

Get IP version from packet data

The pcap callback function returns the IP header and data as follows:
void packet_handler(u_char* param, const struct pcap_pkthdr* header, const u_char* pkt_data);
My understanding is that the first 4 bits of pkt_data are the IP version, from which I can determine whether it is IPv4 or IPv6. However, I've tried a few different ways to read the first 4 bits and I'm getting data that does not make sense.
For example, I defined the following structure:
struct ipdata {
u_char version : 4;
u_char dontcare : 4;
};
And then I tried to get the ip version using this code:
ipdata* pipdata;
pipdata = (ipdata*) pkt_data;
ip_ver = pipdata->version;
printf(" %d ", ip_ver);
The above method prints values of 3, 6, 9, 8 and 12. If I watch the traffic at the same time in Wireshark I see that most of the packets are IPv6.
Could someone who has done this clarify how would I go about reading the IP version?
Figured out the answer. Npcap returns the entire Ethernet frame, so the first 14 bytes are the Ethernet header:
/* Length of the Ethernet Header (Data Link Layer) */
#define ETHERNET_HEADER_LEN 14
/* Ethernet addresses are 6 bytes */
#define ETHER_ADDR_LEN 6
/* Ethernet header */
struct sniff_ethernet {
u_char ether_dhost[ETHER_ADDR_LEN]; /* Destination host address (i.e. Destination MAC Address) */
u_char ether_shost[ETHER_ADDR_LEN]; /* Source host address (i.e. Source MAC Address) */
u_short ether_type; /* IP? ARP? RARP? etc */
};
You can figure out whether it is an IPv4 or IPv6 packet by looking at the ether_type in the above structure rather than the version in the IP header, such as:
/* Common ethernet types in Hex*/
#define ETHERNET_TYPE_IPv4 0x0800
#define ETHERNET_TYPE_IPv6 0x86DD
u_short eth_type;
const struct sniff_ethernet *ethernet = (const struct sniff_ethernet*)(pkt_data);
eth_type = ntohs(ethernet->ether_type);
if (eth_type == ETHERNET_TYPE_IPv4) {
ipv4_handler(pkt_data);
}
else if (eth_type == ETHERNET_TYPE_IPv6)
{
ipv6_handler(pkt_data);
}
The IP header starts right after the ethernet header, so you can get it with code such as the following example for an IPv6 packet:
/* IPv6 header */
typedef struct ipv6_header
{
unsigned int
version : 4,
traffic_class : 8,
flow_label : 20;
uint16_t length;
uint8_t next_header;
uint8_t hop_limit;
struct in6_addr saddr;
struct in6_addr daddr;
} ipv6_header;
const ipv6_header* iph;
iph = (ipv6_header*)(pkt_data + ETHERNET_HEADER_LEN);
From there you can access the version and other information about the IP header. See this post for more information: Getting Npcap IPv6 source and destination addresses
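One caveat about reading the version through a bit-field: the order in which a C compiler lays out bit-fields is implementation-defined, so a version : 4 member may map to the low nibble of the byte on some platforms. A minimal sketch of a layout-independent alternative is to shift the first byte of the IP header. The function name ip_version_after_ethernet is made up for this example, and it assumes an untagged Ethernet frame.

```c
#include <stddef.h>

#define ETHERNET_HEADER_LEN 14

/*
 * Return the IP version found in the first byte after an untagged
 * Ethernet header (4 or 6 for valid IP packets), or -1 if the captured
 * data is too short to contain that byte.
 */
int ip_version_after_ethernet(const unsigned char *pkt_data, size_t caplen)
{
    if (caplen <= ETHERNET_HEADER_LEN)
        return -1;

    /* The version is the high-order 4 bits of the first IP header byte. */
    return pkt_data[ETHERNET_HEADER_LEN] >> 4;
}
```

In the callback, caplen would come from header->caplen. The result should agree with the ether_type check above for ordinary IPv4 and IPv6 traffic.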

PCAP Coding :: My Code is Setting the Wrong Type of Ethernet

I’m writing a C program which builds an Ethernet/IPv4/TCP network packet, then writes the packet into a PCAP file for inspection. I built my code off the SO post here. The first version of my code worked perfectly, but it was one big main() function, and that is not portable into larger programs.
So I reorganized the code so I could port it into another program. I don’t want to get into the differences between Version 1 and Version 2 in this post. Suffice it to say, Version 2 works great, except for one annoying quirk. When Wireshark opened a Version 1 PCAP file, it saw that my Layer 2 was Ethernet II:
Frame 1: 154 bytes on wire (1232 bits), 154 bytes captured (1232 bits)
Ethernet II, Src: 64:96:c8:fa:fc:ff (64:96:c8:fa:fc:ff), Dst: Woonsang_04:05:06 (01:02:03:04:05:06)
Destination: Woonsang_04:05:06 (01:02:03:04:05:06)
Source: 64:96:c8:fa:fc:ff (64:96:c8:fa:fc:ff)
Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: 10.10.10.10, Dst: 20.20.20.20
Transmission Control Protocol, Src Port: 22, Dst Port: 55206, Seq: 1, Ack: 1, Len: 100
SSH Protocol
But in Version 2, the Layer 2 header became 802.3 Ethernet:
Frame 1: 154 bytes on wire (1232 bits), 134 bytes captured (1072 bits)
IEEE 802.3 Ethernet
Destination: Vibratio_1c:08:00 (00:09:70:1c:08:00)
Source: 45:00:23:28:06:cf (45:00:23:28:06:cf)
Length: 64
Trailer: 050401040204000001020506040400070602040704060202…
Logical-Link Control
Data (61 bytes)
[Packet size limited during capture: Ethernet truncated]
I’m no expert in networking, but I’m guessing my Version 2 PCAP file is malformed somewhere. I should not have a Logical-Link Control header in there; my code thinks it is writing Ethernet II / IPv4 / TCP headers. At this point, my instinct is that either the PCAP Packet header (necessary to precede every packet in a PCAP file) or my Ethernet header is incorrect somehow. Which one would tell Wireshark “the next X bytes are an Ethernet II header”?
Here’s my code, in excerpts:
The structs for the PCAP header and Ethernet frames were cribbed directly from the before-mentioned SO post. The solution in that post was to use the pcap_sf_pkthdr struct for the PCAP Packet header:
// struct for PCAP Packet Header - Timestamp
struct pcap_timeval {
bpf_int32 tv_sec; // seconds
bpf_int32 tv_usec; // microseconds
};
// struct for PCAP Packet Header
struct pcap_sf_pkthdr {
struct pcap_timeval ts; // time stamp
bpf_u_int32 caplen; // length of portion present
bpf_u_int32 len; // length this packet (off wire)
};
And the Ethernet header is from the original post:
// struct for the Ethernet header
struct ethernet {
u_char mac1[6];
u_char mac2[6];
u_short protocol; // will be ETHERTYPE_IP, for IPv4
};
There’s not much to either struct, right? I don’t really understand how Wireshark looks at this and knows the first 20 bytes of the packet are Ethernet.
Here’s the actual code, slightly abridged:
#include <netinet/in.h> // for ETHERTYPE_IP
struct pcap_sf_pkthdr* allocatePCAPPacketHdr(struct pcap_sf_pkthdr* pcapPacketHdr ){
pcapPacketHdr = malloc( sizeof(struct pcap_sf_pkthdr) );
if( pcapPacketHdr == NULL ){
return NULL;
}
uint32_t frameSize = sizeof( struct ethernet) + …correctly computed here
bzero( pcapPacketHdr, sizeof( struct pcap_sf_pkthdr ) );
pcapPacketHdr->ts.tv_sec = 0; // for now
pcapPacketHdr->ts.tv_usec = 0; // for now
pcapPacketHdr->caplen = frameSize;
pcapPacketHdr->len = frameSize;
return pcapPacketHdr;
}
void* allocateL2Hdr( packetChecklist* pc, void* l2header ){
l2header = malloc( sizeof( struct ethernet ) );
if( l2header == NULL ){
return NULL;
}
bzero( ((struct ethernet*)l2header)->mac1, 6 );
bzero( ((struct ethernet*)l2header)->mac2, 6 );
// …MAC addresses filled in later…
((struct ethernet*)l2header)->protocol = ETHERTYPE_IP; // This is correctly set
return l2header;
}
...and the code which uses the above functions...
struct pcap_sf_pkthdr* pcapPacketHdr;
pcapPacketHdr = allocatePCAPPacketHdr( pcapPacketHdr );
struct ethernet* l2header;
l2header = allocateL2Hdr( l2header );
Later, the code populates these structs and writes them into a file, along with an IPv4 header, a TCP header, and so on.
But I think my problem is that I don’t really understand how Wireshark is supposed to know that my Ethernet header is Ethernet II and not 802.3 Ethernet with a Logical-Link Control header. Is that communicated in the PCAP Packet Header? Or in the Ethernet frame somewhere? I’m hoping for advice. Thank you
Wireshark is supposed to know that my Ethernet header is Ethernet II and not 802.3 Ethernet with a Logical-Link Control header. Is that communicated in the PCAP Packet Header?
No.
Or in the ethernet frame somewhere?
Yes.
If you want the details, see, for example, the "Types" section of the Wikipedia "Ethernet frame" page.
However, the problem appears to be that the packet you're writing to the file doesn't have the full 6-byte destination and source addresses in it - the last two bytes of the destination address are 0x08 0x00, which are the first two bytes of a big-endian value of ETHERTYPE_IP (0x0800), and the first byte of the source address is 0x45, which is the first byte of an IPv4 header for an IPv4 packet with no IP options.
Somehow, Version 1 of your program put the destination and source addresses into the data part of the pcap record, but Version 2 didn't.
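For reference, the rule behind the "Yes" above is that the two bytes following the source address are treated as an EtherType when their big-endian value is 0x0600 (1536) or more, and as an 802.3 length when it is 1500 or less. A small sketch of that test follows. The enum and function names are invented for this illustration.

```c
#include <stddef.h>

enum frame_kind {
    FRAME_TOO_SHORT,    /* fewer than 14 bytes, no complete header */
    FRAME_ETHERNET_II,  /* bytes 12..13 hold an EtherType */
    FRAME_8023_LENGTH,  /* bytes 12..13 hold a payload length */
    FRAME_UNDEFINED     /* values 1501..1535 are neither */
};

/* Classify a frame by the 16-bit field that follows the two MAC addresses. */
enum frame_kind classify_frame(const unsigned char *frame, size_t len)
{
    unsigned int type_or_len;

    if (len < 14)
        return FRAME_TOO_SHORT;

    /* The field is big-endian on the wire. */
    type_or_len = ((unsigned int)frame[12] << 8) | frame[13];

    if (type_or_len >= 0x0600)
        return FRAME_ETHERNET_II;
    if (type_or_len <= 1500)
        return FRAME_8023_LENGTH;
    return FRAME_UNDEFINED;
}
```

Applied to the Version 2 bytes shown in the question, the field that Wireshark displays as "Length: 64" is 0x0040, which falls in the 802.3 range. That is consistent with the record data starting partway into the intended Ethernet header.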

How can I bypass vlan header when I read pcap in C?

I have followed the code in here and fixed the issue with printing out the IP address. It worked perfectly when reading a capture file from my machine, and the results are the same as tcpdump's. However, when I read another pcap file (captured from the boundary router of a big network), it gives me totally different IP addresses. I found that these pcap files contain VLAN tags in the Ethernet frames. How can I detect whether a packet contains a VLAN header?
You'd have to examine the link-layer protocol (most likely Ethernet nowadays) and determine the EtherType (the 13th and 14th bytes of the Ethernet header). You can view an example list of possible EtherTypes here.
If the type is 0x0800 (IPv4) then everything should work as expected.
However, if the EtherType is 0x8100 (802.1Q), you'd have to extract the actual payload type from the VLAN header (the 17th and 18th bytes).
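Here is a very crude function that skips the link-layer headers, starting from a base address pointing at the beginning of the Ethernet frame:
#include <arpa/inet.h> /* ntohs */
#include <stdint.h>
#include <string.h> /* memcpy */
char *get_ip_hdr(char *base) {
// Assumes the frame is Ethernet; returns NULL if no IPv4 header is found
uint16_t ether_type;
memcpy(&ether_type, base + 12, sizeof ether_type); // avoids an unaligned read
ether_type = ntohs(ether_type);
if (ether_type == 0x0800) {
return base + 14;
} else if (ether_type == 0x8100) {
// VLAN tag: the encapsulated EtherType follows the 4-byte tag
memcpy(&ether_type, base + 16, sizeof ether_type);
ether_type = ntohs(ether_type);
if (ether_type == 0x0800) {
return base + 18;
}
}
return NULL;
}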
Note: be wary of double VLAN tagging, and take similar steps to skip the additional tag as well.
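One possible way to cope with stacked tags is to loop while the type field still indicates a VLAN tag. The sketch below is only an illustration: skip_vlan_tags and read_be16 are made-up names, and it treats both 0x8100 (802.1Q) and 0x88A8 (802.1ad) as tag types. Other vendor-specific tag values are not handled.

```c
#include <stddef.h>
#include <stdint.h>

#define ETHERTYPE_VLAN_CTAG 0x8100  /* 802.1Q tag */
#define ETHERTYPE_VLAN_STAG 0x88A8  /* 802.1ad outer tag (QinQ) */

/* Read a 16-bit big-endian value without assuming alignment. */
static uint16_t read_be16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Skip the Ethernet header and any stacked VLAN tags.
 * On success, returns the offset of the next header and stores its
 * EtherType in *ether_type. Returns -1 if the frame is truncated.
 */
long skip_vlan_tags(const unsigned char *frame, size_t len, uint16_t *ether_type)
{
    size_t off = 12;   /* offset of the EtherType field */
    uint16_t type;

    if (len < 14)
        return -1;

    type = read_be16(frame + off);
    while (type == ETHERTYPE_VLAN_CTAG || type == ETHERTYPE_VLAN_STAG) {
        off += 4;                    /* each tag adds 4 bytes */
        if (len < off + 2)
            return -1;
        type = read_be16(frame + off);
    }

    *ether_type = type;
    return (long)(off + 2);
}
```

For an untagged frame this returns 14, for a single tag 18, and for a double tag 22. The caller then checks whether *ether_type is 0x0800 before parsing an IPv4 header at that offset.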

TCPDump / libpcap - find memory location of payload data

I am trying to view HTTP traffic going to and from my loopback network adapter using libpcap. I am just beginning with network programming and am completely new to this library. Thanks to an answer I received previously, I have been successful at detecting the link-layer type on my machine's "lo0" adapter (Mac OS X).
//lookup link-layer header type
link_layer_type = pcap_datalink(handle);
if(link_layer_type == DLT_NULL){
printf("DLT_NULL"); // this true in the case of "lo0"
}
The Programming with Pcap guide makes the assumption that each packet will contain an ethernet header. So the logic used to find a packet's payload is as follows:
ethernet = (struct sniff_ethernet*)(packet);
ip = (struct sniff_ip*)(packet + SIZE_ETHERNET);
size_ip = IP_HL(ip)*4;
if (size_ip < 20) {
printf(" * Invalid IP header length: %u bytes\n", size_ip);
return;
}
tcp = (struct sniff_tcp*)(packet + SIZE_ETHERNET + size_ip);
size_tcp = TH_OFF(tcp)*4;
if (size_tcp < 20) {
printf(" * Invalid TCP header length: %u bytes\n", size_tcp);
return;
}
payload = (u_char *)(packet + SIZE_ETHERNET + size_ip + size_tcp);
This logic is clearly not going to work when inspecting the contents of a packet originating from the loopback interface, where an Ethernet header does not exist. The Link-Layer Header Types documentation states that a link-layer type of "DLT_NULL" has a 4-byte header which consists of a PF_ value identifying the network-layer protocol (I'm guessing IPv4 in my case).
Given the above information.. how can I properly locate the packet's payload location?
Any guidance or information would be very appreciated. Thanks!
Given the above information.. how can I properly locate the packet's payload location?
For DLT_NULL, your program should extract the first 4 bytes of the packet data as a 32-bit number.
If you're doing a live capture, you can extract it in the host's byte order and compare it against your OS's values of AF_INET and AF_INET6 (if it has an AF_INET6 definition; these days, most current OS versions should, as they should support IPv6).
If you're reading a capture file, you'd need to byte-swap the value if pcap_is_swapped() returns a non-zero value (you can also use it for live captures; it always returns zero for live captures). You'll also need to compare against several different "IPv6" values (24, 28, and 30), each of which means "IPv6" on some particular OS. Fortunately, AF_INET is 2 on all OSes that support DLT_NULL, as they all took that value from 4.2BSD.
If the value is the IPv4 value (2, as per the above), then after those 4 bytes you have the IPv4 header for the packet. If it's one of the IPv6 values, then after those 4 bytes you have the IPv6 header for the packet. If it's not any of those values, it's some other protocol.
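Put together, the steps above might look like the following sketch. The names dlt_null_protocol, null_swap32 and the enum are invented for this example. The swapped argument is assumed to be the value returned by pcap_is_swapped() for the capture handle.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum null_proto { NULL_PROTO_UNKNOWN, NULL_PROTO_IPV4, NULL_PROTO_IPV6 };

/* Hypothetical helper: reverse the bytes of a 32-bit value. */
static uint32_t null_swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Classify a DLT_NULL packet by its 4-byte address-family header.
 * The network-layer header starts at pkt + 4.
 */
enum null_proto dlt_null_protocol(const unsigned char *pkt, size_t caplen,
                                  int swapped)
{
    uint32_t family;

    if (caplen < 4)
        return NULL_PROTO_UNKNOWN;

    memcpy(&family, pkt, sizeof family);   /* read in host byte order */
    if (swapped)
        family = null_swap32(family);      /* file came from the other byte order */

    if (family == 2)                       /* AF_INET on all DLT_NULL platforms */
        return NULL_PROTO_IPV4;
    if (family == 24 || family == 28 || family == 30)
        return NULL_PROTO_IPV6;            /* AF_INET6 differs between OSes */
    return NULL_PROTO_UNKNOWN;
}
```

In the tutorial code quoted in the question, this would mean replacing SIZE_ETHERNET with 4 when the result is NULL_PROTO_IPV4, and then computing size_ip and size_tcp as before.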

How do I get uri of HTTP packet with winpcap? [duplicate]

Based on this article I can get all incoming packets.
/* Callback function invoked by libpcap for every incoming packet */
void packet_handler(u_char *param, const struct pcap_pkthdr *header, const u_char *pkt_data)
{
struct tm *ltime;
char timestr[16];
ip_header *ih;
udp_header *uh;
u_int ip_len;
u_short sport,dport;
time_t local_tv_sec;
/* convert the timestamp to readable format */
local_tv_sec = header->ts.tv_sec;
ltime=localtime(&local_tv_sec);
strftime( timestr, sizeof timestr, "%H:%M:%S", ltime);
/* print timestamp and length of the packet */
printf("%s.%.6ld len:%u ", timestr, (long)header->ts.tv_usec, header->len);
/* retrieve the position of the ip header */
ih = (ip_header *) (pkt_data + 14); /* 14 = length of the Ethernet header */
/* retrieve the position of the udp header */
ip_len = (ih->ver_ihl & 0xf) * 4;
uh = (udp_header *) ((u_char*)ih + ip_len);
/* convert from network byte order to host byte order */
sport = ntohs( uh->sport );
dport = ntohs( uh->dport );
/* print ip addresses and udp ports */
printf("%d.%d.%d.%d.%d -> %d.%d.%d.%d.%d\n",
ih->saddr.byte1,
ih->saddr.byte2,
ih->saddr.byte3,
ih->saddr.byte4,
sport,
ih->daddr.byte1,
ih->daddr.byte2,
ih->daddr.byte3,
ih->daddr.byte4,
dport);
}
But how do I extract URI information in packet_handler?
You're not following the best example. The URL you posted is an example that handles UDP packets but HTTP is based on TCP.
Not every packet has a URI.
In an HTTP request, the URI is transmitted very near the beginning of the connection; subsequent packets are just pieces of the larger request.
To find the URI (and all other data) being requested, look in pkt_data, after the Ethernet, IP and TCP headers.
Usually (ignoring Connection: keep-alive, a very short first packet, etc.) the URI is the second word on the first line of the first outgoing TCP packet that carries data (defining words as space-delimited and lines as CR LF-delimited).
As Wireshark is based on libpcap, is open source, and does a pretty good job of this, its source is a good place to start looking.
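As a rough illustration of the "second word on the first line" rule, the sketch below pulls the request target out of a buffer that is assumed to already point at the TCP payload. The function name http_request_uri is made up for this example. It only looks at a single packet, so it does not reassemble requests split across segments, and it assumes the plain "METHOD target HTTP/x.y" request-line form.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the request target (second space-delimited word of the first line)
 * from a TCP payload into out as a NUL-terminated string.
 * Returns its length, or -1 if the payload does not look like the start
 * of an HTTP request or out is too small.
 */
long http_request_uri(const unsigned char *payload, size_t len,
                      char *out, size_t out_size)
{
    const unsigned char *eol = NULL, *sp1, *sp2;
    size_t uri_len;

    /* Find the end of the first line (CR LF). */
    for (size_t i = 0; i + 1 < len; i++) {
        if (payload[i] == '\r' && payload[i + 1] == '\n') {
            eol = payload + i;
            break;
        }
    }
    if (eol == NULL)
        return -1;

    /* First space ends the method, second space ends the target. */
    sp1 = memchr(payload, ' ', (size_t)(eol - payload));
    if (sp1 == NULL)
        return -1;
    sp1++;
    sp2 = memchr(sp1, ' ', (size_t)(eol - sp1));
    if (sp2 == NULL)
        return -1;

    /* The third word should be the protocol version, e.g. HTTP/1.1. */
    if ((size_t)(eol - sp2) < 6 || memcmp(sp2 + 1, "HTTP/", 5) != 0)
        return -1;

    uri_len = (size_t)(sp2 - sp1);
    if (uri_len == 0 || uri_len >= out_size)
        return -1;

    memcpy(out, sp1, uri_len);
    out[uri_len] = '\0';
    return (long)uri_len;
}
```

In packet_handler, the payload pointer would be pkt_data plus the Ethernet, IP and TCP header lengths, and len would be header->caplen minus that same offset.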
