How to get next TCP segment in linux kernel module?

I know I can get a pointer to the TCP payload like this:
char *data = (char *)tcph + 4 * tcph->doff;
But once the data has been split across several segments, I cannot get the full data this way. How do I get the sk_buff of the next segment?
My simple code:
#include ...
static struct nf_hook_ops nfho;

unsigned int hook_funcion(void *priv, struct sk_buff *skb,
                          const struct nf_hook_state *state)
{
    struct iphdr *iph = ip_hdr(skb);
    struct tcphdr *tcph;
    char *data;

    // check if it is TCP packet
    if (iph->protocol != IPPROTO_TCP)
        return NF_ACCEPT;
    // the TCP header follows the IP header (ihl counts 32-bit words)
    tcph = (struct tcphdr *)((char *)iph + 4 * iph->ihl);
    data = (char *)tcph + 4 * tcph->doff;
    // do something here
    return NF_ACCEPT;
}

static int __init hook_init(void)
{
    int ret;

    nfho.hook = hook_funcion;
    nfho.pf = NFPROTO_IPV4;
    nfho.hooknum = NF_INET_POST_ROUTING;
    nfho.priority = NF_IP_PRI_LAST;
    ret = nf_register_hook(&nfho);
    printk("xmurp-test start\n");
    printk("nf_register_hook returned %d\n", ret);
    return ret; // propagate a registration failure to the module loader
}

static void __exit hook_exit(void)
{
    nf_unregister_hook(&nfho);
    printk("xmurp-test stop\n");
}

module_init(hook_init);
module_exit(hook_exit);

There is no such thing as "full data" in TCP: TCP is a stream protocol, not a datagram protocol (in contrast to UDP). The data has no defined end unless the connection is closed or reset.
If you are handling an application-layer protocol that divides the TCP stream into messages of known size (for example, HTTP), proceed as follows:
Parse the TCP payload and work out how large the current message is.
Treat the following segments, as they arrive in the network stack, as the continuation of the same message.
Once all the data you expect has arrived, reassemble it, and only then use it at the application layer.
Remember that the network delivers datagrams while TCP presents a stream, so when you handle the first segment the rest of the data may well not have arrived yet. You therefore have to reassemble the stream across this packet and the later packets of the same connection, and only then parse the upper-layer protocol.
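The steps above can be sketched in plain user-space C. This is only an illustration under stated assumptions, not kernel code: the framing (a 2-byte big-endian length followed by the body), the struct stream_buf type, the STREAM_BUF_MAX limit, and both function names are hypothetical, and the sketch assumes payloads are appended in sequence order. A real netfilter hook would also have to track TCP sequence numbers to cope with retransmitted or reordered segments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STREAM_BUF_MAX 4096   /* hypothetical cap on buffered bytes */

/* Per-connection reassembly state (hypothetical structure). */
struct stream_buf {
    uint8_t data[STREAM_BUF_MAX];
    size_t  used;             /* bytes currently buffered */
};

/* Append one in-order TCP payload to the buffer.
 * Returns 0 on success, -1 if the buffer would overflow. */
static int stream_append(struct stream_buf *sb, const uint8_t *payload, size_t len)
{
    assert(sb != NULL && payload != NULL);
    if (len > STREAM_BUF_MAX - sb->used)
        return -1;
    memcpy(sb->data + sb->used, payload, len);
    sb->used += len;
    return 0;
}

/* Try to extract one complete message.
 * Assumed framing: 2-byte big-endian body length, then the body.
 * Returns the body length and copies the body into out, or returns -1
 * if the message is not complete yet or does not fit into out. */
static long stream_next_message(struct stream_buf *sb, uint8_t *out, size_t out_cap)
{
    size_t body_len;

    assert(sb != NULL && out != NULL);
    if (sb->used < 2)
        return -1;                         /* length prefix incomplete */
    body_len = ((size_t)sb->data[0] << 8) | sb->data[1];
    if (sb->used < 2 + body_len || body_len > out_cap)
        return -1;                         /* wait for more segments */
    memcpy(out, sb->data + 2, body_len);
    /* Move any bytes that belong to the next message to the front. */
    sb->used -= 2 + body_len;
    memmove(sb->data, sb->data + 2 + body_len, sb->used);
    return (long)body_len;
}
```

The caller appends each payload as its segment arrives and then calls stream_next_message() in a loop until it returns -1, so a message split across two segments is only delivered after the second one has been appended.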

Related

How to extract entire packet from skb including ethernet header, ip, and tcp plus pay load in poll method of device driver

The r8169 driver from Realtek does the following:
rx_buf = page_address(tp->Rx_databuff[entry]);
dma_sync_single_for_cpu(d, addr, pkt_size, DMA_FROM_DEVICE);
prefetch(rx_buf);
skb_copy_to_linear_data(skb, rx_buf, pkt_size); /* <-- do I have the packet at this point? */
skb->tail += pkt_size;
skb->len = pkt_size;
dma_sync_single_for_device(d, addr, pkt_size, DMA_FROM_DEVICE);
//csum...
skb->protocol = eth_type_trans(skb, dev);
napi_gro_receive(&tp->napi, skb);
This is inside the rtl_rx function, which is called from the driver's poll method. I would like to know at which line of the code above, or after it, I can extract the entire packet from the skb.
I assume that at this line
skb_copy_to_linear_data(skb, rx_buf, pkt_size);
I should have a packet, but I would like to know the correct way to create a kmalloc'ed object like
void *packet = kmalloc(....sizeof(struct ethhdr)+sizeof(struct iphdr)+sizeof(struct tcphdr))
and read the Ethernet, IP and TCP headers from void *packet.
How can I achieve that?
Or should I simply call skb_network_header(), skb_transport_header(), etc. to extract the headers and payload from the skb after it has been populated in the lines above,
or can I simply cast, as in
rx_buf = page_address(tp->Rx_databuff[entry]);
struct ethhdr *ethhdr_of_packet = (struct ethhdr *) rx_buf;
Should that work?

The highlighted line (the one with skb_copy_to_linear_data()) does indeed copy the entire packet data from the buffer in the driver-internal Rx ring (rx_buf) to the data buffer of the skb.
static inline void skb_copy_to_linear_data(struct sk_buff *skb,
                                           const void *from,
                                           const unsigned int len)
{
    memcpy(skb->data, from, len);
}
Casting the rx_buf pointer to an Ethernet header should be OK, too. However, your question does not make clear why you want to access the packet headers this way. Are you just trying to print ("dump") the packet, or do you intend to copy the packet data to a completely different buffer to be consumed elsewhere?
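Since the buffer holds the raw frame starting at the Ethernet header, the header and payload positions can be derived from a few fixed offsets. The following is a user-space sketch rather than driver code: struct frame_offsets and locate_tcp_payload() are hypothetical names, and it assumes an untagged Ethernet II frame carrying IPv4 and TCP. Inside the kernel the same arithmetic would normally go through struct ethhdr, struct iphdr and struct tcphdr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN     14      /* dst MAC + src MAC + EtherType, no VLAN tag */
#define ETHERTYPE_IPV4  0x0800
#define PROTO_TCP       6

/* Byte offsets of each layer inside the frame (hypothetical helper type). */
struct frame_offsets {
    size_t ip_off;
    size_t tcp_off;
    size_t payload_off;
};

/* Locate the IPv4 header, TCP header and TCP payload in a raw frame.
 * Returns 0 on success, -1 if the frame is truncated or is not IPv4/TCP. */
static int locate_tcp_payload(const uint8_t *frame, size_t len,
                              struct frame_offsets *o)
{
    size_t ihl, doff;

    assert(frame != NULL && o != NULL);
    if (len < ETH_HDR_LEN + 20)
        return -1;
    if (((frame[12] << 8) | frame[13]) != ETHERTYPE_IPV4)
        return -1;

    o->ip_off = ETH_HDR_LEN;
    if ((frame[o->ip_off] >> 4) != 4)
        return -1;                              /* not IP version 4 */
    ihl = (size_t)(frame[o->ip_off] & 0x0f) * 4; /* IP header length in bytes */
    if (ihl < 20 || frame[o->ip_off + 9] != PROTO_TCP)
        return -1;

    o->tcp_off = o->ip_off + ihl;
    if (len < o->tcp_off + 20)
        return -1;
    doff = (size_t)(frame[o->tcp_off + 12] >> 4) * 4; /* TCP header length */
    if (doff < 20 || len < o->tcp_off + doff)
        return -1;

    o->payload_off = o->tcp_off + doff;
    return 0;
}
```

Reading the fields byte by byte avoids alignment and byte-order surprises that a direct struct cast can run into on some architectures.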

A few related questions regarding traceroutes in c:

According to Wikipedia:
Traceroute, by default, sends a sequence of User Datagram Protocol
(UDP) packets addressed to a destination host[...] The time-to-live
(TTL) value, also known as hop limit, is used in determining the
intermediate routers being traversed towards the destination. Routers
decrement packets' TTL value by 1 when routing and discard packets
whose TTL value has reached zero, returning the ICMP error message
ICMP Time Exceeded.[..]
I started writing a program (using an example UDP program as a guide) to adhere to this specification,
#include <sys/socket.h>
#include <assert.h>
#include <netinet/udp.h> //Provides declarations for udp header
#include <netinet/ip.h> //Provides declarations for ip header
#include <stdio.h>
#include <stdlib.h> //malloc, calloc, free, atoi
#include <string.h>
#include <arpa/inet.h>
#include <unistd.h>
#define DATAGRAM_LEN (sizeof(struct iphdr) + sizeof(struct udphdr))

unsigned short csum(unsigned short *ptr,int nbytes) {
    register long sum;
    unsigned short oddbyte;
    register short answer;
    sum=0;
    while(nbytes>1) {
        sum+=*ptr++;
        nbytes-=2;
    }
    if(nbytes==1) {
        oddbyte=0;
        *((u_char*)&oddbyte)=*(u_char*)ptr;
        sum+=oddbyte;
    }
    sum = (sum>>16)+(sum & 0xffff);
    sum = sum + (sum>>16);
    answer=(short)~sum;
    return(answer);
}

char *new_packet(int ttl, struct sockaddr_in sin) {
    static int id = 0;
    char *datagram = calloc(1, DATAGRAM_LEN); //zeroed, so both checksum fields start at 0
    struct iphdr *iph = (struct iphdr*) datagram;
    struct udphdr *udph = (struct udphdr*)(datagram + sizeof (struct iphdr));
    iph->ihl = 5;
    iph->version = 4;
    iph->tos = 0;
    iph->tot_len = htons(DATAGRAM_LEN); //16-bit field, network byte order
    iph->id = htons(++id); //Id of this packet (16-bit field)
    iph->frag_off = 0;
    iph->ttl = ttl;
    iph->protocol = IPPROTO_UDP;
    iph->saddr = inet_addr("127.0.0.1");//Spoof the source ip address
    iph->daddr = sin.sin_addr.s_addr;
    iph->check = csum((unsigned short*)datagram, sizeof(struct iphdr)); //IP header only
    udph->source = htons(6666);
    udph->dest = htons(8622);
    udph->len = htons(8); //udp header size
    udph->check = 0; //0 means "no checksum" for UDP over IPv4
    return datagram;
}

int main(int argc, char **argv) {
    int s, ttl, repeat;
    struct sockaddr_in sin;
    char *data;
    printf("\n");
    if (argc != 3) {
        printf("usage: %s <host> <port>\n", argv[0]);
        return __LINE__;
    }
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = inet_addr(argv[1]);
    sin.sin_port = htons(atoi(argv[2]));
    if ((s = socket(AF_PACKET, SOCK_RAW, 0)) < 0) {
        printf("Failed to create socket.\n");
        return __LINE__;
    }
    ttl = 1, repeat = 0;
    while (ttl < 2) {
        data = new_packet(ttl, sin);
        if (write(s, data, DATAGRAM_LEN) != (ssize_t)DATAGRAM_LEN) {
            printf("Socket failed to send packet.\n");
            return __LINE__;
        }
        read(s, data, DATAGRAM_LEN);
        free(data);
        if (++repeat > 2) {
            repeat = 0;
            ttl++;
        }
    }
    return 0;
}
... however at this point I have a few questions.
Is read(s, data, ... reading whole packets at a time, or do I need to parse the data read from the socket; seeking markers particular to IP packets?
What is the best way to uniquely mark my packets as they return to my box as expired?
Should I set up a second socket with the IPPROTO_ICMP flag, or is it easier to write a filter; accepting everything?
Do any other common mistakes exist; or are any common obstacles foreseeable?

Here are some suggestions (assuming a Linux machine).
Reading packets: you might want to read whole 1500-byte packets (a full Ethernet MTU). Don't worry, smaller frames are still read completely, and read returns the length of the data read.
The best way to add a marker is to carry a small UDP payload (a simple unsigned int should be good enough) and increase it on every packet sent. I just ran tcpdump on traceroute, and in my capture the ICMP error did return the entire IP packet, so you can look at the returned IP packet, parse the UDP payload, and so on. Note that your DATAGRAM_LEN would change accordingly. Of course you can use the ID field, but be careful: it is mainly used for fragmentation. You should be okay, because with these packet sizes you will not approach the fragmentation limit on any intermediate router. In general, though, it is not a good idea to 'steal' protocol fields that are meant for something else.
A cleaner way could be to use IPPROTO_ICMP on raw sockets (see man 7 raw and man 7 icmp if the manual pages are installed on your machine). You would not want to receive a copy of every packet on your device and then ignore those that are not ICMP.
If you use type SOCK_RAW on AF_PACKET, you have to attach a link-layer header manually; alternatively, you can use SOCK_DGRAM. See also man 7 packet for a lot of subtleties.
Hope that helps, or are you looking for actual code?

A common pitfall is that programming at this level needs very careful use of the proper include files. For instance, your program as-is won't compile on NetBSD, which is typically quite strict in following relevant standards.
Even when I add some includes, there is no struct iphdr but there is a struct udpiphdr instead.
So for now the rest of my answer is not based on trying your program in practice.
read(2) can be used to read single packets at a time. For packet-oriented protocols, such as UDP, you'll never get more data from it than a single packet.
However you can also use recvfrom(2), recv(2) or recvmsg(2) to receive the packets.
If fildes refers to a socket, read() shall be equivalent to recv()
with no flags set.
To identify the packets, I believe the id field is typically used, as you already do. I am not sure what you mean by "mark my packets as they return to my box as expired", since your packets don't return to you. What you may get back are ICMP Time Exceeded messages. These usually arrive within a few seconds, if they arrive at all. Sometimes they are not sent, and sometimes they are blocked by misconfigured routers between you and their sender.
Note that this assumes that the network stack you're using respects the IP ID you set in your packet. It is possible that it doesn't, and replaces your chosen ID with a different one. Van Jacobson, the original author of the traceroute command as found in NetBSD, therefore uses a different method:
* The udp port usage may appear bizarre (well, ok, it is bizarre).
* The problem is that an icmp message only contains 8 bytes of
* data from the original datagram. 8 bytes is the size of a udp
* header so, if we want to associate replies with the original
* datagram, the necessary information must be encoded into the
* udp header (the ip id could be used but there's no way to
* interlock with the kernel's assignment of ip id's and, anyway,
* it would have taken a lot more kernel hacking to allow this
* code to set the ip id). So, to allow two or more users to
* use traceroute simultaneously, we use this task's pid as the
* source port (the high bit is set to move the port number out
* of the "likely" range). To keep track of which probe is being
* replied to (so times and/or hop counts don't get confused by a
* reply that was delayed in transit), we increment the destination
* port number before each probe.
Using an IPPROTO_ICMP socket to receive the replies is likely to be more efficient than trying to receive all packets. It would also require fewer privileges. Of course, sending raw packets normally already requires root, but it could make a difference if a more fine-grained permission system is in use.
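The port-based matching described in the quoted comment can be sketched as a small parser for an ICMP Time Exceeded message. This is a hedged illustration: probe_index_from_icmp() and its base_port parameter are hypothetical, and the buffer is assumed to start at the ICMP header. On Linux, data read from a raw IPv4 socket begins with the outer IP header, so the caller would have to skip that first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ICMP_TYPE_TIME_EXCEEDED 11
#define ICMP_HDR_LEN            8
#define PROTO_UDP               17

/* Recover which probe an ICMP Time Exceeded message answers.
 * Assumption: probe number i was sent to UDP port base_port + i.
 * icmp points at the ICMP header; the quoted original IP header and
 * the first 8 bytes of the original datagram (the UDP header) follow.
 * Returns the probe index, or -1 if the message does not match. */
static int probe_index_from_icmp(const uint8_t *icmp, size_t len,
                                 uint16_t base_port)
{
    size_t ihl, udp_off;
    uint16_t dport;

    assert(icmp != NULL);
    if (len < ICMP_HDR_LEN + 20 + 8)
        return -1;
    if (icmp[0] != ICMP_TYPE_TIME_EXCEEDED)
        return -1;
    if ((icmp[ICMP_HDR_LEN] >> 4) != 4)
        return -1;                          /* quoted header is not IPv4 */
    ihl = (size_t)(icmp[ICMP_HDR_LEN] & 0x0f) * 4;
    if (ihl < 20 || icmp[ICMP_HDR_LEN + 9] != PROTO_UDP)
        return -1;

    udp_off = ICMP_HDR_LEN + ihl;
    if (len < udp_off + 8)
        return -1;
    /* UDP destination port: bytes 2..3 of the UDP header, big-endian. */
    dport = (uint16_t)((icmp[udp_off + 2] << 8) | icmp[udp_off + 3]);
    if (dport < base_port)
        return -1;
    return dport - base_port;
}
```

Because only the UDP header is guaranteed to be quoted back, encoding the probe number in the destination port works even when a router returns no more than the minimum 8 bytes of the original datagram.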

Sending fragmented datagram with UDP header on every fragment

I am working with an embedded box that must be able to communicate with traditional computers using UDP. When the box sends large UDP messages (that need to be fragmented), a UDP header is included in each fragment. Thus if I want to send a large datagram, it will be fragmented like this:
[eth hdr][ip hdr][udp hdr][ data 1 ] /* first fragment */
[eth hdr][ip hdr][udp hdr][ data 2 ] /* second fragment */
[eth hdr][ip hdr][udp hdr][ data 3 ] /* last fragment */
I understand that this is not customary, as the UDP header would usually be included only in the first IP packet of the fragmented message. However, this works perfectly for communicating with the other machines I need to talk to (e.g. using recvfrom), so I have no reason to dig in and try to change it.
My issue, however, is in reading messages. The box seems to expect fragmented UDP datagrams to be sent to it in the same manner. By this I mean that it expects every IPv4 fragment to have a UDP header. Before trying to change this (it's a rather specialized and complicated platform), I would like to know if there is any way to configure sendto() or any other such function to send UDP messages in this format. When monitoring the traffic, I see that those UDP headers aren't present.
Thank you very much for the help.

No. Sockets don't work this way. Just write your own sendto wrapper that manually splits the data across multiple UDP packets on whatever buffer size boundary you choose. This will achieve the effect you want.
Sample code as follows:
ssize_t fragmented_sendto(int sockfd, const void *buf, size_t len, int flags,
                          const struct sockaddr *dest_addr, socklen_t addrlen, size_t MAX_PACKET_SIZE)
{
    const unsigned char* ptr = (const unsigned char*) buf;
    size_t total = 0;
    // stop once everything is sent; "<=" would loop forever sending empty datagrams
    while (total < len)
    {
        size_t newsize = len - total;
        if (newsize > MAX_PACKET_SIZE)
        {
            newsize = MAX_PACKET_SIZE;
        }
        ssize_t result = sendto(sockfd, ptr, newsize, flags, dest_addr, addrlen);
        if (result < 0)
        {
            // handle error
            return -1;
        }
        else
        {
            total += result;
            ptr += result;
        }
    }
    return (ssize_t)total;
}

how to calculate udp packet size libpcap

From a linux OS I am trying to write my own data usage monitor in C or python. I've searched and researched for a couple of days now. Currently I am trying to adapt sniffex.c to suit my needs. I've succeeded in verifying the total bytes sent and received during a few ftp sessions.
In sniffex.c the tcp packet size is calculated. My question is how do you calculate the UDP packet size? I've searched on this topic, but have not found anything. Does this question make sense?
Update:
The function where the packet sizes are computed looks like this:
got_packet(u_char *args, const struct pcap_pkthdr *header, const u_char *packet)
{
    ...
    int size_payload;
    ...
    case IPPROTO_UDP:
        printf(" Protocol: UDP\n");
        size_payload = header->len;
    ...
}
Do I still need to add 4 to size_payload?
The callback to this function looks like this:
/* now we can set our callback function */
pcap_loop(handle, num_packets, got_packet, NULL);

If you have a UDP datagram, you can get its size from its header.
For example, given a pointer char *hdr to the UDP header, the 16-bit length field sits at offset 4 in network byte order, so you can read it like this:
uint16_t length = ntohs(*(uint16_t *)(hdr + 4));
This length covers the 8-byte UDP header plus the payload.
More about UDP: http://en.wikipedia.org/wiki/User_Datagram_Protocol
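As a minimal sketch of that calculation, the function below reads the length field byte by byte, which avoids unaligned access and does not depend on the host byte order. The name udp_sizes() and its parameters are hypothetical; udp is assumed to point at the first byte of the UDP header inside the captured packet, and caplen is the number of captured bytes available from that point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UDP_HDR_LEN 8   /* source port, dest port, length, checksum */

/* Read the UDP length field and derive the payload size.
 * Returns 0 on success, -1 if the header is truncated or the
 * length field is smaller than the header itself. */
static int udp_sizes(const uint8_t *udp, size_t caplen,
                     uint16_t *total, uint16_t *payload)
{
    uint16_t len;

    assert(udp != NULL && total != NULL && payload != NULL);
    if (caplen < UDP_HDR_LEN)
        return -1;
    /* Length field: bytes 4..5 of the UDP header, big-endian. */
    len = (uint16_t)((udp[4] << 8) | udp[5]);
    if (len < UDP_HDR_LEN)
        return -1;
    *total = len;                                /* header + payload */
    *payload = (uint16_t)(len - UDP_HDR_LEN);    /* payload only */
    return 0;
}
```

For a data usage monitor, the total value is the UDP-level size; the IP and link-layer headers come on top of it if the goal is to count bytes on the wire.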

In Linux's net/ipv4/udp.c, why does a UDP packet need to be processed by xfrm4_policy_check()?

int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb) {
    struct udp_sock *up = udp_sk(sk);
    int rc;
    int is_udplite = IS_UDPLITE(sk);
    /*
     * Charge it to the socket, dropping if the queue is full.
     */
    if (!xfrm4_policy_check(sk, XFRM_POLICY_IN, skb))
        goto drop;
    nf_reset(skb);
I'm reading the code in Linux's net/ipv4/udp.c. Can anyone explain why a UDP packet needs to go through xfrm4_policy_check()?
As far as I know the function returns:
true: non-IPsec packet / valid IPsec packet
false: invalid IPsec packet
I might have misunderstood the return value, as I do not entirely understand the source code.

The xfrm4_policy_check function checks the packet against the IPsec policies. It returns 1 if the packet is allowed to be processed and 0 if it is not. For example, IPsec might decide to drop the packet if skb->ip_summed is not set to CHECKSUM_UNNECESSARY and the packet fails a checksum.
