Sending SKBs for transmission from kernel space - C

I am currently writing a kernel module that modifies packet payloads as a learning experience. The packet modification works, but now I want to send the modified packet out after the original (I don't want to drop the original). I can't find a kernel function that sends SKBs for transmission. I've tried dev_queue_xmit(nskb), but that causes a kernel panic. I also tried skb->next = nskb, but that does nothing. Do I have to implement the SKB list handling myself? I am unsure how to do that, since this article seems to be outdated.
EDIT:
I was able to fix the kernel panic when calling dev_queue_xmit(nskb): I was accidentally calling dev_queue_xmit(skb), which would free skb and cause a panic in netfilter. Now everything runs, but I'm not seeing duplicate packets being sent out; there is no trace of the second packet ever being sent. tcpdump on the machine doesn't see anything, and tcpdump on the target doesn't see anything either. The following is my code.
unsigned int in_hook(void *priv, struct sk_buff *skb, const struct nf_hook_state *state) {
    /* Netfilter hooks can run in softirq context, so the allocation must not sleep */
    struct sk_buff *nskb = skb_copy(skb, GFP_ATOMIC);
    /* Various other variables not relevant to the problem */
    __u32 saddr, daddr;
    saddr = ntohl(iph->saddr);
    if (saddr == ipToInt(10,0,2,12) || saddr == ipToInt(10,0,2,13)) {
        /* For loop that saves the payload contents into a variable */
        /* Here is where the problem is.
           I have this if statement to prevent a feedback loop;
           then, if the IP matches, I call dev_queue_xmit(nskb),
           which is supposed to send out sk_buffs, but tcpdump doesn't
           show anything on any computer */
        if (nskb && saddr == ipToInt(10,0,2,13)) { /* skb_copy() returns NULL on allocation failure */
            dev_queue_xmit(nskb);
        }
        /* Rest of the code that isn't relevant to sending packets */
    }
    return NF_ACCEPT;
}
My network setup is as follows: three Ubuntu Server VMs, all accessed over SSH from the host computer (macOS, if it matters; I don't know at this point). The machine running the above kernel module spoofs the other two VMs bidirectionally. The other two VMs talk to each other via a netcat session. I'm hoping that when I send one message from the VM with IP 10.0.2.13, 10.0.2.12 sees the same message twice. I know the acknowledgement number mismatch will break the connection, but I'm not even getting that far. tcpdump on any of the three machines shows nothing besides the packets that are supposed to be sent.
I have so far tried dev_queue_xmit(nskb) as well as nskb->dev->netdev_ops->ndo_start_xmit(nskb, skb->dev).
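The ipToInt() helper is not shown in the question. For the comparison against ntohl(iph->saddr) to work, it has to return a host-order value with the first octet in the most significant byte. A minimal userspace sketch of such a helper, where the name ip_to_int and its signature are assumptions rather than the asker's actual code:

```c
#include <stdint.h>

/* Hypothetical stand-in for the question's ipToInt(): packs four octets
   into a host-order 32-bit value, most significant octet first. */
static inline uint32_t ip_to_int(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8)  | (uint32_t)d;
}
```

With this definition, ip_to_int(10, 0, 2, 12) is 0x0A00020C, which is what ntohl() yields for the address 10.0.2.12 read from an IP header.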

As far as I remember, dev_queue_xmit() is the right function for sending. The question is how you prepared the skb you want to send. Also, give us the call trace from dmesg when the kernel panic occurred. Do you set skb->dev?

I figured it out: skb_copy doesn't copy the Ethernet header of an skb, so the sent packet never reaches its destination.
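Whatever the exact cause, the buffer handed to dev_queue_xmit() has to begin with a valid Ethernet header. As far as I understand, at an IP-level netfilter hook skb->data usually points at the IP header, with the link-layer header sitting just before it in the headroom, and in the kernel skb_push(skb, ETH_HLEN) is the call that moves the data pointer back over those 14 bytes. The following is only a userspace toy model of that pointer arithmetic; struct toy_buf, toy_push and TOY_ETH_HLEN are made-up names, not kernel API:

```c
#include <stddef.h>

#define TOY_ETH_HLEN 14 /* Ethernet header length without a VLAN tag */

/* Toy stand-in for an sk_buff: bytes before data_off are headroom. */
struct toy_buf {
    unsigned char storage[128];
    size_t data_off; /* offset of the first byte that would be transmitted */
    size_t len;      /* number of bytes that would be transmitted */
};

/* Mimics the idea behind skb_push(): grow the visible data backwards into
   the headroom. Returns the new start of data, or NULL if there is not
   enough headroom (the real skb_push() panics in that case). */
unsigned char *toy_push(struct toy_buf *b, size_t n)
{
    if (n > b->data_off)
        return NULL;
    b->data_off -= n;
    b->len += n;
    return b->storage + b->data_off;
}
```

In this model, a buffer that starts with data_off == 14 and len == 20 would be sent as a bare IP header; after toy_push(&b, TOY_ETH_HLEN) the 14 bytes in front of it are included as well. In a real module the MAC addresses in that header would still need to be correct for the outgoing link.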

Related

Kernel Module UDP socket drops datagrams although consumed

The Situation
I am currently writing a kernel module that should handle a custom network protocol based on UDP.
What I do (in pseudocode) is create a UDP socket with
sock_create(AF_INET, SOCK_DGRAM, IPPROTO_UDP, &sk);
where sk is the socket pointer.
Instead of actively polling the kernel for new UDP data, I registered the data-ready callback with
sk->sk_data_ready = myudp_data_ready;
And here is the full code of the myudp_data_ready function:
void myudp_data_ready(struct sock *sk) {
    struct sk_buff *skb;
    int err;
    if ((skb = skb_recv_datagram(sk, 0, 1, &err)) == NULL) {
        goto Bail;
    }
    // ...
    // HERE, MY CUSTOM UDP-BASED PROTOCOL WILL BE IMPLEMENTED
    // ...
    skb_free_datagram(sk, skb);
    return;
Bail:
    return;
}
The Problem
At first, I receive all UDP packets perfectly fine.
The skb_recv_datagram function is returning a socket buffer that I can handle.
However, after some time, it stops working.
What I already tried
In /proc/net/udp I can see that the rx_queue grows until it is full, and then packets are dropped.
This is the point where my code no longer receives any packets (obviously).
This seems odd.
If I understood correctly, the kernel uses a reference count for socket buffers.
If this count drops to 1, the buffer is freed and unlinked from the receive queue.
I had a look into the skb->users field, which is supposed to be the reference count.
It is set to 1, which means that my code is the only place holding a reference to the skb.
But neither skb_free_datagram nor kfree_skb seems to free the buffer, as the rx_queue keeps growing.
And I have no clue why.
Do you have any advice?
Am I missing something?
Some more information
I am using Ubuntu 20.04 with Kernel version 5.4.0-52.
I have a simple user-land application sending UDP packets to the specific port that the kernel module is listening on.
Thank you for your help.

lwip board cannot maintain connection to another lwip board

I have a strange problem. For some time I've been trying to replace a small protocol converter (basically a two-way serial-to-Ethernet bridge, master and slave) that I've got with something that has more features.
Backstory
After a lot of reverse engineering I found out how the device works, and I have been able to replicate it and connect my board to the original device. I've tried the original as master with my board as slave, and vice versa, and everything works perfectly. It's actually better, since at higher speeds there are no more packet losses (connecting two original devices would cause packet losses).
However, when I connect one of my devices as master and another of my devices as slave, both running the exact same code, it works for 2 or 3 exchanges and then stops. Sometimes, after some minutes, it will try again 2 or 3 more times.
How the tests were made
I connected a Modbus master and slave (modbustools, two different instances). The master is a serial RTU Modbus and the slave is a serial RTU Modbus;
I configure one of my devices as master and connect it to the serial port so that it receives the serial Modbus traffic and sends the protocol to a device connected to it;
I configure my slave so that it connects via the serial port to the slave Modbus. Basically it works by creating a socket and connecting to the master's IP. It then waits for a master transmission via Ethernet, sends it via serial to the slave Modbus (modbustools), receives a response, and sends it to its master, which then sends it to the Modbus master (modbustools);
It's a bit confusing, but that's how it works: my master awaits a socket connection and then the communication between them starts, because that is how the old ones work.
I've written an echo client now to test the connection. Basically now, my code connects to a server (my master), it receives a packet, then it replies back the same packet that it received. When I try connecting this to my 2 boards they don't work. It's more of the same, 2 or 3 exchanges and then it stops, but when I connect it to the original device it keeps running without a hitch.
Sources
Here is my TCP master (server actually) initialization:
void initClient() {
    if (tcp_modbus == NULL) {
        tcp_modbus = tcp_new();
        previousPort = port;
        tcp_bind(tcp_modbus, IP_ADDR_ANY, port);
        tcp_sent(tcp_modbus, sent);
        tcp_poll(tcp_modbus, poll, 2);
        tcp_setprio(tcp_modbus, 128);
        tcp_err(tcp_modbus, error);
        /* tcp_listen() frees the original pcb and returns a smaller listening pcb */
        tcp_modbus = tcp_listen(tcp_modbus);
        tcp_modbus->so_options |= SOF_KEEPALIVE; // enable keep-alive (inherited by accepted connections)
        /* a listening pcb has no keep_intvl field; it is set per connection in acceptmodbus() */
        tcp_accept(tcp_modbus, acceptmodbus);
        isListening = true;
    }
}
static err_t acceptmodbus(void *arg, struct tcp_pcb *pcb, err_t err) {
    tcp_arg(pcb, pcb);
    /* Set up the various callback functions */
    tcp_recv(pcb, modbusrcv);
    tcp_err(pcb, error);
    pcb->keep_intvl = 1000; // interval between keep-alive probes, in milliseconds
    tcp_accepted(pcb);
    gb_ClientHasConnected = true;
    return ERR_OK;
}
//receives the packet, puts it in the array "ptransparent->data",
//and records which PCB to reply on and the length that was received
static err_t modbusrcv(void *arg, struct tcp_pcb *pcb, struct pbuf *p, err_t err) {
    if (p == NULL) {
        /* the remote side closed the connection */
        return ERR_OK;
    } else if (err != ERR_OK) {
        pbuf_free(p);
        return err;
    }
    /* p may be a chain: acknowledge and copy tot_len bytes, not only the first
       segment (assumes ptransparent->data is large enough) */
    tcp_recved(pcb, p->tot_len);
    pbuf_copy_partial(p, ptransparent->data, p->tot_len, 0);
    ptransparent->pcb = pcb;
    ptransparent->len = p->tot_len;
    pbuf_free(p); /* the callback owns p when it returns ERR_OK, so release it */
    return ERR_OK;
}
The serial reception is basically this:
detect one byte received, start timeout, when timeout ends send whatever was received via a TCP socket that was already connected to the server. It then receives the packet via the acceptmodbus function and sends it via the serial port.
This is my client's (slave) code:
void init_slave() {
    if (tcp_client == NULL) {
        tcp_client = tcp_new();
        tcp_bind(tcp_client, IP_ADDR_ANY, 0);
        tcp_arg(tcp_client, NULL);
        tcp_recv(tcp_client, modbusrcv);
        tcp_sent(tcp_client, sent);
        tcp_client->so_options |= SOF_KEEPALIVE; // enable keep-alive
        tcp_client->keep_intvl = 100; // sends keep-alive every 100 milliseconds
        tcp_err(tcp_client, error);
        err_t ret = tcp_connect(tcp_client, &addr, portCnt, connected);
    }
}
The rest of the code is identical. The only thing that changes is the flow of operation.
Connect to server
Wait for packet
send it via serial
wait for response timeout (same timeout as the server, it just starts counting in a different way: the server starts after receiving one byte, and the client after it has sent something via the serial port)
get response and send it to the server
Observation:
No error is detected in the communication. After some testing it doesn't seem to be the number of exchanges that causes the hang. It happens after some time. In my opinion this sounds like a disconnection problem or timeout error, but no disconnection occurs and no more packets are received. When I stop debugging and check the sockets nothing out of the ordinary is detected.
If I understood your question the right way, you have a computer with two serial ports, each running a Modbus client and server instance. From each of these ends, you then go to your STM32 boards that receive data on their serial ports and forward to TCP on an Ethernet network connecting them to each other.
Not easy to say but based on the symptoms you describe it certainly looks like you are having one or several timeout issues, likely on the serial sides. I think it won't be easy to help you pinpoint what is exactly wrong with your code without testing it and certainly not if you can't show a complete functional piece.
But what you can improve a lot is the way you debug on the end sides.
You can try replacing modbustools with something that gives you more details.
The easiest solution to get additional debugging info is to use pymodbus, you just need to install the library with pip and use the client and server provided with the examples. The only modification you need is to change them to the serial interface commenting and uncommenting a couple of lines. This will give you very useful details for debugging.
If you have a C development environment on your computer better go for libmodbus. This library has a fantastic set of unit tests. Again, you just have to edit the code to set the name of your serial ports and run server and client.
Lastly, I don't know to what extent this might be useful for you but you might want to take a look at SerialPCAP. With this tool, you can tap on an RS-485 bus and see all queries and responses running on it. I imagine you have RS-232, which is point-to-point and will not work with three devices on the bus. If so, you can try port forwarding.
EDIT: Reading your question more carefully I find this sentence particularly troublesome:
...detect one byte received, start timeout, when timeout ends send whatever was received via a TCP socket that was already connected to the server...
Why would you need to introduce this artificial delay? In Modbus, you have very well-defined frames that you can identify by the minimum 3.5-character frame spacing. Is that what you mean by timeout?
Unrelated, but I've also remembered there is a serial forwarder example included with pymodbus that might somehow help you (maybe you can use it to emulate one of your boards?).
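For reference, the 3.5-character spacing mentioned above can be turned into a concrete timeout. A minimal sketch, assuming the usual Modbus RTU framing of 11 bits per character and the Modbus serial line guideline of a fixed 1750 µs above 19200 baud; the function name modbus_t35_us is made up for this example:

```c
#include <stdint.h>

/* Silent interval (in microseconds) that marks the end of a Modbus RTU frame.
   One RTU character is 11 bits on the wire; above 19200 baud a fixed
   1750 us is used instead of 3.5 character times. */
uint32_t modbus_t35_us(uint32_t baud)
{
    if (baud == 0)
        return 0; /* invalid baud rate */
    if (baud > 19200)
        return 1750;
    /* 3.5 chars * 11 bits * 1,000,000 us = 38,500,000; divide, rounding up */
    return (38500000u + baud - 1) / baud;
}
```

At 9600 baud this gives roughly 4 ms. A serial-side timeout much longer than this adds latency to every exchange without making frame detection more reliable.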

A kernel module to transparently detour packets coming from a NIC and a TCP application. Is it possible?

Is it possible for a Linux kernel module to transparently detour packets coming from the upper layers (i.e. L2, L3) and from the NIC? For example: 1) a packet arrives from a NIC, the module gets the packet, does some processing on it, and delivers it back to the TCP/IP stack; or 2) an app sends data, the module gets the packet, does some processing, and then delivers the packet to an output NIC.
It is not like a sniffer, in which a copy of the packet is captured while the actual packet flow continues.
I have thought about some possibilities to achieve my goal. I considered registering an rx_handler in the kernel to get access to the incoming packets (coming from a NIC), but how do I deliver them back to the kernel stack? I mean, how do I let the packet follow the path it would have taken without the module in the middle?
Moreover, let's say an app is sending a packet through TCP. How could the module detour the packet (literally get hold of the packet)? Is it possible? To send it out through the NIC, I think dev_queue_xmit() does the job, but I'm not sure.
Does anyone know a possible solution, or have any tips?
Basically, I'd like to know whether it is possible to put a kernel module between the NIC and the MAC layer, or in the MAC layer, to do what I want. If so, does anyone have hints, such as the main kernel functions to use for this?
Thanks in advance.
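Yes. You can hook into the kernel networking stack by installing a custom callback in place of the default sk_data_ready function.
/* Original callback, saved so it can still be invoked
   (the default sock_def_readable is static and cannot be called from a module).
   Since Linux 3.15 the callback takes only the struct sock * argument. */
static void (*orig_sk_data_ready)(struct sock *sk, int len);
static void my_sk_data_ready(struct sock *sk, int len) {
    call_customized_logic(sk, len);
    orig_sk_data_ready(sk, len); /* call the original callback, or skip it */
}
Usage:
orig_sk_data_ready = sk->sk_data_ready;
sk->sk_data_ready = my_sk_data_ready;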

Sending queued packets with NFQUEUE?

I'm using libnetfilter_queue and iptables with the NFQUEUE target to store incoming packets in three different queues with --queue-num x.
I successfully create the three queues with libnetfilter_queue functions, bind them, listen to them and read from them as follows:
/* given 'h' as a handler of one of my three queues */
fd = nfq_fd(h);
while ((rv = recv(fd, buf, sizeof(buf), 0)) && rv >= 0) {
    nfq_handle_packet(h, buf, rv);
}
The callback function, triggered by nfq_handle_packet, calls nfq_set_verdict(qh, id, NF_ACCEPT, 0, NULL);, which sends the packet on as soon as it has been processed.
The problem is: I don't want every packet to be sent right away, since I need to store them in a custom struct (written below).
So I came across a potential solution: I may call NF_DROP verdict instead of NF_ACCEPT on every packet I want to queue (so it won't be immediately sent away), store it in my custom struct and then (sooner or later) re-inject it at my need.
Sounds great, but the situation is: I don't know how to re-inject my queued packets whenever I want from my userspace application. Is it correct to call nfq_set_verdict again at some later point in my code, but with the NF_ACCEPT verdict? Or should I open a socket (maybe a raw one)?
This is my custom struct
struct List {
    int queue;
    int pktsize;
    unsigned char *buffer;
    struct nfq_q_handle *qh;
    struct nfqnl_msg_packet_hdr *hdr;
    struct List *next;
};
representing a packet caught with the rule above.
These are my queues where to store packets.
struct List *List0 = NULL; // low priority
struct List *List1 = NULL; // medium priority
struct List *List2 = NULL; // high priority
I have Ubuntu 14.04 3.13.0-57-generic.
Any suggestions would be appreciated.
Your idea makes sense. In fact I've seen a very similar scheme implemented in a commercial product I worked on. It had to process individual packets at high rates, so it would always copy the incoming packet and immediately set an NF_DROP verdict. It would then perform the processing, and if it decided that the packet should be forwarded, it would send the copy to the outbound interface. So you're not alone.
As far as I know, nfq_set_verdict can be called only once per packet. Once the verdict is set, NFQUEUE sends the packet to the destination (which is packet heaven in your case). It doesn't keep an extra copy of the packet just in case you change your mind. So to send the packet back to the network you'll have to store a copy of it and send it using your own socket. And yes, if you want to send the received packet as-is (including headers) the outbound socket would have to be raw.
I don't know if this will fit with your application model, but Frottle just holds the packets in limbo until it decides whether to accept them or drop them. The "novelty" of this approach relies on the fact that you aren't required to call nfq_set_verdict during the NFQUEUE callback function itself; you can call it later and outside the netfilter loop proper. It will use more kernel memory, but the alternative would be just to use more usermode memory so it isn't much of a loss.
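The deferred-verdict approach mostly needs a place to remember each packet's id until the decision is made. Below is a minimal sketch of such storage with three priority levels, loosely modelled on the question's struct List. All names here (held_pkt, held_queue, held_push, held_pop_highest, held_free) are invented for this example. It assumes the id was taken from nfq_get_msg_packet_hdr() inside the callback (converted with ntohl) and that the later release is an nfq_set_verdict(qh, id, NF_ACCEPT, 0, NULL) call, neither of which is shown.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One held packet: the id is what a later verdict call would need. */
struct held_pkt {
    uint32_t id;
    size_t len;
    unsigned char *data;
    struct held_pkt *next;
};

/* Simple FIFO of held packets. */
struct held_queue {
    struct held_pkt *head;
    struct held_pkt *tail;
};

/* Copy the payload and append it to q. Returns 0 on success, -1 if out of memory. */
int held_push(struct held_queue *q, uint32_t id,
              const unsigned char *data, size_t len)
{
    struct held_pkt *p = malloc(sizeof(*p));
    if (p == NULL)
        return -1;
    p->data = malloc(len ? len : 1);
    if (p->data == NULL) {
        free(p);
        return -1;
    }
    if (len)
        memcpy(p->data, data, len);
    p->id = id;
    p->len = len;
    p->next = NULL;
    if (q->tail)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
    return 0;
}

/* Pop the oldest packet from the highest-numbered non-empty queue.
   Returns NULL if every queue is empty; release the result with held_free(). */
struct held_pkt *held_pop_highest(struct held_queue *queues, int nqueues)
{
    for (int i = nqueues - 1; i >= 0; i--) {
        struct held_pkt *p = queues[i].head;
        if (p) {
            queues[i].head = p->next;
            if (queues[i].head == NULL)
                queues[i].tail = NULL;
            p->next = NULL;
            return p;
        }
    }
    return NULL;
}

void held_free(struct held_pkt *p)
{
    if (p) {
        free(p->data);
        free(p);
    }
}
```

With queues[0] as low and queues[2] as high priority, the application would call held_push() from the callback instead of issuing a verdict, and later pop packets in priority order and issue the verdict for each popped id. Packets held this way still occupy the kernel-side queue, so they should not be kept indefinitely.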
Hope this helps!

How to drop a TCP packet in the Linux kernel without receiving it again and again?

I want to change the Linux kernel code to filter certain TCP packets and drop them.
But I keep receiving the same packet again and again. Here is my code in
/net/ipv4/tcp_ipv4.c:
int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
{
    // my code start
    struct iphdr *iph;
    iph = skb->nh.iph;
    if(iph->ttl > 64) // I want to drop all tcp packet that meet this requirement
    {
        return 0;
    }
    // my code end
    // start normal linux code
    if(sk->sk_state == TCP_ESTABLISHED) { /* Fast path */
        ...
}
As @nos said, TCP is reliable, so the other end will retransmit the dropped packet. You would need to send a RST or an ICMP error (probably host unreachable, administratively prohibited) to tear down the connection.
Also, note that you've created a memory leak: you're responsible for freeing skbs when you discard them.
There is a ttl module for iptables, which can filter by ttl:
iptables -A INPUT -m ttl --ttl-gt 64 -j DROP
If you really wanted to, you could modify the code to acknowledge the packet but still drop it. I don't really recommend this.
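To make the boundary of the TTL condition concrete: ttl > 64 first matches at TTL 65. Here is a small userspace sketch of the same test on a raw IPv4 header, relying only on the fact that the TTL is the ninth byte of the header. The function name ipv4_ttl_greater_than is made up, and this is not kernel code:

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if buf holds an IPv4 header whose TTL is greater than limit,
   0 if the TTL is not greater, and -1 if buf is too short or not IPv4. */
int ipv4_ttl_greater_than(const unsigned char *buf, size_t len, uint8_t limit)
{
    if (buf == NULL || len < 20)
        return -1;          /* the minimum IPv4 header is 20 bytes */
    if ((buf[0] >> 4) != 4)
        return -1;          /* the version field must be 4 */
    return buf[8] > limit;  /* TTL lives at byte offset 8 */
}
```

A header with TTL 64 is let through by ipv4_ttl_greater_than(hdr, len, 64), while TTL 65 matches.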

Resources