How to push skb in specific point of Linux network stack? - c

I want to push skb to prerouting point of Linux network stack. Is there any way to do this?
I used dev_queue_xmit() and netif_rx() functions, but I don't think they can push skb in prerouting point of Linux network stack.

I'm not entirely clear where this skb originates from. In userland, there are two ways to do this. One is to use a tun device (and simply write the packet in). The other is to use libpcap (which has the little known feature of sending raw packets hidden amongst all the receive stuff).
You say you want to do this by registering an ioctl (presumably meaning adding your own ioctl) then calling the ioctl from user space. It isn't immediately obvious why you would want to do that when you can already achieve this without modifying the kernel. However, if you did want to do this, what I'd do is follow the path the tun driver uses, and do the equivalent of write()-ing to the tun device in your new ioctl. That might mean having a tun-like interface hanging around to act as the source for the packets. I suspect packets in the system need a source interface of sorts.

I suspect you want libpcap or libnetfilter_queue hereafter collectively referred to as userland queuing. Userland queuing will route all packets matching some rule (you define) to your userland application. The dev_queue_* and netif_* functions apply to kernel level routing and are not relevant to userland applications.
Userland queuing has performance implications and depending on the library you use you will have to use a specific method of re-injecting your packet back into the netfilter queues.

I think the function you are looking for is ip_queue_xmit(), which is where IPV4 implementation starts checking tables, filters, etc.

Related

Is there an option or command that I can used to disable/unload/ or stop the tcp/IP stack in linux. Need it to implement user space tcp in server app

I am working a C program that is uses sockets to implement tcp networking in a server application that I am working on. I was wondering is it possible to disable the tcp/ip stack of the kernel so my system do not interfere with incoming connection sync requests and IP packets.
Or I must compile kernel to disable it please tell if this is the case.
On this question How to create a custom packet in c?
it says
Also note that if you are trying to send raw tcp/udp packets, one problem you will have is disabling the network stack automatically processing the reply (either by treating it as addressed to an existing IP address or attempting to forward it).
If thats the case then how can it be possible.
Or is there any tool or program in Linux that can be used to achieve this like this comment Disable TCP/IP Stack from user-space
There is of course the counterintuitive approach of using additional networking functionality to disable normal networking functionality: netfilter. There are a few iptables matches/targets which might prove beneficial to you (e.g., the “owner” match that may deny or accept based on PID or UID). This still means the functionality is in the kernel, it just limits it.
if someone knows from right above then how can this be done are there any commands?
Well, you could compile yourself a kernel without networking :)
A couple of options
Check out the DPDK project (https://www.linuxjournal.com/content/userspace-networking-dpdk). DPDK passes the Physical NIC to User space via UIO driver to igb_uio|uio_pci_generic|vfio-pci. Thus eliminates Kernel Stack.
Use XDP supported NIC with either Zero-Copy or Driver-mode. with eBPF running one can push the received packets directly to User space bypassing the kernel stack.
Unless this is a homework project, remember: don't invent, reuse.
[EDIT-based on comment] Userspace TCP-IP stack have custom sock-API to read/write into the socket. So with either LD_PRELOAD or source file change, one can use the same application.

Using sock_create, accept, bind etc in kernel

I'm trying to implement an echo TCP server as a loadable kernel module.
Should I use sock_create, or sock_create_kern?
Should I use accept, or kernel_accept?
I mean it does make sense that I should use kernel_accept for example; but I don't know why. Can't I use normal sockets in the kernel?
The problem is, you are trying to shoehorn an user space application into the kernel.
Sockets (and files and so on) are things the kernel provides to userspace applications via the kernel-userspace API/ABI. Some, but not all, also have an in-kernel callable, for cases when another kernel thingy wishes to use something provided to userspace.
Let's look at the Linux kernel implementation of the socket() or accept() syscalls, in net/socket.c in the kernel sources; look for SYSCALL_DEFINE3(socket, and SYSCALL_DEFINE3(accept,, SYSCALL_DEFINE4(recv,, and so on.
(I recommend you use e.g. Elixir Cross Referencer to find specific identifiers in the Linux kernel sources, then look up the actual code in one of the official kernel Git trees online; that's what I do, anyway.)
Note how pointer arguments have a __user qualifier: this means the data pointed to must reside in user space, and that the functions will eventually use copy_from_user()/copy_to_user() to retrieve or set the data. Furthermore, the operations access the file descriptor table, which is part of the process context: something that normally only exist for userspace processes.
Essentially, this means your kernel module must create an userspace "process" (enough of one to satisfy the requirements of crossing the userspace-kernel boundary when using kernel interfaces) to "hold" the memory and file descriptors, at minimum. It is a lot of work, and in the end, it won't be any more performant than an userspace application would be. (Linux kernel developers have worked on this for literally decades. There are some proprietary operating systems where doing stuff in "kernel space" may be faster, but that is not so in Linux. The cost to do things in userspace is some context switches, and possibly some memory copies (for the transferred data).)
In particular, the TCP/IP and UDP/IP interfaces (see e.g. net/ipv4/udp.c for UDP/IPv4) do not seem to have any interface for kernel-side buffers (other than directly accessing the rx/tx socket buffers, which are in kernel memory).
You have probably heard of TUX web server, a subsystem patch to the Linux kernel by Ingo Molnár. Even that is not a "kernel module server", but more like a subsystem that an userspace process can use to implement a server that runs mostly in kernel space.
The idea of a kernel module that provides a TCP/IP and/or UDP/IP server, is simply like trying to use a hammer to drive in screws. It will work, after a fashion, but the results won't be pretty.
However, for the particular case of an echo server, it just might be possible to bolt it on top of IPv4 (see net/ipv4/) and/or IPv6 (see net/ipv6/) similar to ICMP packets (net/ipv4/icmp.c, net/ipv6/icmp.c). I would consider this route if and only if you intend to specialize in kernel-side networking stuff, as otherwise everything you'd learn doing this is very specialized and not that useful in practice.
If you need to implement something kernel-side for an exercise or something, I'd recommend steering away from "application"-type ideas (services or similar).
Instead, I would warmly recommend developing a character device driver, possibly implementing some kind of inter-process communications layer, preferably bus-style (i.e., one sender, any number of recipients). Something like that has a number of actual real-world use cases (both hardware drivers, as well as stranger things like kdbus-type stuff), so anything you'd learn doing that would be real-world applicable.
(In fact, an echo character device -- which simply outputs whatever is written to it -- is an excellent first target. Although LDD3 is for Linux kernel 2.6.10, it should be an excellent read for anyone diving into Linux kernel development. If you use a more recent kernel, just remember that the example code might not compile as-is, and you might have to do some research wrt. Linux kernel Git repos and/or a kernel source cross referencer like Elixir above.)
In short sockets are just a mechanism that enable two processes to talk, localy or remotely.
If you want to send some data from kernel to userspace you have to use kernel sockets sock_create_kern() with it's family of functions.
What would be the benefit of TCP echo server as kernel module?
It makes sense only if your TCP server provides data which is otherwise not accessible from userspace, e.g. read some post-mortem NVRAM which you can't read normally and to send it to rsyslog via socket.

How to create a custom packet in c?

I'm trying to make a custom packet using C using the TCP/IP protocol. When I say custom, I mean being able to change any value from the packet; ex: MAC, IP address and so on.
I tried searching around but I can't find anything that is actually guiding me or giving me example source codes.
How can I create a custom packet or where should I look for guidance?
A relatively easy tool to do this that is portable is libpcap. It's better known for receiving raw packets (and indeed it's better you play with that first as you can compare received packets with your hand crafted ones) but the little known pcap_sendpacket will actually send a raw packet.
If you want to do it from scratch yourself, open a socket with AF_PACKET and SOCK_RAW (that's for Linux, other OS's may vary) - for example see http://austinmarton.wordpress.com/2011/09/14/sending-raw-ethernet-packets-from-a-specific-interface-in-c-on-linux/ and the full code at https://gist.github.com/austinmarton/1922600 . Note you need to be root (or more accurately have the appropriate capability) to do this.
Also note that if you are trying to send raw tcp/udp packets, one problem you will have is disabling the network stack automatically processing the reply (either by treating it as addressed to an existing IP address or attempting to forward it).
Doing this sort of this is not as simple as you think. Controlling the data above the IP layer is relatively easy using normal socket APIs, but controlling data below is a bit more involved. Most operating systems make changing lower-level protocol information difficult since the kernel itself manages network connections and doesn't want you messing things up. Beyond that, there are other platform differences, network controls, etc that can play havoc on you.
You should look into some of the libraries that are out there to do this. Some examples:
libnet - http://libnet.sourceforge.net/
libdnet - http://libdnet.sourceforge.net/
If your goal is to spoof packets, you should read up on network-based spoofing mitigation techniques too (for example egress filtering to prevent spoofed packets from exiting a network).

Kernel bypass for UDP and TCP on Linux- what does it involve?

Per http://www.solacesystems.com/blog/kernel-bypass-revving-up-linux-networking:
[...]a network driver called OpenOnload that use “kernel bypass” techniques to run the application and network driver together in user space and, well, bypass the kernel. This allows the application side of the connection to process many more messages per second with lower and more consistent latency.
[...]
If you’re a developer or architect who has fought with context switching for years kernel bypass may feel like cheating, but fortunately it’s completely within the rules.
What are the functions needed to do such kernel bypassing?
A TCP offload engine will "just work", no special application programming needed. It doesn't bypass the whole kernel, it just moves some of the TCP/IP stack from the kernel to the network card, so the driver is slightly higher level. The kernel API is the same.
TCP offload engine is supported by most modern gigabit interfaces.
Alternatively, if you mean "running code on a SolarFlare network adapter's embedded processor/FPGA 'Application Onload Engine'", then... that's card-specific. You're basically writing code for an embedded system, so you need to say which kind of card you're using.
Okay, so the question is not straight forward to answer without knowing how the kernel handles the network stack.
In generel the network stack is made up of a lot of layers, with the lowest one being the actual hardware, typically this hardware is supported by means of drivers (one for each network interface), the nic's typically provide very simple interfaces, think recieve and send raw data.
On top of this physical connection, with the ability to recieve and send data is a lot of protocols, which are layered as well, near the bottem is the ip protocol, which basically allows you to specify the reciever of your information, while at the top you'll find TCP which supports stable connections.
So in order to answer your question, you most first figure out which part of the network stack you'll need to replace, and what you'll need to do. From my understanding of your question it seems like you'll want to keep the original network stack, and then just sometimes use your own, and in that case you should really just implement the strategy pattern, and make it possible to state which packets should be handled by which toplevel of the network stack.
Depending on how the network stack is implemented in linux, you may or may not be able to achieve this, without kernel changes. In a microkernel architecture, where each part of the network stack is implemented in its own service, this would be trivial, as you would simply pipe your lower parts of the network stack to your strategy pattern, and have this pipe the input to the required network toplevel layers.
Do you perhaps want to send and recieve raw IP packets?
Basically you will need to fill in headers and data in a ip-packet.
There are some examples here on how to send raw ethernet packets:
:http://austinmarton.wordpress.com/2011/09/14/sending-raw-ethernet-packets-from-a-specific-interface-in-c-on-linux/
To handle TCP/IP on your own, i think that you might need to disable the TCP driver in a custom kernel, and then write your own user space server that reads raw ip.
It's probably not that efficient though...

Hooking into the TCP Stack in C

It's not just a capture I'm looking to do here. I want to first capture the packet, then in real time, check the payload for specific data, remove it, inject a signature and reinject the packet into the stack to be sent on as before.
I had a read of the ipfw divert sockets using IPFW and it looks very promising. What about examples in modifying packets and reinjecting them back into the stack using divert sockets? Also, as a matter of curiosity, would it be possible to read the data from the socket using Java or would this restrict me with packing mangling and reinjecting etc?
See divert sockets: Divert Sockets mini HOWTO.
They work by passing traffic matching a certain ipfw rule to a special raw socket that can then reinject altered traffic into the network layers.
If you're just looking for packet capture, libpcap is very popular. It's used in basic tools such as tcpdump and ethereal. As far as "hooking into the stack", unless you plan on fundamentally changing the way the way the networking is implemented (i.e. add your own layer or alter the behavior of TCP), your idea of using IPF for packet modification or intervention seems like the best bet. In Linux they have a specific redirection target for userspace modules, IPF probably has something similar or you could modify IPF to do something similar.
If you are just interested in seeing the packets, then libpcap is the way to go. You can find it at: http://www.tcpdump.org/
It's possible to do this in userspace with the QUEUE or NFQUEUE iptables target I think. The client application attaches to a queue and receives all matching packets, which it can modify before they're re-injected (it can also drop them if it wants).
There is a client library libnetfilter_queue which it needs to link against. Sadly documentation is minimal, but there are some mailing list posts and examples knocking around.
For performance reasons, you won't want to do this to every packet, but only specific matching ones, which you'll have to match using standard iptables rules. If that doesn't do enough, you'll need to write your own netfilter kernel module.
I was going to echo other responses that have recommended iptables (depending on the complexity of both the patterns that you're trying to match and the packet modifications that you want to make) - until I took notice of the BSD tag on the question.
As Stephen Pellicer has already mentioned, libpcap is a good option for capturing the packets. I believe, though, that libpcap can also be used to send packets. For reference I'm pretty sure that tcpreplay uses it to replay pcap formatted files.

Resources