Network interfaces status tracking on FreeBSD - c

I'm porting some software to FreeBSD 12 (it has never been run on FreeBSD). The software needs to track the system's network interfaces and react immediately to status changes. It is assumed to run with root privileges. FreeBSD 7 offered a combination of kevent and EVFILT_NETDEV, but that filter was removed in FreeBSD 8 and later with no clear replacement.
I know the interfaces can be retrieved with getifaddrs, but I have no idea how to proceed from there and set handlers on AF_INET and AF_INET6 devices to track up/down events.
devd looks promising, given that it can catch the respective IFNET events. Alas, adjusting devd.conf is prohibited on the target system, so I need to implement a similar mechanism in my software. I don't have much time to inspect the source code of devd; I did try, and it only made things more cryptic.
Could anybody show me the right direction to go? Maybe some of the libdev* system-wide libraries?
Thanks.

Found the respective library, which uses devd's multiplexing pipe. It's called libdevdctl; its source code resides in /usr/src/lib/libdevdctl, it is written in C++, and it has no extra dependencies. A combination of DevdCtl::Event::NOTIFY and DevdCtl::Consumer was enough. For some reason the shared library in /usr/lib is called libprivatedevdctl.so, and according to nm output it exposes the needed interface. I reckon it's just an internal library, so it's easier to grab the source and use it as is in your software.
It also has a severe drawback: it polls the socket with a zero timeout in DevdCtl::Consumer::EventsPending, which drastically increases CPU usage.
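One possible way around the busy polling is to read devd's socket directly and block in poll() with no timeout. The sketch below is only an illustration under assumptions that are not confirmed in this thread: that devd exposes a SOCK_SEQPACKET socket at /var/run/devd.seqpacket.pipe, and that IFNET notifications look like "!system=IFNET subsystem=em0 type=LINK_UP". The function names (devd_connect, devd_wait_event, devd_parse_ifnet) are made up for the example and are not part of libdevdctl.

```c
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Assumed location of devd's seqpacket socket; verify it on the target system. */
#define DEVD_SOCKET_PATH "/var/run/devd.seqpacket.pipe"

/* Connect to a devd-style UNIX socket. Returns the descriptor, or -1 on error. */
int devd_connect(const char *path)
{
    struct sockaddr_un addr;
    int fd = socket(AF_UNIX, SOCK_SEQPACKET, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Sleep until one event is available, then copy it into buf as a C string.
 * The poll() timeout of -1 means "block", so no CPU is burned while idle.
 * Returns the event length, or -1 on error.
 */
ssize_t devd_wait_event(int fd, char *buf, size_t len)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    ssize_t n;

    if (len == 0 || poll(&pfd, 1, -1) <= 0)
        return -1;
    n = recv(fd, buf, len - 1, 0);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return n;
}

/*
 * Parse an event in the assumed IFNET notification format.
 * Returns 1 for LINK_UP, 0 for LINK_DOWN, and -1 for anything else.
 * On a successful match the interface name is copied into ifname.
 */
int devd_parse_ifnet(const char *line, char *ifname, size_t ifname_len)
{
    char name[64], type[32];

    if (sscanf(line, "!system=IFNET subsystem=%63s type=%31s", name, type) != 2)
        return -1;
    snprintf(ifname, ifname_len, "%s", name);
    if (strcmp(type, "LINK_UP") == 0)
        return 1;
    if (strcmp(type, "LINK_DOWN") == 0)
        return 0;
    return -1;
}
```

A caller would open the socket once with devd_connect(DEVD_SOCKET_PATH) and then loop on devd_wait_event() followed by devd_parse_ifnet(), ignoring events for which the parser returns -1.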

Related

Partially Porting PJLIB - Without IOQUEUE, select abstraction, and socket abstraction API

I would like to use the PJSIP library to implement a small SIP softphone on an embedded system. Since this embedded system does not offer Linux or support POSIX, I would like to port the PJLIB library only partially, as described here (https://www.pjsip.org/porting.htm#mozTocId30930). The threading function can be deactivated via a macro, but I'm not quite sure yet how I have to set up this new transport function or where exactly it has to be included so that I can also bypass the IOQUEUE implementation and the PJLIB socket abstraction.
On my embedded system (Keil RTX) I can allocate a UDP socket and register a callback which is called on a network event. I also have a send function which I can use to send data packets. Although I have already looked into the stack, I can't find a way to get started.
Has anyone already attempted such a partial port and can give me some brief assistance? Thank you!
See how the Symbian port worked (I think it might have been removed from recent versions, but it should still be downloadable); it was also based on non-POSIX sockets. Create your own platform-specific socket file and ioqueue file.

Linux timers with O_ASYNC?

The man page for open() states for O_ASYNC:
This feature is available only for terminals, pseudoterminals,
sockets, and (since Linux 2.6) pipes and FIFOs. See fcntl(2) for
further details.
But I've used linux timers with epoll() successfully and setting O_ASYNC with fcntl() on a timer fd does not return an error. Obviously, no signals are being sent either. My question is, is it possible to get O_ASYNC working with linux timers? Are there any examples online? I know about the POSIX alternative, but was hoping to avoid it.
We can see that F_SETFL calls setfl which calls a fasync function specific to the type of file.
By searching for fasync we can see how async support is implemented in many devices. Seems that it's not too complicated as mostly it only needs the device to store the async registration and send the signal (here is the fasync function implementation for this kind of file).
Going back to setfl we can notice that if the file type's fasync function is null, it just silently succeeds. This could be a bug, or it could be intentional. Let's assume it's a bug.
Now that the bug is in the kernel, there are probably programs relying on it, which would stop working if the bug was fixed. If a program did break and someone complained about it, the fix would get undone so the program would keep working, because Linus doesn't like to break programs. If it doesn't break any programs that actually exist (which is unlikely, in my opinion), it can be fixed.
Another option is to update the documentation.
Another option is to make it actually work.
Regarding "is it possible to get O_ASYNC working with linux timers?":
It's unlikely (but still possible) that any program is setting O_ASYNC on a timerfd since it doesn't work - so it's unlikely that it will break compatibility. And it looks like it's not terribly complicated to implement, based on the other examples. So, go ahead and write this patch and send it to the mailing list.
If you meant whether it is possible on today's kernels, without a patch, the answer is no. The timerfd file operations structure has no entry for fasync.
Regarding "Are there any examples online?":
Yes, the examples are the source code for all the other kinds of files that support fasync.
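Since O_ASYNC has no effect on a timerfd, the readiness-based approach the question already uses remains the practical route without a kernel patch. Below is a minimal sketch of that approach; the helper names (timer_start_ms, timer_request_async, timer_wait) are hypothetical, and timer_request_async is included only to show the call that is accepted without any SIGIO being delivered.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <poll.h>
#include <stdint.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>

/*
 * Create a one-shot timerfd that expires after delay_ms milliseconds.
 * delay_ms must be greater than zero (a zero value would disarm the timer).
 * Returns the descriptor, or -1 on error.
 */
int timer_start_ms(long delay_ms)
{
    struct itimerspec its = { 0 };
    int fd = timerfd_create(CLOCK_MONOTONIC, 0);

    if (fd < 0)
        return -1;
    its.it_value.tv_sec = delay_ms / 1000;
    its.it_value.tv_nsec = (delay_ms % 1000) * 1000000L;
    if (timerfd_settime(fd, 0, &its, NULL) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Ask for O_ASYNC on the descriptor and return fcntl()'s result.
 * As described above, the kernel accepts the flag on a timerfd
 * but never sends a signal for it.
 */
int timer_request_async(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_ASYNC);
}

/*
 * Block in poll() until the timer expires, then read the expiration count.
 * Returns the number of expirations, or 0 on error.
 */
uint64_t timer_wait(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    uint64_t expirations = 0;

    if (poll(&pfd, 1, -1) <= 0)
        return 0;
    if (read(fd, &expirations, sizeof(expirations)) != (ssize_t)sizeof(expirations))
        return 0;
    return expirations;
}
```

The same descriptor can be registered with epoll instead of poll; the read of the 8-byte expiration counter stays the same.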

Using sock_create, accept, bind etc in kernel

I'm trying to implement an echo TCP server as a loadable kernel module.
Should I use sock_create, or sock_create_kern?
Should I use accept, or kernel_accept?
I mean it does make sense that I should use kernel_accept for example; but I don't know why. Can't I use normal sockets in the kernel?
The problem is that you are trying to shoehorn a userspace application into the kernel.
Sockets (and files and so on) are things the kernel provides to userspace applications via the kernel-userspace API/ABI. Some, but not all, also have an in-kernel callable, for cases when another kernel thingy wishes to use something provided to userspace.
Let's look at the Linux kernel implementation of the socket() or accept() syscalls, in net/socket.c in the kernel sources; look for SYSCALL_DEFINE3(socket, and SYSCALL_DEFINE3(accept,, SYSCALL_DEFINE4(recv,, and so on.
(I recommend you use e.g. Elixir Cross Referencer to find specific identifiers in the Linux kernel sources, then look up the actual code in one of the official kernel Git trees online; that's what I do, anyway.)
Note how pointer arguments have a __user qualifier: this means the data pointed to must reside in user space, and that the functions will eventually use copy_from_user()/copy_to_user() to retrieve or set the data. Furthermore, the operations access the file descriptor table, which is part of the process context: something that normally only exists for userspace processes.
Essentially, this means your kernel module must create a userspace "process" (enough of one to satisfy the requirements of crossing the userspace-kernel boundary when using kernel interfaces) to "hold" the memory and file descriptors, at minimum. It is a lot of work, and in the end, it won't be any more performant than a userspace application would be. (Linux kernel developers have worked on this for literally decades. There are some proprietary operating systems where doing stuff in "kernel space" may be faster, but that is not so in Linux. The cost of doing things in userspace is some context switches, and possibly some memory copies (for the transferred data).)
In particular, the TCP/IP and UDP/IP interfaces (see e.g. net/ipv4/udp.c for UDP/IPv4) do not seem to have any interface for kernel-side buffers (other than directly accessing the rx/tx socket buffers, which are in kernel memory).
You have probably heard of the TUX web server, a subsystem patch to the Linux kernel by Ingo Molnár. Even that is not a "kernel module server", but more like a subsystem that a userspace process can use to implement a server that runs mostly in kernel space.
The idea of a kernel module that provides a TCP/IP and/or UDP/IP server, is simply like trying to use a hammer to drive in screws. It will work, after a fashion, but the results won't be pretty.
However, for the particular case of an echo server, it just might be possible to bolt it on top of IPv4 (see net/ipv4/) and/or IPv6 (see net/ipv6/) similar to ICMP packets (net/ipv4/icmp.c, net/ipv6/icmp.c). I would consider this route if and only if you intend to specialize in kernel-side networking stuff, as otherwise everything you'd learn doing this is very specialized and not that useful in practice.
If you need to implement something kernel-side for an exercise or something, I'd recommend steering away from "application"-type ideas (services or similar).
Instead, I would warmly recommend developing a character device driver, possibly implementing some kind of inter-process communications layer, preferably bus-style (i.e., one sender, any number of recipients). Something like that has a number of actual real-world use cases (both hardware drivers, as well as stranger things like kdbus-type stuff), so anything you'd learn doing that would be real-world applicable.
(In fact, an echo character device -- which simply outputs whatever is written to it -- is an excellent first target. Although LDD3 is for Linux kernel 2.6.10, it should be an excellent read for anyone diving into Linux kernel development. If you use a more recent kernel, just remember that the example code might not compile as-is, and you might have to do some research wrt. Linux kernel Git repos and/or a kernel source cross referencer like Elixir above.)
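For comparison with the in-kernel idea, the userspace version of a TCP echo server is short. The following is a minimal sketch, not a hardened server: it handles one client at a time, and the function names (tcp_listen, echo_until_eof, echo_serve_forever) are invented for this illustration.

```c
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a TCP socket listening on the given port (all IPv4 addresses). */
int tcp_listen(uint16_t port)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 || listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Send back everything received on fd until the peer closes its side.
 * Returns the total number of bytes echoed, or -1 on error.
 */
long echo_until_eof(int fd)
{
    char buf[4096];
    long total = 0;

    for (;;) {
        ssize_t off = 0;
        ssize_t n = recv(fd, buf, sizeof(buf), 0);

        if (n == 0)
            return total;               /* orderly shutdown by the peer */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        while (off < n) {               /* send() may write only part of the data */
            ssize_t w = send(fd, buf + off, (size_t)(n - off), MSG_NOSIGNAL);

            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
}

/* Accept clients one at a time and echo their data. Returns only on error. */
int echo_serve_forever(int listen_fd)
{
    for (;;) {
        int client = accept(listen_fd, NULL, NULL);

        if (client < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        echo_until_eof(client);
        close(client);
    }
}
```

A program would call tcp_listen() with a port of its choice and pass the result to echo_serve_forever(); all buffers and descriptors live in the process, which is exactly the context the syscall interfaces above expect.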
In short, sockets are just a mechanism that enables two processes to talk, locally or remotely.
If you want to send some data from the kernel to userspace, you have to use kernel sockets: sock_create_kern() and its family of functions.
What would be the benefit of TCP echo server as kernel module?
It makes sense only if your TCP server provides data that is otherwise not accessible from userspace, e.g. reading some post-mortem NVRAM that you can't read normally and sending it to rsyslog via a socket.

BSD Packet Interception (Not Copying)

I want to get in the middle of packet forwarding (not routing). For example, the system is a layer 2 bridge between hosts and their gateway. I want to inspect layer 7 for a string such as "foo" and forward, drop, or delay the packet based on the result. What I am having trouble with is intercepting the packet.
What I have read so far:
I know I can get a copy of the packet from the BPF device (USENIX paper by Steven McCanne and Van Jacobson, http://www.tcpdump.org/papers/bpf-usenix93.pdf). That is good for sniffing, but not for me.
I can access the PF device and set filtering rules, which is good for forwarding or dropping decisions, but not for inspection. See pf(4).
I can get packets into the ALTQ queues, but I do not know how to access the individual packets located in a queue. See altq(9).
I have also looked into the source code for PF (/usr/src/sys/contrib/pf/net), PFCTL (/usr/src/contrib/pf/pfctl) and ALTQ (/usr/src/sys/contrib/altq/altq).
This is on a FreeBSD 9.1 machine.
I am not a C expert, but I am good with it.
Maybe I am getting tired today with all the reading and missed something trivial. Please forgive me if so. Plus, this will be a very good find for those looking into the subject.
P.S. There is a way of controlling the flow of "foo", by detecting "foo" in packet and denying the answer to that from coming back by setting up the filter for answer to that request. This is NOT what I am trying to achieve. I do not want the packet to leave the system if it should not.
EDIT 2 P.S. There is a great way of doing this on Linux. I can achieve everything I mentioned here on Linux with libnetfilter_queue. I will not bother posting solution here because there are many many many tutorials on how to do it on Linux.
In conclusion, I am still looking for an answer on how to do this on BSD. As far as I can understand, I need to write a wrapper/library based on pf (because there is no such thing on the net, otherwise I should have found it already) that does the same thing as libnetfilter with its libnetfilter_queue library. Or I could somehow dig into libnetfilter and port it to FreeBSD, but since it is based on iptables, the only thing I can get from digging into the libnetfilter library is the logic and algorithms, not the actual code itself, which could prove to be of no use to me.
FreeBSD 9.1 has a userspace framework for packet access called netmap. It was introduced recently and performs and scales amazingly well. It does a very simple but powerful thing: it mmaps the NIC buffers into userspace memory and detaches packet processing from the host stack. This was exactly what I needed; the rest is on me.
If anyone needs a good reference for this, please refer to netmap(4).
Have a look at OpenDPI or nDPI.
Check out divert sockets in the BSD implementation as well. Unlike netmap, they are not zero-copy (IMHO); however, they work together with ipfw, which lets you set up the filters that select the packets you want to process.
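To make the divert-socket suggestion more concrete, here is a rough sketch of the inspect-then-reinject idea. It rests on several assumptions: an ipfw divert rule already sends the traffic of interest to the chosen port, the buffer handed over starts at the IPv4 header, and the string of interest fits inside a single TCP segment. Whether bridged layer 2 traffic reaches a divert rule depends on the system configuration and is not verified here. The function names are hypothetical, and the divert loop is compiled only where IPPROTO_DIVERT is defined.

```c
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Return 1 if needle occurs in the TCP payload of an IPv4 packet, else 0.
 * pkt must point at the IPv4 header. Fragments and data split across
 * several segments are not reassembled in this sketch.
 */
int tcp_payload_contains(const uint8_t *pkt, size_t len, const char *needle)
{
    size_t nlen = strlen(needle);
    size_t ihl, total, doff, off, plen, i;

    if (len < 20 || (pkt[0] >> 4) != 4)
        return 0;
    ihl = (size_t)(pkt[0] & 0x0f) * 4;          /* IPv4 header length */
    total = ((size_t)pkt[2] << 8) | pkt[3];     /* IPv4 total length */
    if (total > len)
        total = len;
    if (ihl < 20 || pkt[9] != IPPROTO_TCP || total < ihl + 20)
        return 0;
    doff = (size_t)(pkt[ihl + 12] >> 4) * 4;    /* TCP header length */
    if (doff < 20 || total < ihl + doff)
        return 0;
    off = ihl + doff;
    plen = total - off;
    if (nlen == 0 || nlen > plen)
        return 0;
    for (i = 0; i + nlen <= plen; i++)
        if (memcmp(pkt + off + i, needle, nlen) == 0)
            return 1;
    return 0;
}

#ifdef IPPROTO_DIVERT   /* FreeBSD divert(4) sockets */
/* Read diverted packets; reinject those without needle, drop the rest. */
int divert_filter_loop(uint16_t port, const char *needle)
{
    static uint8_t pkt[65535];
    struct sockaddr_in sin, from;
    socklen_t flen;
    ssize_t n;
    int fd = socket(PF_INET, SOCK_RAW, IPPROTO_DIVERT);

    if (fd < 0)
        return -1;
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_port = htons(port);                 /* port named in the ipfw divert rule */
    if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
        close(fd);
        return -1;
    }
    for (;;) {
        flen = sizeof(from);
        n = recvfrom(fd, pkt, sizeof(pkt), 0, (struct sockaddr *)&from, &flen);
        if (n < 0) {
            close(fd);
            return -1;
        }
        /* A packet that is not written back never leaves the system. */
        if (!tcp_payload_contains(pkt, (size_t)n, needle))
            sendto(fd, pkt, (size_t)n, 0, (struct sockaddr *)&from, flen);
    }
}
#endif
```

With netmap the same inspection function could be reused, but the buffer would start at the Ethernet header, so the caller would first have to skip the link-layer header.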

Is there a Windows equivalent for eventfd?

I am writing a cross-platform library that emulates socket behaviour, with additional functionality in between (App->mylib->sockets).
I want it to be as transparent as possible for the programmer, so primitives like select and poll must work accordingly with this lib.
The problem is that when data becomes available (for instance) in the real socket, it will have to go through a lot of processing, so if select points to the real socket fd, the app will be blocked for a long time. I want the select/poll to unblock only when data is ready to be consumed (after my lib has done all the processing).
So I came across eventfd, which allows me to do exactly what I want, i.e. to manipulate select/poll behaviour on a given fd.
Since I am much more familiar with the Linux environment, I don't know what the Windows equivalent of eventfd is. I tried to search but had no luck.
Note:
Another approach would be to use another socket connected with the interface, but that seems like a lot of overhead: making a system call with all the data just because Windows doesn't have (or so it appears) this functionality.
Or I could just implement my own select, reinventing the wheel. =/
There is none. eventfd is a Linux-specific feature -- it's not even available on other UNIXy operating systems, such as BSD and Mac OS X.
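For reference, the Linux-side behaviour the question wants to reproduce can be sketched as follows: the library hands the application an eventfd and makes it readable only once processed data is ready. This is a minimal illustration; the ready_fd_* names are hypothetical helpers, not part of any existing library.

```c
#define _GNU_SOURCE
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Create the descriptor the application will pass to select()/poll(). */
int ready_fd_create(void)
{
    return eventfd(0, EFD_NONBLOCK);
}

/* Called by the library once processed data is ready: makes the fd readable. */
int ready_fd_signal(int fd)
{
    uint64_t one = 1;

    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Called once the application has consumed the data: resets the counter. */
int ready_fd_clear(int fd)
{
    uint64_t value;

    return read(fd, &value, sizeof(value)) == (ssize_t)sizeof(value) ? 0 : -1;
}

/* Non-blocking check of what select()/poll() would report for the fd. */
int ready_fd_is_readable(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    return poll(&pfd, 1, 0) > 0 && (pfd.revents & POLLIN) != 0;
}
```

The application only ever sees the eventfd, so its select/poll call wakes up when ready_fd_signal() runs rather than when raw bytes arrive on the real socket.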
Yes, but it's ridiculous. You can make a Layered Service Provider (globally installed...) that fiddles with the system's network stack. You get to implement all the WinSock2 functions yourself, and forward most of them to the underlying TCP. This is often used by firewalls or antivirus programs to insert themselves into the stack and see what's going on.
In your case, you'd want to use an ioctl to turn on "special" behaviour for your application. Whenever the app tries to create a socket, the call gets forwarded to your function, which in turn opens a real TCP socket (say). Instead of returning that HANDLE, though, you use a WinSock function to ask the kernel for a dummy handle, and give that to the application instead. You do your stuff in a thread. Then, when the app calls WinSock functions on the dummy handle, they end up in your implementation of read, select, etc. You can decouple select notifications on the dummy handle from those on the actual handle. This lets you do things like, for example, transparently give an app a socket that wraps data each way in encryption, indistinguishably from the original socket. (Almost indistinguishably! You can call some LSP APIs on a handle to find out whether there's actually an underlying handle you weren't given.)
Pretty heavy-weight, and monstrous in some ways. But, it's there... Hope that's a useful overview.
