Working of Raw Sockets in the Linux kernel - c

I'm working on integrating the traffic control layer of the Linux kernel with a custom user-level network stack, and I'm using raw sockets to do so. My question is: if we use raw sockets with AF_PACKET, SOCK_RAW, and IPPROTO_RAW, will dev_queue_xmit (the function which is the starting point of the queueing layer, as far as I've read) be called? Or does the socket interface call the network card driver directly?

SOCK_RAW indicates that the userspace program should receive the L2 (link-layer) header in the message.
IPPROTO_RAW applies the same for the L3 (IP) header.
A userspace program sets SOCK_RAW and/or IPPROTO_RAW to manually parse and/or compose the protocol headers of a packet. This guarantees that the kernel doesn't modify the corresponding layer's header on the way to or from userspace. A raw socket doesn't change the way packets are received or transmitted: they are queued as usual, so by default frames sent through a packet socket still pass through the queueing (traffic control) layer. From the network driver's perspective, it doesn't matter who set the headers: userspace (raw sockets) or the kernel (e.g., SOCK_DGRAM).
Keep in mind that getting raw packets requires CAP_NET_RAW capability - usually, the program needs to run with superuser privileges.
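A minimal sketch of the transmit side, under stated assumptions. According to packet(7), frames sent through a packet socket pass through the kernel's qdisc layer by default, and the PACKET_QDISC_BYPASS socket option (Linux 3.14 and later) skips it. The helper names open_packet_socket and build_eth_frame are made up for this illustration and are not part of any API.

```c
#include <arpa/inet.h>        /* htons */
#include <linux/if_ether.h>   /* ETH_ALEN, ETH_HLEN, ETH_P_ALL */
#include <linux/if_packet.h>  /* PACKET_QDISC_BYPASS */
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open an AF_PACKET/SOCK_RAW socket.
 * Returns the fd, or -1 on error (e.g. EPERM without CAP_NET_RAW).
 * With bypass_qdisc == 0 the default transmit path is kept, so frames
 * go through the queueing (traffic control) layer. */
int open_packet_socket(int bypass_qdisc)
{
    int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0)
        return -1;

#ifdef PACKET_QDISC_BYPASS
    if (bypass_qdisc) {
        int one = 1;
        if (setsockopt(fd, SOL_PACKET, PACKET_QDISC_BYPASS,
                       &one, sizeof(one)) < 0) {
            close(fd);
            return -1;
        }
    }
#else
    (void)bypass_qdisc;  /* option unknown to these headers: ignored */
#endif
    return fd;
}

/* Hypothetical helper: compose a complete Ethernet frame in `out`.
 * With SOCK_RAW the caller supplies the link-layer header itself.
 * Returns the frame length, or 0 if `out` is too small. */
size_t build_eth_frame(uint8_t *out, size_t cap,
                       const uint8_t dst[ETH_ALEN],
                       const uint8_t src[ETH_ALEN],
                       uint16_t ethertype,
                       const void *payload, size_t payload_len)
{
    if (cap < ETH_HLEN || payload_len > cap - ETH_HLEN)
        return 0;
    memcpy(out, dst, ETH_ALEN);
    memcpy(out + ETH_ALEN, src, ETH_ALEN);
    out[12] = (uint8_t)(ethertype >> 8);    /* EtherType, big-endian */
    out[13] = (uint8_t)(ethertype & 0xff);
    memcpy(out + ETH_HLEN, payload, payload_len);
    return ETH_HLEN + payload_len;
}
```

The resulting buffer would then be handed to sendto(2) together with a struct sockaddr_ll naming the outgoing interface. Leaving bypass_qdisc at 0 is the choice that keeps traffic control in the path.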

Related

Berkeley raw socket exclusive access (Linux)

I have implemented my own raw socket operating on "raw" Ethernet frames (socket(AF_PACKET,SOCK_RAW,htons(ETH_P_ALL));) and bound it to one specific network interface. Sending and receiving raw packets works like a charm; however, when I use Wireshark I can still see more traffic than I have introduced (for example ARP packets, ...). This is expected but not wanted.
Is there a way (either in code or by "hardening" the Ethernet interface through modified settings) to disable the kernel's IP processing layer (or, better put, all layers above the Ethernet layer) so that only raw socket traffic is allowed?
Referring to that image when talking about layers: https://www.opensourceforu.com/2015/03/a-guide-to-using-raw-sockets/

How does the AF_PACKET socket work in Linux?

I am trying to write a C sniffer for Linux, and understand the actions happening in the kernel while sniffing.
I am having troubles finding an answer for the following question:
If I initialize my socket in the following way:
sock_raw = socket(AF_PACKET , SOCK_RAW , htons(ETH_P_ALL));
What happens in the kernel? How am I seeing all the incoming and outgoing packets, but not "hijacking" them? What I have understood so far is that when the kernel receives a packet, it sends it to the relevant protocol handler function. So I can't understand: does the kernel clone the packet and also send the copy to the socket I opened?
What happens in the kernel?
The kernel simply duplicates the packets as soon as it receives them from the physical layer (for incoming packets) or just before sending them out to the physical layer (for outgoing packets). One copy of each packet is sent to your socket (ETH_P_ALL selects all protocols; an unbound socket listens on all interfaces, but you can also bind(2) it to a particular one). After a copy is sent to your socket, the other copy continues being processed as it normally would (e.g. identifying and decoding the protocol, checking firewall rules, etc.).
How am I seeing all the incoming and outgoing packets, but not "hijacking" them?
In order for hijacking to happen, you would need to write data to the socket injecting new packets (accurately crafted depending on the protocol you want to hijack). If you only read incoming packets, you are merely sniffing, without hijacking anything.
does the kernel clone the packet and sends it in addition to the socket I opened?
Yes, that's basically what happens. This image could help you visualize it.
man 7 packet also describes this:
Packet sockets are used to receive or send raw packets at the device driver (OSI Layer 2) level. They allow the user to implement protocol modules in user space on top of the physical layer.
The socket_type is either SOCK_RAW for raw packets including the link-level header or SOCK_DGRAM for cooked packets with the link-level header removed. The link-level header information is available in a common format in a sockaddr_ll structure. protocol is the IEEE 802.3 protocol number in network byte order. See the <linux/if_ether.h> include file for a list of allowed protocols. When protocol is set to htons(ETH_P_ALL), then all protocols are received. All incoming packets of that protocol type will be passed to the packet socket before they are passed to the protocols implemented in the kernel.
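A minimal sketch of the receive side described above, assuming a Linux system and the CAP_NET_RAW capability. The names open_sniffer, parse_eth_header and struct eth_summary are invented for this example; the socket only ever receives copies, so reading from it does not take packets away from the kernel's own protocol handlers.

```c
#include <arpa/inet.h>        /* htons */
#include <linux/if_ether.h>   /* ETH_ALEN, ETH_HLEN, ETH_P_ALL */
#include <linux/if_packet.h>  /* struct sockaddr_ll */
#include <net/if.h>           /* if_nametoindex */
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical summary of the link-layer header of one frame. */
struct eth_summary {
    uint8_t  dst[ETH_ALEN];
    uint8_t  src[ETH_ALEN];
    uint16_t ethertype;       /* host byte order */
};

/* Parse the 14-byte Ethernet header that SOCK_RAW leaves in place.
 * Returns 0 on success, -1 if the buffer is too short. */
int parse_eth_header(const uint8_t *frame, size_t len,
                     struct eth_summary *out)
{
    if (len < ETH_HLEN)
        return -1;
    memcpy(out->dst, frame, ETH_ALEN);
    memcpy(out->src, frame + ETH_ALEN, ETH_ALEN);
    out->ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    return 0;
}

/* Open a sniffing socket for all protocols. If ifname is non-NULL the
 * socket is bound to that interface; otherwise it sees every interface.
 * Returns the fd, or -1 on error (e.g. missing CAP_NET_RAW). */
int open_sniffer(const char *ifname)
{
    int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0)
        return -1;

    if (ifname != NULL) {
        struct sockaddr_ll sll;
        memset(&sll, 0, sizeof(sll));
        sll.sll_family   = AF_PACKET;
        sll.sll_protocol = htons(ETH_P_ALL);
        sll.sll_ifindex  = (int)if_nametoindex(ifname);
        if (sll.sll_ifindex == 0 ||
            bind(fd, (struct sockaddr *)&sll, sizeof(sll)) < 0) {
            close(fd);
            return -1;
        }
    }
    return fd;
}
```

A sniffer loop would call recv(2) on the returned descriptor and pass each buffer to parse_eth_header before decoding the upper layers.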

What is the internal mechanics of socket() function?

I am trying to use the BlueZ HCI function:
int hci_open_dev(int dev_id) {...}
which internally tries to create a socket like this:
socket(AF_BLUETOOTH, SOCK_RAW | SOCK_CLOEXEC, BTPROTO_HCI);
I tried to understand the linux kernel code for socket() but feel lost.
I'd like to know what exactly it means to create a socket for the given domain (AF_BLUETOOTH), data transmission type (SOCK_RAW) and protocol (BTPROTO_HCI).
The man page just states that it takes these params, creates a socket and returns a file descriptor.
But I'd like to understand what exactly happens and the exact kernel steps involved in creating a socket.
Here is a very broad description (hope that helps understanding the main scheme).
Kernel developers will probably be horrified...
A socket is a common abstract interface for many different means of communication.
It provides many generic operations, such as closing, sending/receiving data, setting/retrieving options, which can be used on almost any kind of socket.
Creating a socket implies specifying the exact properties of this communication means.
It's a bit like the instantiation of a concrete type implementing an interface.
These properties are first organised by protocol families; this is the first argument to the socket() call.
For example:
PF_INET is used for communications relying on IPv4,
PF_INET6 is used for communications relying on IPv6,
PF_LOCAL is used for inter-process communication inside the system (kind of pipe),
PF_NETLINK is used for communication with the OS kernel,
PF_PACKET is used for direct communication with network interfaces,
... (there exist many of them)
Once a protocol family is chosen, you have to specify which protocol you want to use among those provided by this family; this is the second argument to the socket() call.
For example:
SOCK_DGRAM is used for UDP over IPv4 or IPv6, or distinct messages in PF_LOCAL,
SOCK_STREAM is used for TCP over IPv4 or IPv6, or a continuous byte stream in PF_LOCAL,
SOCK_RAW directly accesses the raw underlying protocol in the family, if any (IPv4 or IPv6, for example),
... (each family can provide many of them)
Some protocols can accept some variants or some restrictions; this is the third argument to the socket() call.
Often 0 is sufficient, but for example we can find:
PF_PACKET, SOCK_RAW, htons(ETH_P_ALL) to capture any kind of network packet received on a network interface,
PF_PACKET, SOCK_RAW, htons(ETH_P_ARP) to capture only ARP frames,
When we ask for the creation of a socket with these three arguments, the operating system creates an internal resource associated with the socket handle which will be obtained.
Of course, the exact structure of this resource depends on the chosen family/protocol/variant, and it is associated to kernel callbacks which are specific to it.
Each time an operation is invoked on this socket (through a system call), the specific callback will be called.
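The "concrete type implementing an interface" idea can be illustrated with a toy user-space model. This is only an analogy and not kernel code: every name below (toy_ops, toy_sock, toy_socket, toy_send, the TOY_PF_* constants) is invented for the example. The kernel's real structures, such as struct net_proto_family and struct proto_ops, are far richer.

```c
#include <stddef.h>

/* Toy family numbers, made up for this example. */
enum { TOY_PF_LOCAL = 1, TOY_PF_PACKET = 2 };

/* The "interface": operations every toy socket must provide. */
struct toy_ops {
    const char *name;
    size_t (*send)(const char *data, size_t len);
};

/* The per-socket resource created by toy_socket(). */
struct toy_sock {
    int family, type, protocol;
    const struct toy_ops *ops;   /* family-specific callbacks */
};

/* "local" family: pretend a byte stream accepts everything. */
static size_t local_send(const char *data, size_t len)
{
    (void)data;
    return len;
}

/* "packet" family: pretend a frame needs a 14-byte link-layer header. */
static size_t packet_send(const char *data, size_t len)
{
    (void)data;
    return len < 14 ? 0 : len;
}

static const struct toy_ops local_ops  = { "local",  local_send  };
static const struct toy_ops packet_ops = { "packet", packet_send };

/* Analogue of socket(): pick the callbacks from the family argument.
 * Returns 0 on success, -1 for an unknown family. */
int toy_socket(struct toy_sock *s, int family, int type, int protocol)
{
    switch (family) {
    case TOY_PF_LOCAL:  s->ops = &local_ops;  break;
    case TOY_PF_PACKET: s->ops = &packet_ops; break;
    default:            return -1;
    }
    s->family = family;
    s->type = type;
    s->protocol = protocol;
    return 0;
}

/* Analogue of a generic system call: dispatch to the specific callback. */
size_t toy_send(const struct toy_sock *s, const char *data, size_t len)
{
    return s->ops->send(data, len);
}
```

The caller only ever uses toy_send; which function actually runs was decided once, at creation time, by the family argument. That is the same shape as the generic socket operations described above.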
Please look here: it's a good high-level description of the BlueZ Linux implementation of the Bluetooth stack:
Linux Without Wires The Basics of Bluetooth. Specifically, it gives you a good overview of these BlueZ kernel drivers:
bluetooth.ko, which contains core infrastructure of BlueZ. It exports sockets of the Bluetooth family AF_BLUETOOTH. All BlueZ modules utilise its services.
Bluetooth HCI packets are transported over UART or USB. The corresponding BlueZ HCI implementation is hci_uart.ko and hci_usb.ko.
The L2CAP layer of Bluetooth, which is responsible for segmentation, reassembly and protocol multiplexing, is implemented by l2cap.ko.
With the help of bnep.ko, TCP/IP applications can run over Bluetooth. This emulates an Ethernet port over the L2CAP layer. The kernel thread named kbnepd is responsible for BNEP connections.
rfcomm.ko is responsible for running serial port applications like the terminal. This emulates serial ports over the L2CAP layer. The kernel thread named krfcommd is responsible for RFCOMM connections.
hidp.ko implements the HID (human interface device) layer. The user mode daemon hidd allows BlueZ to handle input devices like Bluetooth mice.
sco.ko implements the synchronous connection oriented (SCO) layer to handle audio. SCO connections do not specify a channel to connect to a remote host; only the host address is specified.
Another excellent resource is the BlueZ project page:
http://www.bluez.org/

How to implement an ethernet modem

Okay, what I want to do, as a training exercise, is to implement something like this
client --ethernet--> Modem1 --GPIO--> Modem2 --ethernet--> My Home Router
Where the client connects to Modem1 using an ethernet cable.
Modem1 is a Raspberry Pi, converting the signal and relaying it via the GPIO.
Modem2 is a Raspberry Pi, which receives the data from the GPIO and sends it via the Ethernet cable to my home router.
I want to implement the Modems, but have little idea where to start.
I have read up a little on Ethernet programming, but still can't find answers to the "simple stuff" like:
How do I implement Modem1 so that when it's connected to the client, the client discovers it as an internet connection?
On the Modem2 end, how do I make "My Home Router" send packets meant for the "client" to Modem2, so that Modem2 may forward them?
And possibly things I haven't thought of...
So, how, concretely, can I implement this? preferably in c.
I'd venture to say you might be able to write some sort of custom GPIO intermediate layer.
Read Ethernet->Encapsulate->Write GPIO->|->Read GPIO->Decapsulate->Write Ethernet
(and vice versa)
The problem then becomes: How can both modems act as "Ethernet proxies"?
Modem1 acts as a proxy for the router. Modem2 acts as a proxy for the client. If your Raspberry Pi can spoof MAC addresses, you might be able to fool Ethernet peers into communicating with your modems' Ethernet port. The reason why you need to spoof MAC addresses is that in TCP/IP networking, there is the ARP table, which maps remote IP addresses to the MAC address that can route IP packets to/from them. This is what allows your client to communicate to your router over TCP/IP.
Another potential pitfall is where your modem communication introduces delays that interfere with the Ethernet layer's handling of the protocol. For example, the Ethernet protocol may have real-time constraints that could be shattered if you introduce delays...
But let's assume anything is possible in a perfect world...
You'll need to write code for reading/writing Ethernet messages (I've seen open source code for reading/writing Ethernet packets over raw sockets in Linux)
You'll need to write a custom driver for your GPIO comms.
This means implementing a carefully thought-out protocol to manage pins state, start-of-message, end-of-message, data-payload, checksum, whatever...
Finally, you'll need to write a top-level communications layer that implements:
Ethernet-to-GPIO process:
a) read from Ethernet port, encapsulates Ethernet packet into a custom message (or message fragments)
b) communicate this custom message, using your custom GPIO protocol driver, to the external GPIO peer
GPIO-to-Ethernet process:
a) Read from GPIO, using your custom driver code
b) Decapsulate Ethernet packet
c) Write Ethernet packet to Ethernet port.
these two processes run forever...
Again, it all hinges on whether or not your modems can insert themselves into a peer-to-peer connection without disturbing the natural flow of the Ethernet protocol...
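The encapsulate/decapsulate steps listed above could be sketched as follows. The wire format here (a 0x7E start byte, a 2-byte big-endian length, the payload, and a 1-byte XOR checksum) is purely an assumption made up for the illustration, as are the names msg_encode and msg_decode; a real GPIO link would likely also need byte stuffing or an end-of-message marker and a stronger checksum.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MSG_SOF      0x7E  /* assumed start-of-message marker */
#define MSG_OVERHEAD 4     /* SOF + 2 length bytes + 1 checksum byte */

/* XOR of all payload bytes: a deliberately simple checksum. */
static uint8_t checksum8(const uint8_t *p, size_t n)
{
    uint8_t sum = 0;
    while (n--)
        sum ^= *p++;
    return sum;
}

/* Wrap one Ethernet frame into a message for the GPIO link.
 * Returns the message length, or 0 if it does not fit. */
size_t msg_encode(uint8_t *out, size_t cap,
                  const uint8_t *frame, size_t len)
{
    if (len > 0xFFFF || cap < len + MSG_OVERHEAD)
        return 0;
    out[0] = MSG_SOF;
    out[1] = (uint8_t)(len >> 8);
    out[2] = (uint8_t)(len & 0xff);
    memcpy(out + 3, frame, len);
    out[3 + len] = checksum8(frame, len);
    return len + MSG_OVERHEAD;
}

/* Unwrap one complete message read from the GPIO link.
 * Returns the frame length, or -1 if the message is malformed,
 * fails the checksum, or does not fit into `frame`. */
long msg_decode(const uint8_t *msg, size_t msg_len,
                uint8_t *frame, size_t cap)
{
    size_t len;

    if (msg_len < MSG_OVERHEAD || msg[0] != MSG_SOF)
        return -1;
    len = ((size_t)msg[1] << 8) | msg[2];
    if (msg_len != len + MSG_OVERHEAD || len > cap)
        return -1;
    if (checksum8(msg + 3, len) != msg[3 + len])
        return -1;
    memcpy(frame, msg + 3, len);
    return (long)len;
}
```

The Ethernet-to-GPIO process would call msg_encode on each frame read from the raw socket, and the GPIO-to-Ethernet process would call msg_decode before writing the frame back out.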
As for the 'C' part...
If you use open source libraries (or code snippets) for reading/writing raw Ethernet via raw sockets, that is most likely written in C.
Your GPIO code will read from and write to the GPIO pins in one of two ways: through a memory-mapped hardware address, or using I/O port calls on that hardware address.
Receive raw Ethernet frames in Linux
Send a raw Ethernet frame in Linux
Good luck

Low latency packet processing with shared memory on Linux?

If I was to receive UDP packets on Linux (and I didn't mind changing some of the source code) what would be the fastest way for my application to read the packets?
Would I want to modify the network stack so that once a UDP packet is received it is written to shared memory and have the application access that memory?
Would there be any way for the stack to notify the application to react, rather than have the application continuously poll the shared memory?
Any advice/further resources are welcome- I have only seen:
http://www.kegel.com/c10k.html
If latency is a problem and the default UDP network stack does not perform as you wish, then try to use different existing (installable) network stacks.
For example, try UDP-Lite and compare it to standard UDP. UDP-Lite lets the sender restrict the checksum to only part of the datagram instead of covering the whole payload, which can reduce latency at the cost of possibly delivering corrupted data to the application layer.
Side note: you do not need a "polling" mechanism. Read the manual of select (and its derivatives such as pselect or ppoll); with such an API, the kernel will "wake up" your application as soon as there is something to read or write in the pipeline.
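A minimal sketch of that wake-up pattern, using poll(2), a close relative of select. The function name wait_and_recv and its return convention are assumptions made for this example; it works on any readable socket descriptor, including a UDP socket.

```c
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Sleep in the kernel until `fd` is readable or the timeout expires,
 * then read one datagram. No busy-waiting happens in user space.
 * Returns 1 if a datagram was read (*out_len holds its length),
 *         0 on timeout,
 *        -1 on error (errno is set; EINTR is not retried here). */
int wait_and_recv(int fd, void *buf, size_t cap,
                  int timeout_ms, ssize_t *out_len)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready = poll(&pfd, 1, timeout_ms);

    if (ready < 0)
        return -1;
    if (ready == 0)
        return 0;

    *out_len = recv(fd, buf, cap, 0);
    return *out_len < 0 ? -1 : 1;
}
```

Passing -1 as timeout_ms blocks until data arrives, which is usually what a receive loop wants; a finite timeout is useful when the loop also has periodic work to do.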

Resources