Does the loopback interface generate interrupts on the NIC/hardware?
The loopback interface is a virtual network interface. It doesn't correspond to any actual hardware, and packets transmitted through it therefore do not generate hardware interrupts.
Linux does have a concept of a "soft interrupt" (softirq), an in-kernel deferred-work mechanism, and packets sent over the loopback are delivered through one: the loopback transmit path hands the buffer straight back to the receive softirq, so the packet is processed entirely in software. This was already the case in the 2.6 series.
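You can see this in the driver itself. Below is a simplified sketch of the transmit function in drivers/net/loopback.c (condensed; the real function also updates per-CPU statistics):

static netdev_tx_t loopback_xmit(struct sk_buff *skb, struct net_device *dev)
{
    skb_orphan(skb);                          /* detach from the sending socket */
    skb->protocol = eth_type_trans(skb, dev);

    /* No hardware, no IRQ: hand the buffer straight back to the
     * receive path, which raises the NET_RX softirq. */
    netif_rx(skb);
    return NETDEV_TX_OK;
}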
Related
I am writing a simple net device driver based on the loopback driver and want to register my net_device structure. This and that page on writing a net device say to just call register_netdev. But they're writing fancy drivers with PCI express and other complicated things.
So, if I just want something like the loopback driver, I should presumably base my code on loopback.c. My question is, what does the first line of this code in loopback_net_init do:
dev_net_set(dev, net);
err = register_netdev(dev);
Apparently net is determined by this code in net_namespace.c:
register_pernet_device(ops) ...
__register_pernet_operations(list, ops)
for_each_net(net) ...
What is this looping for? What might go wrong if I skip the dev_net_set call? Why are others not using it?
AFAIK, net is the network namespace (struct net) in which the device will be registered. You need it to register the device, and you must unregister the device in the module cleanup function. Please review the code under linux/net/8021q/ for examples.
AFAIK, loopback happens at the level of the network subsystem of the kernel (the socket and protocol layers, roughly layers 5-7), whereas a net_device is the kernel component that immediately interacts with a driver when you actually want to use, say, an Ethernet card, or SLIP/PLIP, to transmit frames (layer 2 and below). Loopback lies well above the drivers that interact with hardware, so I don't see why you would need a driver to use the loopback feature. However, there is also a provision for registering a dummy device as a net_device, though I don't know if that is what you are looking for.
That said, if your intention is simply to have a driver that simulates a physical device without one and, say, reflects the packets it receives, that is possible too. Basically, up to the net_device layer the kernel does all the protocol work (TCP/IP), and finally passes the packet off to a handler that the device driver has registered. Similarly, on receive the device triggers an interrupt, the driver does a DMA operation, and the kernel takes over from there. So instead of the code for the DMA operation, you can write a module that simply passes along a static packet that is compatible with Ethernet/TCP/IP. In the vast majority of cases, all these subsystems are agnostic to the underlying bus details, i.e. it shouldn't matter whether the Ethernet card is connected to PCI or ISA, but there can be exceptions. Thus, IMHO, you are trying to do something that should only be attempted after gaining a thorough understanding of the network subsystem, and a good enough understanding of the kernel as a whole. Until then you will be shooting in the dark: sometimes you may hit, but often you will miss.
http://man7.org/linux/man-pages/man8/ip-netns.8.html
A network namespace is logically another copy of the network stack,
with its own routes, firewall rules, and network devices.
So for_each_net is looping over these namespaces and creating a copy of all "per net" network devices in each one.
Run ip netns list to determine whether you are using network namespaces. Often they are not used, so a simple driver does not necessarily need to call dev_net_set.
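To make that concrete, here is a minimal sketch of a do-nothing driver registered in the initial namespace. This is written against a recent kernel (older 2.6 kernels use a three-argument alloc_netdev without the name_assign_type parameter), and the device name "mydummy" is made up:

#include <linux/module.h>
#include <linux/netdevice.h>
#include <linux/etherdevice.h>

static struct net_device *dummy_dev;

static netdev_tx_t dummy_xmit(struct sk_buff *skb, struct net_device *dev)
{
    dev_kfree_skb(skb);   /* swallow the packet; a loopback would netif_rx() it */
    return NETDEV_TX_OK;
}

static const struct net_device_ops dummy_ops = {
    .ndo_start_xmit = dummy_xmit,
};

static void dummy_setup(struct net_device *dev)
{
    ether_setup(dev);     /* sane ethernet-style defaults */
    dev->netdev_ops = &dummy_ops;
    dev->flags |= IFF_NOARP;
}

static int __init dummy_init(void)
{
    dummy_dev = alloc_netdev(0, "mydummy%d", NET_NAME_UNKNOWN, dummy_setup);
    if (!dummy_dev)
        return -ENOMEM;

    /* Explicitly place the device in the initial namespace.
     * register_netdev() assumes init_net anyway, which is why simple
     * drivers get away without calling dev_net_set(); loopback.c needs
     * it because one loopback device is created per namespace. */
    dev_net_set(dummy_dev, &init_net);

    if (register_netdev(dummy_dev)) {
        free_netdev(dummy_dev);
        return -ENODEV;
    }
    return 0;
}

static void __exit dummy_exit(void)
{
    unregister_netdev(dummy_dev);
    free_netdev(dummy_dev);
}

module_init(dummy_init);
module_exit(dummy_exit);
MODULE_LICENSE("GPL");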
I want to be able to simulate an incoming packet on a certain physical network interface.
Specifically, given an array of bytes and an interface name, I want to be able to make that interface think a packet containing those bytes arrived from another interface (most likely on another machine).
I've implemented the code that prepares the packet, but I'm unsure what the next step is.
I should point out that I actually need to feed the interface my bytes, and not use a workaround that might produce similar results on other machines (I've seen answers to other questions mentioning the loopback interface and external tools). This code is supposed to simulate traffic on a machine that expects to receive traffic from certain sources via specific interfaces. Anything else will be ignored by the machine.
I'm going to stick my neck out and say this is not possible without kernel modifications, and possibly driver modifications. Note that:
There are plenty of ways of generating egress packets through a particular interface, including libpcap. But you want to generate ingress packets.
There are plenty of ways of generating ingress packets that are not through a physical interface - this is what tap/tun devices are for.
If you modify the kernel to allow direct injection of packets into a device's receive queue, that may have unexpected effects, and it is still not going to be an accurate simulation of packets arriving in hardware (e.g. they will not be constrained to the same MTU). Perhaps you could build an iptables extension that fools the kernel into thinking a packet came from a different interface; I'm not sure that will do what you need, though.
If all you need is simulation (and you are happy with a complete simulation), build a tap/tun driver, and rename the tap interface to eth0 or similar.
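For the tap route, here is a minimal sketch of the userspace side (error handling trimmed; the interface name you pass is arbitrary). Every buffer written to the returned descriptor enters the kernel as a frame received on that interface:

#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <net/if.h>
#include <linux/if_tun.h>

/* Open a tap device with the given name; bytes written to the
 * returned fd appear to the kernel as frames *received* on it. */
int open_tap(const char *name)
{
    struct ifreq ifr;
    int fd = open("/dev/net/tun", O_RDWR);
    if (fd < 0)
        return -1;

    memset(&ifr, 0, sizeof(ifr));
    ifr.ifr_flags = IFF_TAP | IFF_NO_PI;  /* raw ethernet frames, no extra header */
    strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);

    if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
        close(fd);
        return -1;
    }
    return fd;  /* write(fd, frame_bytes, frame_len) injects one ingress frame */
}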
Depending on which network layer you're trying to simulate, there may be a work-around.
I have had success getting ip packets into the ingress queue with an ethernet 'hairpin'. That is, by setting the source and destination MAC address to the local interface, sending the packet results in it first appearing as an egress packet, then being 'hairpinned' and also appearing as an ingress packet.
This at least works under Linux using PcapPlusPlus (libpcap under the hood) with my wireless interface. Your mileage may vary.
This will obviously only suit your needs if you're OK with modifying the Ethernet header, i.e. only simulating a higher layer.
Here is a snippet of C++ where I spoof a TCP RST packet for a local socket:
//always use the actual device source MAC, even if we're spoofing the remote rst
// this produces a 'hairpin' from the egress to the ingress on the interface so the tcp stack actually processes the packet
// required because the tcp stack doesn't process egress packets (at least on a linux wireless interface)
pcpp::EthLayer eth(localMAC,localMAC);
pcpp::IPv4Layer ip(remoteIP, localIP);
pcpp::TcpLayer tcp(remotePort, localPort);
pcpp::Packet pac(60);
ip.getIPv4Header()->timeToLive = 255;
tcp.getTcpHeader()->rstFlag = 1;
tcp.getTcpHeader()->ackFlag = 1;
tcp.getTcpHeader()->ackNumber = pcpp::hostToNet32(src.Ack);
tcp.getTcpHeader()->sequenceNumber = pcpp::hostToNet32(src.Seq);
pac.addLayer(&eth);
pac.addLayer(&ip);
pac.addLayer(&tcp);
pac.computeCalculateFields();
dev->sendPacket(&pac);
EDIT: the same code works on Windows on an Ethernet interface. It doesn't seem to do the same 'hairpin', judging from Wireshark, but the TCP stack does process the packets.
Another solution is to create a new dummy network device driver with the same functionality as the loopback interface (i.e. it will be a dummy). After that you can wrap up creation of a simple TCP packet and specify as the source and destination addresses the addresses of the two network devices.
It sounds a little hard, but it's worth trying: you'll learn a lot about networking and the TCP/IP stack in Linux.
I can't find the answer to this question myself:
Is there any benefit/boost to sockets in general on a multi-core machine? I mean, is there perhaps some kind of shared access to the queue of packets coming into the kernel from the ethernet-card driver, or something similar?
I understand that at the API level multiple threads can work with one socket instance, but it is up to the programmer to synchronize and use the read/write/close/select calls correctly. So at that level I see a benefit only in working with dispatched packets, post-processing, and so on. Or is there no speed boost until the packet is copied during the system call and transferred to user space?
The benefit of multi-core depends on how much concurrency your algorithm can achieve. Take Ethernet receive as an example; there are three tasks involved:
1) On receiving a packet, the NIC hardware triggers an interrupt and a CPU handles it in interrupt context.
2) The network stack handles the RX packet via the software-irq (softirq) mechanism. Softirq requests can run concurrently on multiple CPUs. In its RX function, the network stack passes the network buffer to the socket and wakes up the user thread pending on that socket.
3) The user thread wakes up and continues in application code, receiving or processing the received network packets.
1) can only run on one CPU, while 2) can run on multiple CPUs, and 3) can also run on multiple CPUs for multi-process or multi-threaded applications.
Following on from #Greg Inozemtsev's comment: NICs with multiple receive queues can filter incoming traffic into different queues, and each queue can be assigned its own interrupt, dispatched to a different CPU core to signal incoming packets.
Linux supports various techniques in the kernel, such as:
RSS: Receive Side Scaling
RPS: Receive Packet Steering
RFS: Receive Flow Steering
aRFS: Accelerated Receive Flow Steering
XPS: Transmit Packet Steering
Let's say all the packets are destined for your machine's IP on port 80, and you have used socket() and listen() to create a listening socket on port 80. Traffic is coming into your NIC from a variety of source IPs, so it is being hashed into multiple receive queues (meaning the hardware interrupts are spread across multiple CPU cores thanks to RSS). You can then use the native kernel socket option PACKET_FANOUT to spread the load across multiple worker threads within your application, as in the sketch below.
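As a rough sketch of that (the fanout group id 42 and the function name are made up), each worker thread opens its own AF_PACKET socket and joins the same fanout group; the kernel then spreads incoming packets across the group members by flow hash:

#include <arpa/inet.h>
#include <linux/if_ether.h>
#include <linux/if_packet.h>
#include <sys/socket.h>
#include <unistd.h>

int open_fanout_socket(void)
{
    /* AF_PACKET sockets require CAP_NET_RAW (typically root). */
    int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0)
        return -1;

    /* Join fanout group 42 in hash mode: packets of the same flow
     * always reach the same worker's socket. */
    int fanout_arg = 42 | (PACKET_FANOUT_HASH << 16);
    if (setsockopt(fd, SOL_PACKET, PACKET_FANOUT,
                   &fanout_arg, sizeof(fanout_arg)) < 0) {
        close(fd);
        return -1;
    }
    return fd;  /* each worker thread read()s from its own fd */
}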
If you look at 3rd-party libraries such as NetMap, DPDK and VPP, just as examples, these can be used to scale up even further by taking you down to zero-copy RX/TX, with the caveat that you need to write some of the network protocol code yourself, depending on which library you use.
There are many, many things to consider here, far too much to cover in one SO question. In answer to your original question: yes. I have tried to provide a bit of extra information besides.
For reading on RSS/RPS/RFS etc:
https://blog.cloudflare.com/how-to-receive-a-million-packets/
For further reading related to native PACKET_FANOUT and PACKET_MMAP:
http://kukuruku.co/hub/nix/capturing-packets-in-linux-at-a-speed-of-millions-of-packets-per-second-without-using-third-party-libraries
http://yusufonlinux.blogspot.co.uk/2010/11/data-link-access-and-zero-copy.html?m=1
https://www.kernel.org/doc/Documentation/networking/packet_mmap.txt
Further reading related to NetMap as an example of a 3rd-party library:
https://blog.cloudflare.com/single-rx-queue-kernel-bypass-with-netmap/
Further reading on NUMA and affinity:
https://null.53bits.co.uk/index.php?page=numa-and-queue-affinity
https://blog.cloudflare.com/how-to-achieve-low-latency/
I am doing I/O programming in C on Ubuntu, and I need the base address of the port to write data.
My laptop doesn't have a parallel port, so I bought a USB-to-parallel-port adapter. I plugged in the device and it is detected as /dev/usb/lp0.
I ran lsusb to see the list of devices and I can see the ID as well. But how can I get the base address? For the usual hardware parallel devices the base address is 0x0378, but no such address is detected when using the USB-to-parallel device.
Please help.
A USB parallel port doesn't have a base address - it's not a meaningful concept for USB. I'm afraid the days of doing I/O on PC hardware via in and out instructions ended a few years ago, though lots of old tutorials still survive on the web.
You can write bytes to the parallel port as a character device, and these will appear on the printer port pins. The USB adapter will expect the other end to handshake data exactly like a printer. If you want to do general I/O prototyping, you're probably better off with a simple USB microcontroller like an Arduino.
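As a minimal sketch of that first point, assuming the device node /dev/usb/lp0 from the question:

#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    /* The adapter is a character device, not an I/O port: open it
     * and write bytes; they appear on the data pins D0..D7 with
     * printer-style handshaking on the control lines. */
    int fd = open("/dev/usb/lp0", O_WRONLY);
    if (fd < 0)
        return 1;

    unsigned char data = 0xAA;  /* test pattern for the data pins */
    write(fd, &data, 1);
    close(fd);
    return 0;
}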
If you are still interested in using this USB-to-parallel-printer device for your own bit-banging, it's important to know that the built-in firmware always allows control of D0..D7 and INIT (as outputs) and /ERR, ONL, PE (as inputs), but never of the /ACK and BUSY (input) or /STB, /AF, /SEL (output) pins.
And you need an 8-bit latch (e.g. 74HCT574) for catching data while strobing.
See https://www-user.tu-chemnitz.de/~ygu/bastelecke/PC/USB2LPT/faq#DIY especially for the possible data rates.
Accessing the adapter from the software side is a bit complicated but possible, and you may have to restructure your software and hardware to make such adapters usable. I don't know exactly how the access works on Linux, but IMHO you don't need to write a kernel-mode driver.
I am writing a simple multi-drop RS485 protocol for serial communications within a distributed system. I am using an addressable model where slave devices are given a window of 20 ms to respond. The master uC polls the connected devices for updates and they respond accordingly. I've employed checksums and taken the necessary overrun precautions to ensure that connected devices will not respond to malformed messages. This method has proved effective in approximately 99% of situations, but I lose the packet if a new device is introduced during a communication session. Plugging in a new device "hot" will have negative effects on the signal being monitored by the slave devices, if only for an extremely short time. I'm on the software side of engineering: how can I mitigate this situation without trying to recreate TCP? We use a polling model because it is fast and does the job well for our application; there is no need for RTOS functionality. I have an abundance of cycles on each CPU, so think in basic terms.
Sending packets over RS485 is not reliable communication; you will have to handle lost packets anyway. Of course, you won't have to reinvent TCP, but you will have to detect lost packets by means of timeout monitoring and sequence numbers. In simple applications this can be done at the application level, which keeps you well away from the complexity of TCP. Since your polling model already discards all packets with an invalid checksum, this should integrate with little effort.
If you want to check for collisions, which can be caused by hot plugs or misbehaving devices, there are probably some improvements to be made. Some hardware allows you to read back your own transmission: if you find a difference between the sent data and the received data, you can assume a collision and repeat the packet. This will also require some kind of sequence numbering, as in the sketch below.
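As a minimal sketch of the kind of framing meant here (the field names and sizes are made up, not taken from the question): a sequence number lets the master detect lost or duplicated replies, and the checksum rejects frames corrupted by a hot-plug glitch.

#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame layout for a polled multi-drop protocol. */
struct frame {
    uint8_t addr;        /* slave address being polled */
    uint8_t seq;         /* wraps at 255; repeated on retransmission */
    uint8_t len;         /* number of valid payload bytes */
    uint8_t payload[32];
    uint8_t checksum;    /* additive checksum over all preceding bytes */
};

static uint8_t frame_checksum(const struct frame *f)
{
    const uint8_t *p = (const uint8_t *)f;
    uint8_t sum = 0;

    /* Sum everything up to, but not including, the checksum field. */
    for (size_t i = 0; i < offsetof(struct frame, checksum); i++)
        sum += p[i];
    return sum;
}

/* Master side: send, then wait up to 20 ms for a reply carrying the
 * same seq; on timeout or checksum mismatch, resend with the same seq
 * so the slave can recognize a duplicated poll. */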
Perhaps I've missed something in your question, but can't you just write the master so that if a response isn't seen from a device within the allowed time, it re-polls that device?