C: Detecting how much data was written to a tap

I am working on a program where I'm reading from a Tap. The only issue is, I have no clue how to detect the end of one transmission to the tap and the start of another.
Does reading from the tap act the same way as reading from a SOCK_STREAM socket?

Tun/tap tries to look like a regular Ethernet controller, but the tap device itself is accessed just like any other file descriptor.
Since it pretends to be an Ethernet controller, you have to know in advance how big the transmitted Ethernet frame was; that information comes either from the software bridge the tap device is attached to or from the "length" field in the raw Ethernet frame.
That length can, of course, be at most the MTU of the tap device, which typically defaults to 1500 bytes.
So, before you do a read() on the tap device's file descriptor, you have to figure out how big the Ethernet frame actually is.
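In practice, a common pattern (a sketch; fd is assumed to be a descriptor that was already opened on /dev/net/tun with IFF_TAP | IFF_NO_PI) is to size the read buffer for the biggest frame the device can hand you and let the return value of read() report the actual frame length, since each read() on a tap returns at most one frame:
#include <stdio.h>
#include <unistd.h>

#define TAP_MTU 1500                        /* typical default MTU */
#define ETH_HDR 14                          /* Ethernet header size */

/* Sketch: read one Ethernet frame from an already-opened tap descriptor.
 * The buffer is sized for the largest possible frame; the return value
 * of read() is the length of the frame that was actually delivered. */
ssize_t read_one_frame(int fd, unsigned char *frame, size_t cap)
{
    ssize_t n = read(fd, frame, cap);       /* at most one frame per read() */
    if (n < 0)
        perror("read(tap)");
    return n;                               /* frame length, or -1 on error */
}

/* Usage: unsigned char frame[TAP_MTU + ETH_HDR];
 *        ssize_t len = read_one_frame(fd, frame, sizeof frame);          */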

Related

Linux UART imx8 how to quickly detect frame end?

I have an imx8 module running Linux on my PCB, and I would like some tips or pointers on how to modify the UART driver so that I can detect the end of a frame very quickly (in less than 2 ms) from my user-space C application. The UART frame does not have any specific ending character or frame length; the standard VTIME of 100 ms is much too long.
I am reading from a SIM card; I have no control over the data, and no control over its size or content. I just need to detect the end of the frame very quickly. The frame could be 3 bytes or 500. The SIM card reacts to data it receives: typically I send it a couple of bytes, and then it responds a couple of milliseconds later with an uninterrupted string of bytes of unknown length. I am using an iMX8MP.
I thought about using the IDLE interrupt to detect the frame end: turn it on when any byte is received and off once the idle interrupt fires. How can I propagate this signal back to user space? Or is there an existing method to do this?
Waiting for an "idle" is a poor way to do this.
Use termios to set raw mode with VTIME of 0 and VMIN of 1. This will allow the userspace app to get control as soon as a single byte arrives (a minimal setup sketch follows the links below). See:
How to read serial with interrupt serial?
How do I use termios.h to configure a serial port to pass raw bytes?
How to open a tty device in noncanonical mode on Linux using .NET Core
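For illustration, a minimal sketch of that raw-mode setup (the device path and baud rate below are placeholders, not taken from the question):
#define _DEFAULT_SOURCE                      /* for cfmakeraw() on glibc */
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

/* Sketch: open a serial port in raw, non-canonical mode with VMIN=1 and
 * VTIME=0 so read() returns as soon as a single byte is available. */
int open_uart_raw(const char *path)          /* e.g. "/dev/ttymxc1" (placeholder) */
{
    int fd = open(path, O_RDWR | O_NOCTTY);
    if (fd < 0)
        return -1;

    struct termios tio;
    if (tcgetattr(fd, &tio) < 0) {
        close(fd);
        return -1;
    }
    cfmakeraw(&tio);                         /* no line editing, no char translation */
    tio.c_cc[VMIN]  = 1;                     /* block until at least one byte arrives */
    tio.c_cc[VTIME] = 0;                     /* no inter-byte timeout */
    cfsetispeed(&tio, B115200);              /* baud rate is an assumption */
    cfsetospeed(&tio, B115200);

    if (tcsetattr(fd, TCSANOW, &tio) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}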
But, you need a "protocol" of sorts so you can know how much to read to get a complete packet. You prefix all data with a struct that has (e.g.) a type and a payload length. Then, you send "payload length" bytes. The receiver reads that fixed-length struct and then reads the payload, which is "payload length" bytes long. This struct is always sent (in both directions).
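For illustration, such a header could look like this (field names and sizes are my own for this sketch, not the struct from the linked answer; byte order and struct padding are ignored for brevity):
#include <stdint.h>
#include <unistd.h>

/* Illustrative framing header: a type plus the number of payload bytes
 * that follow.  Field names are made up for this sketch. */
struct msg_hdr {
    uint16_t type;
    uint16_t payload_len;
};

/* Send one framed message: the fixed-size header first, then the payload.
 * Short writes are not handled here for brevity. */
int send_msg(int fd, uint16_t type, const void *payload, uint16_t len)
{
    struct msg_hdr hdr = { .type = type, .payload_len = len };

    if (write(fd, &hdr, sizeof hdr) != (ssize_t)sizeof hdr)
        return -1;
    if (len && write(fd, payload, len) != (ssize_t)len)
        return -1;
    return 0;
}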
See my answer: thread function doesn't terminate until Enter is pressed for a working example.
What you have/need is similar to doing socket programming using a stream socket except that the lower level is the UART rather than an actual socket.
My example code uses sockets, but if you change the low level to open your uart in raw mode (as above), it will be very similar.
UPDATE:
How quickly after the frame has finished would I have the data at the application level? When I currently try to read my random-length frames in 512-byte chunks, it sometimes reads the whole frame in one go; other times it reads the frame broken up into chunks. –
Engo
In my link, in the last code block, there is an xrecv function. It shows how to read partial data that comes in chunks.
That is what you'll need to do.
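The pattern boils down to looping on read() until the expected number of bytes has arrived; a sketch of the idea (not the actual xrecv from the linked answer):
#include <unistd.h>

/* Read exactly 'want' bytes from fd, looping over partial reads.
 * Returns 0 on success, -1 on error or end of stream. */
int read_full(int fd, void *buf, size_t want)
{
    unsigned char *p = buf;

    while (want > 0) {
        ssize_t n = read(fd, p, want);       /* may return fewer bytes than asked for */
        if (n <= 0)
            return -1;
        p    += n;
        want -= (size_t)n;
    }
    return 0;
}
Read the fixed-size header with it first, then call it again for exactly "payload length" more bytes.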
Things missing from your post:
You didn't post which imx8 board/configuration you have, or which SIM card you have (the protocols are card-specific).
And, you didn't post your other code [or any code] that drives the device and illustrates the problem.
How much time must pass without receiving a byte before the [uart] device is "idle"? That is, (e.g.) the device sends 100 bytes and is then finished. How many byte times does one wait before considering the device to be "idle"?
What speed is the UART running at?
A thorough description of the device, its capabilities, and how you intend to use it.
A UART device doesn't have an "idle" interrupt. From some imx8 docs, the DMA controller may have an "idle" interrupt, and the UART can be driven by the DMA controller.
But, I looked at some of the Linux kernel imx8 device drivers, and, AFAICT, the idle interrupt isn't supported.
I need to read everything in one go and get this data within a few hundred microseconds.
Based on the scheduling granularity, it may not be possible to guarantee that a process runs in a given amount of time.
It is possible to help this a bit. You can change the process to use the R/T scheduler (e.g. SCHED_FIFO). Also, you can use sched_setaffinity to lock the process to a given CPU core. There is a corresponding call to lock IRQ interrupts to a given CPU core.
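For example (a sketch; the priority value and the core number are arbitrary, and both calls normally require root or CAP_SYS_NICE):
#define _GNU_SOURCE                          /* for CPU_* macros and sched_setaffinity */
#include <sched.h>
#include <stdio.h>

/* Sketch: give the calling process SCHED_FIFO (real-time) priority and
 * pin it to a single CPU core. */
int go_realtime(int core)
{
    struct sched_param sp = { .sched_priority = 50 };    /* arbitrary RT priority */

    if (sched_setscheduler(0, SCHED_FIFO, &sp) < 0) {    /* 0 = this process */
        perror("sched_setscheduler");
        return -1;
    }

    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    if (sched_setaffinity(0, sizeof set, &set) < 0) {    /* pin to 'core' */
        perror("sched_setaffinity");
        return -1;
    }
    return 0;
}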
I assume that the SIM card acts like a [passive] device (like a disk). That is, you send it a command, and it sends back a response or does a transfer.
Based on what command you give it, you should know how many bytes it will send back. Or, it should tell you how many optional bytes it will send (similar to the struct in my link).
The method you've described (wait for idle, then "race" to get and process data whose length you don't know) is fraught with problems.
Even if you could get it to work, it will be unreliable. At some point, system activity will be just high enough to delay wakeup of your process and you'll miss the window.
If you're reading data, why must you process the data within a fixed period of time (e.g. 100 us)? What happens if you don't? Does the device catch fire?
Without more specific information, there are probably other ways to do this.
I've programmed such systems before that relied on data races. They were unreliable. Either missing data. Or, for some motor control applications, device lockup. The remedy was to redesign things so that there was some positive/definitive way to communicate that was tolerant of delays.
Otherwise, I think you've "fallen in love" with the "idle interrupt" idea, making this an XY problem: https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem

Can a linux socket return data less than the underlying packet? [duplicate]

When will a TCP packet be fragmented at the application layer? When a TCP packet is sent from an application, will the recipient at the application layer ever receive the packet in two or more packets? If so, what conditions cause the packet to be divided. It seems like a packet won't be fragmented until it reaches the Ethernet (at the network layer) limit of 1500 bytes. But, that fragmentation will be transparent to the recipient at the application layer since the network layer will reassemble the fragments before sending the packet up to the next layer, right?
It will be split when it hits a network device with a lower MTU than the packet's size. Most Ethernet devices use 1500, but it can often be smaller: 1492 if that Ethernet link runs over PPPoE (DSL) because of the extra encapsulation, and even lower if another layer is added, like Windows Internet Connection Sharing. And dialup is normally 576!
In general though you should remember that TCP is not a packet protocol. It uses packets at the lowest level to transmit over IP, but as far as the interface for any TCP stack is concerned, it is a stream protocol and has no requirement to provide you with a 1:1 relationship to the physical packets sent or received (for example most stacks will hold messages until a certain period of time has expired, or there are enough messages to maximize the size of the IP packet for the given MTU)
As an example, if you sent two "packets" (called your send function twice), the receiving program might only receive one "packet" (the receiving TCP stack might combine them). If you are implementing a message-type protocol over TCP, you should include a header at the beginning of each message (or some other header/footer mechanism) so that the receiving side can split the TCP stream back into individual messages, whether a message arrives in two parts or several messages arrive as one chunk.
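For illustration, the receive side of such a scheme might look like this (a sketch that assumes a 4-byte big-endian length prefix in front of every message; that prefix is an application convention, not something TCP provides):
#include <arpa/inet.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Loop until exactly 'want' bytes have arrived; the stream may be
 * delivered in arbitrary pieces. */
static int recv_exact(int fd, void *buf, size_t want)
{
    unsigned char *p = buf;
    while (want > 0) {
        ssize_t n = recv(fd, p, want, 0);
        if (n <= 0)
            return -1;
        p += n;
        want -= (size_t)n;
    }
    return 0;
}

/* Receive one message: read the 4-byte length prefix, then that many
 * payload bytes.  Returns the payload length, or -1 on error. */
ssize_t recv_message(int fd, void *payload, size_t cap)
{
    uint32_t len_be;

    if (recv_exact(fd, &len_be, sizeof len_be) < 0)
        return -1;
    uint32_t len = ntohl(len_be);
    if (len > cap)
        return -1;                           /* message too big for caller's buffer */
    if (recv_exact(fd, payload, len) < 0)
        return -1;
    return (ssize_t)len;
}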
Fragmentation should be transparent to a TCP application. Keep in mind that TCP is a stream protocol: you get a stream of data, not packets! If you are building your application based on the idea of complete data packets then you will have problems unless you add an abstraction layer to assemble whole packets from the stream and then pass the packets up to the application.
The question makes an assumption that is not true -- TCP does not deliver packets to its endpoints, rather, it sends a stream of bytes (octets). If an application writes two strings into TCP, it may be delivered as one string on the other end; likewise, one string may be delivered as two (or more) strings on the other end.
RFC 793, Section 1.5:
"The TCP is able to transfer a
continuous stream of octets in each
direction between its users by
packaging some number of octets into
segments for transmission through the
internet system."
The key words being continuous stream of octets (bytes).
RFC 793, Section 2.8:
"There is no necessary relationship
between push functions and segment
boundaries. The data in any particular
segment may be the result of a single
SEND call, in whole or part, or of
multiple SEND calls."
The entirety of section 2.8 is relevant.
At the application layer there are any number of reasons why the whole 1500 bytes may not show up in one read. Various factors in the operating system and TCP stack may cause the application to get some bytes in one read call and some in the next. Yes, the TCP stack has to reassemble the packet before sending it up, but that doesn't mean your app is going to get it all in one shot (it will LIKELY get it in one read, but it's not GUARANTEED).
TCP tries to guarantee in-order delivery of bytes, with error checking, automatic re-sends, etc happening behind your back. Think of it as a pipe at the app layer and don't get too bogged down in how the stack actually sends it over the network.
This page is a good source of information about some of the issues that others have brought up, namely the need for data encapsulation on a per-application-protocol basis. It is not quite authoritative in the sense you describe, but it has examples and is sourced to some pretty big names in network programming.
If a packet exceeds the maximum MTU of a network device it will be broken up into multiple packets. (Note most equipment is set to 1500 bytes, but this is not a necessity.)
The reconstruction of the packet should be entirely transparent to the applications.
Different network segments can have different MTU values. In that case fragmentation can occur. For more information see TCP Maximum segment size
This (de)fragmentation happens in the TCP layer. In the application layer there are no more packets. TCP presents a contiguous data stream to the application.
A the "application layer" a TCP packet (well, segment really; TCP at its own layer doesn't know from packets) is never fragmented, since it doesn't exist. The application layer is where you see the data as a stream of bytes, delivered reliably and in order.
If you're thinking about it otherwise, you're probably approaching something in the wrong way. However, this is not to say that there might not be a layer above this, say, a sequence of messages delivered over this reliable, in-order bytestream.
Correct. The most informative way to see this is with Wireshark, an invaluable tool. Take the time to figure it out; it has saved me several times and gives a good reality check.
If a 3000-byte packet enters an Ethernet network with a default MTU size of 1500 (for Ethernet), it will be fragmented into two packets of 1500 bytes each. That is the only time I can think of.
Wireshark is your best bet for checking this. I have been using it for a while and am totally impressed.

Where does the Source MAC comes from before Encapsulating Frames in Layer2?

A packet arrives from the application -> ... network layer (IP encapsulation is added here as per the IP configuration) -> then goes down to the data link layer. Here the framing is done, and the source MAC and destination MAC are added for LAN switching. Is the source MAC extracted from the host NIC every time and encapsulated into the packet before it is sent out the interface, or is there some configuration file it is read from?
I am assuming the /etc/network/interfaces file is empty and does not contain any hw-address entry to change the MAC (as with the ifconfig eth0 hw ether AA:BB:CC:... command). Where does it get its own MAC?
Does it do a lookup every time (say, ifconfig eth0 | grep HWaddr) and fetch its own MAC via a system call? Because that would add a huge overhead, querying the NIC chipset every time. Or does it maintain a file, read from it, and simply encapsulate the packet coming from the upper layer and send it out on the wire?
None of the above. The MAC adds its own address to Ethernet frames on the way out; the software doesn't have to add it.
On occasion, though, it is useful for the driver to know the physical address of the chip it's driving; this doesn't require querying the NIC every time or "maintaining a file"; 6 bytes of RAM in the driver's data structure does the job just fine. This is where the value displayed by ifconfig comes from.
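For the user-space side, this is roughly how a tool like ifconfig asks the kernel for that stored address; a sketch (the interface name is a placeholder):
#include <net/if.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Sketch: fetch the MAC address the kernel already holds for an interface.
 * One ioctl on a throwaway socket; nothing queries the NIC hardware per packet. */
int print_mac(const char *ifname)            /* e.g. "eth0" */
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    struct ifreq ifr;
    memset(&ifr, 0, sizeof ifr);
    strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);

    if (ioctl(fd, SIOCGIFHWADDR, &ifr) < 0) {
        close(fd);
        return -1;
    }
    close(fd);

    unsigned char *m = (unsigned char *)ifr.ifr_hwaddr.sa_data;
    printf("%02x:%02x:%02x:%02x:%02x:%02x\n", m[0], m[1], m[2], m[3], m[4], m[5]);
    return 0;
}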

copy_to_user works, memcpy fails

I'm programming for a device on sh4-arch Linux 2.6.32 that plays a video stream from a UDP socket.
I've got a ring buffer in user space (called ubuffer) which the decoder (kernel space) reads. ubuffer was mapped from a kernel-space buffer (called kbuffer).
The existing application (with source code) reads data from the UDP socket into ubuffer in user space. To decrease the delay, I've written a driver that reads payload data from the UDP sk_buff directly into the ring buffer. Right now I'm using an ioctl from the existing app, which passes ubuffer to ioctl_proc for every UDP packet. ioctl_proc does copy_to_user(ubuffer+off, skb->data+skb_off, len) and everything works fine. I need to make this work without any user-space app and without the ioctl. So I've passed kbuffer to my driver and do memcpy(kbuffer+off, skb->data+skb_off, len) instead of copy_to_user (that's the only difference), with the same parameters, and it does not work: I see a picture with about 10% valid pixels (some text and colors), but the rest is broken. If kbuffer were wrong, I would not see anything correct at all, but I do see something.
The value of len is about 100-1300 bytes.
What am I doing wrong? Maybe I need to use some page-aligned calls?

Broadcasting UDP packets using multiple NICs

I'm building an embedded system for a camera controller in Linux (not real-time). I'm having a problem getting the networking to do what I want it to do. The system has 3 NICs, 1 100base-T and 2 gigabit ports. I hook the slower one up to the camera (that's all it supports) and the faster ones are point-to-point connections to other machines. What I am attempting to do is get an image from the camera, do a little processing, then broadcast it using UDP to each of the other NICs.
Here is my network configuration:
eth0: addr: 192.168.1.200 Bcast 192.168.1.255 Mask: 255.255.255.0 (this is the 100base-t)
eth1: addr: 192.168.2.100 Bcast 192.168.2.255 Mask: 255.255.255.0
eth2: addr: 192.168.3.100 Bcast 192.168.3.255 Mask: 255.255.255.0
The image is coming in off eth0 in a proprietary protocol, so it's a raw socket. I can broadcast it to eth1 or eth2 just fine. But when I try to broadcast it to both, one after the other, I get lots of network hiccups and errors on eth0.
I initialize the UDP sockets like this:
int sock2 = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);    // or sock3
int broadcast = 1;
struct sockaddr_in sa;
memset(&sa, 0, sizeof(sa));
sa.sin_family = AF_INET;
sa.sin_port = htons(8000);
inet_aton("192.168.2.255", &sa.sin_addr);                // or 192.168.3.255
setsockopt(sock2, SOL_SOCKET, SO_BROADCAST, &broadcast, sizeof(broadcast));
bind(sock2, (struct sockaddr*)&sa, sizeof(sa));
sendto(sock2, &data, sizeof(data), 0, (struct sockaddr*)&sa, sizeof(sa));   // sizeof(data) < 1100 bytes
I do this for each socket separately, and call sendto separately. When I do one or the other, it's fine. When I try to send on both, eth0 starts getting bad packets.
Any ideas on why this is happening? Is it a configuration error, is there a better way to do this?
EDIT:
Thanks for all the help, I've been trying some things and looking into this more. The issue does not appear to be broadcasting, strictly speaking. I replaced the broadcast code with a unicast command and it has the same behavior. I think I understand the behavior better, but not how to fix it.
Here is what is happening. On eth0 I am supposed to get an image every 50ms. When I send out an image on eth1 (or 2) it takes about 1.5ms to send the image. When I try to send on both eth1 and eth2 at the same time it takes about 45ms, occasionally jumping to 90ms. When this goes beyond the 50ms window, eth0's buffer starts to build. I lose packets when the buffer gets full, of course.
So my revised question. Why would it go from 1.5ms to 45ms just by going from one ethernet port to two?
Here is my initialization code:
sock[i]=socket(AF_INET,SOCK_DGRAM,IPPROTO_UDP);
sa[i].sin_family=AF_INET;
sa[i].sin_port=htons(8000);
inet_aton(ip,&sa[i].sin_addr);
//If Broadcasting
char buffer[]="eth1";  // or eth2
setsockopt(sock[i],SOL_SOCKET,SO_BINDTODEVICE,buffer,5);
int b=1;
setsockopt(sock[i],SOL_SOCKET,SO_BROADCAST,&b,sizeof(b));
Here is my sending code:
for(i=0;i<65;i++) {
    sendto(sock[0], &data[i], sizeof(data[i]), 0, (struct sockaddr*)&sa[0], sizeof(sa[0]));   // one chunk per call
    sendto(sock[1], &data[i], sizeof(data[i]), 0, (struct sockaddr*)&sa[1], sizeof(sa[1]));
}
It's pretty basic.
Any ideas? Thanks for all your great help!
Paul
Maybe your UDP stack runs out of memory?
(1) Check /proc/sys/net/ipv4/udp_mem (see man 7 udp for details). Make sure that the first number is at least 8 times the image size. This sets the memory for all UDP sockets in the system.
(2) Make sure your per-socket buffer for the sending socket is big enough. Use setsockopt() with SO_SNDBUF to set the send buffer on both sockets to at least twice the image size (a sketch follows below). You might need to increase the maximum allowed value in /proc/sys/net/core/wmem_max. See man 7 socket for details.
(3) You might as well increase the RX buffer for the receiving socket. Write a big number to /proc/sys/net/core/rmem_max, then use SO_RCVBUF to increase the receive buffer size.
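A sketch of points (2) and (3) with the full setsockopt() signature (the sizes below are arbitrary examples):
#include <stdio.h>
#include <sys/socket.h>

/* Sketch: enlarge the per-socket send and receive buffers.  The kernel
 * caps these requests at net.core.wmem_max / net.core.rmem_max, so those
 * sysctls may need to be raised first. */
int grow_socket_buffers(int send_fd, int recv_fd, int image_size)
{
    int snd = image_size * 2;                /* room for two images in flight */
    int rcv = image_size * 4;                /* arbitrary; larger is safer here */

    if (setsockopt(send_fd, SOL_SOCKET, SO_SNDBUF, &snd, sizeof snd) < 0) {
        perror("SO_SNDBUF");
        return -1;
    }
    if (setsockopt(recv_fd, SOL_SOCKET, SO_RCVBUF, &rcv, sizeof rcv) < 0) {
        perror("SO_RCVBUF");
        return -1;
    }
    return 0;
}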
A workaround until this issue is actually solved may be to create a bridge for eth1+eth2 and send the packet to that bridge.
That way it's only mapped to kernel memory once, and not twice per image.
It's been a long time, but I found the answer to my question, so I thought I would put it here in case anyone else ever finds it.
The two Gigabit Ethernet ports were actually on a PCI bridge off the PCI-express bus. The PCI-express bus was internal to the motherboard, but it was a PCI bus going to the cards. The bridge and the bus did not have enough bandwidth to actually send out the images that fast. With only one NIC enabled the data was sent to the buffer and it looked very quick to me, but it took much longer to actually get through the bus, out the card, and on to the wire. The second NIC was slower because the buffer was full. Although changing the buffer size masked the problem, it did not actually send the data out any faster and I was still getting dropped packets on the third NIC.
In the end, the 100Base-T card was actually built onto the motherboard and therefore had a faster bus to it, resulting in higher overall bandwidth than the gigabit ports. By switching the camera to a gigabit port and one of the gigabit links to the 100Base-T port, I was able to meet the requirements.
Strange.
