STM32 Serial DMA - Finding the beginning of the Stream - c

I have a known serial stream format that I am capturing via the DMA. It has header and footer bytes. But sometimes the MCU starts capturing in the middle of the stream and then the sync is out because the DMA is looking for a set number of bytes. I have read of people using circular buffers, but I have struggled to grasp this concept.
Instead, I was thinking of disabling the DMA and enabling a serial interrupt at startup of the MCU, then cycling through each byte captured by the interrupt to find the start byte. Once I have found the start byte, I would disable the serial interrupt and enable the DMA to take over capturing the stream.
Does this sound feasible? Thanks for any input.
I am using STM32 HAL libs with the new STM32 IDE that includes STM32 CubeMX.

If I understand your reference to circular buffers correctly, the concept is simple. You have a large buffer with a write pointer and a read pointer. The write function writes data into the buffer from the write pointer onward, taking care that once it reaches the end of the buffer, it wraps around and dumps the data at the beginning of the buffer and onward. Then you need a reader function that reads the data from the read pointer onward, and again, taking care of wrap around at the end of the buffer.
Both the read and write pointers start at the beginning of the buffer. The two conditions that you have to check are:
1) When the read pointer is at the same location as the write pointer, there is nothing (more) to read.
2) When the write pointer increments and runs into the read pointer location, you have a buffer overflow condition. This should never happen, so you must either use a larger buffer, have the reader task run more frequently, or start throwing things out.
So in your scenario, the DMA just dumps data, and your reader task looks for the header bytes and processes the data until it finds the footer bytes.
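The read/write pointer scheme described above can be sketched in a few lines of C. This is only a minimal illustration: `ring_t`, `RING_SIZE` and the function names are made up for the example, and one slot is deliberately left unused so that "empty" and "full" can be told apart. With a circular DMA, the hardware would advance the write position instead of `ring_put`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 16u   /* assumed capacity; one slot always stays free */

typedef struct {
    uint8_t data[RING_SIZE];
    size_t  wr;   /* next slot to write */
    size_t  rd;   /* next slot to read  */
} ring_t;

void ring_init(ring_t *r) { r->wr = 0; r->rd = 0; }

/* Condition 1: read pointer == write pointer, nothing (more) to read. */
bool ring_empty(const ring_t *r) { return r->rd == r->wr; }

/* Condition 2: advancing the write pointer would run into the read pointer. */
bool ring_full(const ring_t *r) { return (r->wr + 1u) % RING_SIZE == r->rd; }

bool ring_put(ring_t *r, uint8_t b)
{
    if (ring_full(r))
        return false;                  /* overflow: caller decides what to drop */
    r->data[r->wr] = b;
    r->wr = (r->wr + 1u) % RING_SIZE;  /* wrap around at the end of the buffer */
    return true;
}

bool ring_get(ring_t *r, uint8_t *out)
{
    if (ring_empty(r))
        return false;
    *out = r->data[r->rd];
    r->rd = (r->rd + 1u) % RING_SIZE;  /* same wrap-around on the read side */
    return true;
}
```

The reader task would call `ring_get` in a loop, skip bytes until it sees the header, and collect bytes until the footer.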

As the protocol has idle gaps between packets, you can use the idle interrupt feature of the UART to synchronize the receiver.
Enable the UART interrupt, start receiving with DMA, and set UARTx->CR1 |= USART_CR1_IDLEIE. Whenever the idle interrupt is triggered, look at the DMA channel. If it is still running, the receive was started in the middle of a packet, so stop the transfer and discard the input buffer. Then start receiving the next packet.
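The decision made in the idle handler can be modelled without any hardware. The sketch below is a host-side simulation under assumptions: `dma_rx_state`, `PACKET_LEN` and the function names are hypothetical, and `remaining` stands in for the DMA channel's "bytes left" counter. On a real STM32 you would read that counter from the DMA channel and restart the transfer through the HAL or registers instead.

```c
#include <stdbool.h>
#include <stddef.h>

#define PACKET_LEN 8u   /* assumed fixed packet size */

/* Hypothetical stand-in for the DMA channel state. */
typedef struct {
    size_t remaining;   /* bytes the DMA still expects; 0 = transfer complete */
} dma_rx_state;

/* Arm the (simulated) DMA for one full packet. */
void dma_rx_start(dma_rx_state *d) { d->remaining = PACKET_LEN; }

/* Simulate the DMA storing one received byte. */
void dma_rx_byte(dma_rx_state *d)
{
    if (d->remaining > 0)
        d->remaining--;
}

/* Called from the UART idle interrupt.
 * Returns true if a complete packet is in the buffer, false if the
 * transfer was still running (we joined mid-packet) and was discarded. */
bool on_uart_idle(dma_rx_state *d)
{
    bool complete = (d->remaining == 0);
    dma_rx_start(d);    /* in both cases, re-arm for the next packet */
    return complete;
}
```

After the first discarded partial packet, every later transfer starts on a packet boundary, so the receiver stays in sync as long as the idle gaps exist.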

Related

Linux UART imx8 how to quickly detect frame end?

I have an imx8 module running Linux on my PCB and I would like some tips or pointers on how to modify the UART driver so that I can detect the end of a frame very quickly (less than 2 ms) from my user space C application. The UART frame does not have any specific ending character or frame length. The standard VTIME of 100 ms is much too long.
I am reading from a SIM card. I have no control over the data, its size, or its content. I just need to detect the end of frame very quickly. The frame could be 3 bytes or 500. The SIM card reacts to data that it receives: typically I send it a couple of bytes and it responds a couple of ms later with an uninterrupted string of bytes of unknown length. I am using an iMX8MP.
I thought about using the IDLE interrupt to detect the frame end. Turn it on when any byte is received and off once the idle interrupt fires. How can I propagate this signal back to user space? Or is there an existing method to do this?
Waiting for an "idle" is a poor way to do this.
Use termios to set raw mode with VTIME of 0 and VMIN of 1. This will allow the userspace app to get control as soon as a single byte arrives. See:
How to read serial with interrupt serial?
How do I use termios.h to configure a serial port to pass raw bytes?
How to open a tty device in noncanonical mode on Linux using .NET Core
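As a rough sketch of the termios settings mentioned above: the helper below (the name `set_raw_vmin1` is made up here) edits a `struct termios` that you have already filled with `tcgetattr()`, and you would apply it afterwards with `tcsetattr(fd, TCSANOW, &t)`. The flag list mirrors the usual "raw mode" recipe. Baud rate and hardware flow control are left out and would need to be set separately.

```c
#include <termios.h>

/* Turn an already-fetched termios configuration into raw,
 * byte-at-a-time mode: read() returns as soon as one byte arrives. */
void set_raw_vmin1(struct termios *t)
{
    /* No input translation, no software flow control. */
    t->c_iflag &= ~(tcflag_t)(IGNBRK | BRKINT | PARMRK | ISTRIP |
                              INLCR | IGNCR | ICRNL | IXON);
    /* No output post-processing. */
    t->c_oflag &= ~(tcflag_t)OPOST;
    /* Non-canonical mode: no line buffering, echo, or signal characters. */
    t->c_lflag &= ~(tcflag_t)(ECHO | ECHONL | ICANON | ISIG | IEXTEN);
    /* 8 data bits, no parity. */
    t->c_cflag &= ~(tcflag_t)(CSIZE | PARENB);
    t->c_cflag |= CS8;
    /* Block until at least 1 byte is available, with no inter-byte timer. */
    t->c_cc[VMIN]  = 1;
    t->c_cc[VTIME] = 0;
}
```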
But, you need a "protocol" of sorts, so you can know how much to read to get a complete packet. You prefix all data with a struct that has (e.g.) A type and a payload length. Then, you send "payload length" bytes. The receiver gets/reads that fixed length struct and then reads the payload which is "payload length" bytes long. This struct is always sent (in both directions).
See my answer: thread function doesn't terminate until Enter is pressed for a working example.
What you have/need is similar to doing socket programming using a stream socket except that the lower level is the UART rather than an actual socket.
My example code uses sockets, but if you change the low level to open your uart in raw mode (as above), it will be very similar.
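To make the "fixed-length struct, then payload" idea concrete, here is a small sketch of such a prefix. It is not the code from the linked answer: the `msg_hdr` layout, the 16-bit field widths, and the big-endian byte order on the wire are all assumptions chosen for illustration.

```c
#include <stdint.h>

#define HDR_SIZE 4u   /* bytes on the wire: 2 for type, 2 for payload length */

/* Hypothetical fixed-size prefix sent before every payload. */
typedef struct {
    uint16_t type;
    uint16_t payload_len;
} msg_hdr;

/* Serialize explicitly, byte by byte, so that struct padding and host
 * endianness never leak onto the wire. */
void hdr_pack(const msg_hdr *h, uint8_t out[HDR_SIZE])
{
    out[0] = (uint8_t)(h->type >> 8);
    out[1] = (uint8_t)(h->type & 0xFFu);
    out[2] = (uint8_t)(h->payload_len >> 8);
    out[3] = (uint8_t)(h->payload_len & 0xFFu);
}

void hdr_unpack(const uint8_t in[HDR_SIZE], msg_hdr *h)
{
    h->type        = (uint16_t)(((uint16_t)in[0] << 8) | in[1]);
    h->payload_len = (uint16_t)(((uint16_t)in[2] << 8) | in[3]);
}
```

The receiver reads exactly `HDR_SIZE` bytes, unpacks them, and then reads exactly `payload_len` more bytes.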
UPDATE:
"How quickly after the frame finished would I have the data at the application level? When I try to read my random length frames, currently reading in 512 byte chunks, it will sometimes read the whole frame in one go; other times it reads the frame broken up into chunks." (comment from Engo)
In my link, in the last code block, there is an xrecv function. It shows how to read partial data that comes in chunks.
That is what you'll need to do.
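For readers without the link at hand, the general pattern for collecting data that arrives in pieces looks roughly like this. It is a generic POSIX sketch, not the `xrecv` function from the linked answer; `read_full` is a name chosen for this example.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Keep calling read() until `want` bytes have arrived, EOF is hit,
 * or an error occurs. Returns the byte count actually read, or -1. */
ssize_t read_full(int fd, void *buf, size_t want)
{
    unsigned char *p = buf;
    size_t got = 0;

    while (got < want) {
        ssize_t n = read(fd, p + got, want - got);
        if (n == 0)
            break;              /* EOF: the other side closed */
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, retry */
            return -1;
        }
        got += (size_t)n;       /* partial chunk: keep going */
    }
    return (ssize_t)got;
}
```

This only works when you know how many bytes to expect, which is why the length prefix matters.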
Things missing from your post:
You didn't post which imx8 board/configuration you have. And, which SIM card you have (the protocols are card specific).
And, you didn't post your other code [or any code] that drives the device and illustrates the problem.
How much time must pass without receiving a byte before the [uart] device is "idle"? That is, (e.g.) the device sends 100 bytes and is then finished. How many byte times does one wait before considering the device to be "idle"?
What speed is the UART running at?
A thorough description of the device, its capabilities, and how you intend to use it.
A uart device doesn't have an "idle" interrupt. From some imx8 docs, the DMA device may have an "idle" interrupt and the uart can be driven by the DMA controller.
But, I looked at some of the linux kernel imx8 device drivers, and, AFAICT, the idle interrupt isn't supported.
I need to read everything in one go and get this data within a few hundred microseconds.
Based on the scheduling granularity, it may not be possible to guarantee that a process runs in a given amount of time.
It is possible to help this a bit. You can change the process to use the R/T scheduler (e.g. SCHED_FIFO). Also, you can use sched_setaffinity to lock the process to a given CPU core. There is a corresponding call to lock IRQ interrupts to a given CPU core.
I assume that the SIM card acts like a [passive] device (like a disk). That is, you send it a command, and it sends back a response or does a transfer.
Based on what command you give it, you should know how many bytes it will send back. Or, it should tell you how many optional bytes it will send (similar to the struct in my link).
The method you've described (e.g.) wait for idle, then "race" to get/process the data [for which you don't know the length] is fraught with problems.
Even if you could get it to work, it will be unreliable. At some point, system activity will be just high enough to delay wakeup of your process and you'll miss the window.
If you're reading data, why must you process the data within a fixed period of time (e.g. 100 us)? What happens if you don't? Does the device catch fire?
Without more specific information, there are probably other ways to do this.
I've programmed such systems before that relied on data races. They were unreliable. Either missing data. Or, for some motor control applications, device lockup. The remedy was to redesign things so that there was some positive/definitive way to communicate that was tolerant of delays.
Otherwise, I think you've "fallen in love" with "idle interrupt" idea, making this an XY problem: https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem

Implement UART frame controller

I'm programming on an STM32 board and I'm confused about how to use my peripherals: polling, interrupt, DMA, DMA interrupt...
So far, I have coded a UART module which sends basic data, and it works in polling, interrupt and DMA mode.
But I'd like to be able to send and receive specific frames with variable lengths, for example:
[ START | LGTH | CMD_ID | DATA(LGTH) | CRC ]
I also have sensors and I'd like to interact received DATA in these UART frames with sensors.
So, what I don't understand is:
how to program the UART module to work in "frame" mode? (buffer? circular DMA? interrupt? where, when..)
when I'm able to send or receive frame with my UART, what is the best way to interact with sensors? (inside a timer interrupt? in a state machine ? with extern variable? ...)
Here is my Libraries tree
In future, the idea is to carry this application in freertos
Thank you!
Absolutely in DMA when it is available.
You have one big buffer (a cyclic buffer is a good solution) and you just write data into it from one side. If the DMA is not already running, you start the DMA with your buffer.
If the DMA is already running, you just write your data to the buffer and wait for the DMA transfer complete interrupt.
Later, in this interrupt, you advance the read pointer of the buffer (as you have already sent some data) and check whether any more data is available to send over DMA. If so, set the DMA memory address and the number of bytes to send from the buffer.
When the DMA TC IRQ happens again, repeat the process.
The UART has no notion of a FRAME; it only moves plain bytes. It means you have to "invent" your own frame protocol and use it in your app.
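A host-side sketch of that TX flow, under assumptions: `tx_ring`, `TX_SIZE` and all function names are invented here, and `dma_start_stub` merely records what a real driver would program into the DMA (memory address and byte count). Because the DMA needs one contiguous block, a wrapped buffer is sent in two transfers.

```c
#include <stddef.h>
#include <stdint.h>

#define TX_SIZE 8u   /* assumed buffer size; one slot stays free */

typedef struct {
    uint8_t buf[TX_SIZE];
    size_t  wr;          /* application writes here */
    size_t  rd;          /* everything before this has been sent */
    size_t  in_flight;   /* bytes in the running DMA transfer, 0 = idle */
} tx_ring;

/* Stub: records the request instead of touching hardware. */
const uint8_t *dma_last_addr;
size_t         dma_last_len;
void dma_start_stub(const uint8_t *addr, size_t len)
{
    dma_last_addr = addr;
    dma_last_len  = len;
}

/* Start a transfer if the DMA is idle and data is pending. */
void tx_kick(tx_ring *t)
{
    if (t->in_flight != 0 || t->rd == t->wr)
        return;
    /* Contiguous run: up to wr, or up to the end of the buffer if wrapped. */
    size_t len = (t->wr > t->rd) ? (t->wr - t->rd) : (TX_SIZE - t->rd);
    t->in_flight = len;
    dma_start_stub(&t->buf[t->rd], len);
}

/* Application side: queue bytes, then make sure the DMA is running. */
size_t tx_write(tx_ring *t, const uint8_t *data, size_t len)
{
    size_t n = 0;
    while (n < len && (t->wr + 1u) % TX_SIZE != t->rd) {
        t->buf[t->wr] = data[n++];
        t->wr = (t->wr + 1u) % TX_SIZE;
    }
    tx_kick(t);
    return n;   /* may be less than len if the buffer filled up */
}

/* Transfer-complete interrupt: advance rd, send whatever arrived meanwhile. */
void tx_dma_complete_irq(tx_ring *t)
{
    t->rd = (t->rd + t->in_flight) % TX_SIZE;
    t->in_flight = 0;
    tx_kick(t);
}
```

On real hardware `tx_write` and the interrupt handler race with each other, so the shared fields would need a short critical section around them.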
Later, when you want to send that FRAME over UART, you have to:
Write start byte to buffer
Write other header bytes
Write actual data
Write stop bytes/CRC/whatever
Check whether the DMA is running; if it is not, start it.
Normally, I use this frame concept:
[START, ADDRESS, CMD, LEN, DATA, CRC, STOP]
START: Start byte indicating start of frame
ADDRESS: Address of device when multiple devices are in use on bus
CMD: Command ID
LEN: 2 bytes for data length
DATA: Actual data in bytes of variable length
CRC: 2 bytes for CRC including: address, cmd, len, data
STOP: Stop byte indicating end of frame
This is how I do it in every project where necessary. This does not use CPU to send data, just sets DMA and starts transmission.
From app perspective, you just have to create send_send(data, len) function which will create frame and put it to buffer for transmission.
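A possible frame builder for the layout above might look like the following. Several things are assumptions rather than part of the answer: the start/stop byte values, the big-endian order of LEN and CRC, and the choice of CRC-16/CCITT-FALSE (polynomial 0x1021, initial value 0xFFFF), since the answer does not say which CRC it uses.

```c
#include <stddef.h>
#include <stdint.h>

#define FRAME_START    0x02u   /* assumed start byte */
#define FRAME_STOP     0x03u   /* assumed stop byte  */
#define FRAME_OVERHEAD 8u      /* START + ADDRESS + CMD + LEN(2) + CRC(2) + STOP */

/* CRC-16/CCITT-FALSE, bitwise version (picked only as an example CRC). */
uint16_t crc16_ccitt(const uint8_t *p, size_t n)
{
    uint16_t crc = 0xFFFFu;
    while (n--) {
        crc ^= (uint16_t)((uint16_t)*p++ << 8);
        for (int i = 0; i < 8; i++)
            crc = (crc & 0x8000u) ? (uint16_t)((crc << 1) ^ 0x1021u)
                                  : (uint16_t)(crc << 1);
    }
    return crc;
}

/* Build [START, ADDRESS, CMD, LEN, DATA, CRC, STOP] into `out`.
 * Returns the total frame length, or 0 if `out` is too small. */
size_t frame_build(uint8_t *out, size_t cap, uint8_t addr, uint8_t cmd,
                   const uint8_t *data, uint16_t len)
{
    size_t total = (size_t)len + FRAME_OVERHEAD;
    if (cap < total)
        return 0;

    size_t i = 0;
    out[i++] = FRAME_START;
    out[i++] = addr;
    out[i++] = cmd;
    out[i++] = (uint8_t)(len >> 8);
    out[i++] = (uint8_t)(len & 0xFFu);
    for (uint16_t k = 0; k < len; k++)
        out[i++] = data[k];

    /* CRC covers address, cmd, len and data, as listed above. */
    uint16_t crc = crc16_ccitt(&out[1], (size_t)len + 4u);
    out[i++] = (uint8_t)(crc >> 8);
    out[i++] = (uint8_t)(crc & 0xFFu);
    out[i++] = FRAME_STOP;
    return i;
}
```

The resulting bytes would then be copied into the transmit buffer and the DMA started if it is idle.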
Buffer size must be big enough to fit your requirements:
How much data at a particular time (is it continuous or a lot of data in a short time)
UART baudrate
For specific question, ask and maybe I can provide some code examples from my libraries as reference.
In this case, where you need to implement that protocol, I would probably use plain interrupts and, in the handler, use a byte-by-byte state-machine to parse the incoming bytes into a frame buffer.
Only when a complete, valid frame has been received is it necessary to signal some semaphore/event and request a scheduler run. Otherwise, you can handle any protocol error as you require - maybe tx some 'error-repeat' message and reset the state-machine to await the next start-of-frame byte.
If you use DMA for this, then the variable frame-length is going to be awkward and you STILL have to iterate the received data to validate your protocol:(
DMA doesn't sound like a good fit for this, to me...
EDIT: if no preemptive multitasker, then forget about all that semaphore gunge above:) Still, it's easier to check a boolean 'validFrameRx' flag than parse DMA block data.
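A byte-by-byte state machine for the frame in the question, [ START | LGTH | CMD_ID | DATA(LGTH) | CRC ], could be sketched as below. Assumptions are spelled out: every field except DATA is one byte, the start byte value and maximum length are arbitrary, and a plain XOR over LGTH, CMD_ID and DATA stands in for the real CRC. `rx_feed` is what the RX interrupt handler would call with each received byte.

```c
#include <stdbool.h>
#include <stdint.h>

#define RX_START   0x7Eu   /* assumed start-of-frame byte */
#define RX_MAX_LEN 32u     /* assumed maximum DATA length */

typedef enum { ST_START, ST_LEN, ST_CMD, ST_DATA, ST_CRC } rx_state;

typedef struct {
    rx_state state;
    uint8_t  len, cmd, idx, check;
    uint8_t  data[RX_MAX_LEN];
    volatile bool valid_frame;   /* set here, cleared by the main loop */
} rx_parser;

void rx_feed(rx_parser *p, uint8_t b)
{
    switch (p->state) {
    case ST_START:                       /* ignore everything until START */
        if (b == RX_START)
            p->state = ST_LEN;
        break;
    case ST_LEN:
        if (b > RX_MAX_LEN) {            /* implausible length: resync */
            p->state = ST_START;
            break;
        }
        p->len = b;
        p->check = b;
        p->idx = 0;
        p->state = ST_CMD;
        break;
    case ST_CMD:
        p->cmd = b;
        p->check ^= b;
        p->state = (p->len > 0) ? ST_DATA : ST_CRC;
        break;
    case ST_DATA:
        p->data[p->idx++] = b;
        p->check ^= b;
        if (p->idx == p->len)
            p->state = ST_CRC;
        break;
    case ST_CRC:
        if (b == p->check)
            p->valid_frame = true;       /* frame complete and consistent */
        p->state = ST_START;             /* either way, await next frame */
        break;
    }
}
```

Without an RTOS, the main loop simply polls `valid_frame`; with one, the `ST_CRC` branch is where the semaphore would be given.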

How long is a serial buffer in linux?

My question regards <termios.h>. As I understand, two buffers exist in reading something over a UART - a hardware buffer where received bytes are stored, and a software buffer where we load the stuff that has been stored in the hardware buffer. This software buffer is the second argument in read(uart_channel, BUFFER, length) as I understand.
Please explain: how long is the hardware buffer? Do I have control over how long it is? For me it is critical to read the 12 most recent bytes sent over UART by a device - how can I ensure this?
I had a similar situation once and what I did is to create a thread that kept on reading the UART (blocking read) and I used a FIFO between the threads.
If you cannot use threading, you might just use select.
Most uC I've seen have a hardware FIFO that can be set to interrupt after, say [1,2,4,8,16] bytes. If the FIFO is left 'partially full' for some small multiple of the character interval for the currently configured baud rate, the UART interrupts anyway. If you really must have access to bytes ASAP, then you need to set the FIFO 'length' to 1. Of course, your driver should do that when initializing the UART.
Failing that, I guess you could poll it:(

Ownership of frame in buffer - C programming

I am programming an interface between HW and SW. I know what result I should get, and now I am thinking about how to do it most efficiently. I have a sort of circular FIFO buffer into which the operating system will write data and from which the HW will read it. So basically I have a read pointer and a write pointer: the read pointer moves when the DMAC (DMA controller) reads data from memory, and the write pointer moves when my program writes to memory. The basic blocks in this circular FIFO buffer are called frames (that is what I call them), so I am always reading and writing frames in the buffer. Now I am wondering whether it is possible to indicate who owns a frame (HW or SW). My idea is to put a flag at the beginning of every frame to indicate whether the frame is owned by HW or SW, but I do not know whether I should do it that way or whether there is a better way to do it in C. For example, at the beginning all frames in the buffer are owned by the OS (SW). When my program finishes writing the first frame, I pass ownership to the HW (my DMA controller). When the DMA controller finishes reading from memory, I pass ownership of the frame back to the OS. So I have one way to do this, with flags at the beginning of every frame, but I am wondering whether there is a better way.
Thank you in advance on answers :)
What I did in the past was to pass the pointer to the DMA driver whenever it's done. The driver switches to the new pointer on the next clock cycle.
The DMA driver is tied to a display sync signal at 60Hz, while the application only updates the pointer at about 10Hz, but it doesn't hurt to display the old image while waiting for a new one.
I'm not sure if this fits your problem.
What I usually do is queue pointers or frame indices, rather than actual I/O data. An index into a frame array only needs to be one byte, and so is easily queued/manipulated. I put most of the indices onto a user-state pool queue at startup and the rest into a 'rxPool' queue for the rx driver/s to draw from for new rx data. Each frame has a 2-bit status field that indicates its current usage state (in user pool, in rxPool, holding tx data, holding rx data).
I queue the indices into the DMA/interrupt driver and fire it off (if not already running). When the DMA/whatever is done, I queue the tx/rx indices back to a 'scavenge' queue and signal a semaphore. The I/O driver thread (which waits on the semaphore) then signals the I/O originator thread (new rx frame), releases the index back to a 'pool' queue ready for re-use (used tx frame), or tops up the rxPool queue if it is not full.
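A stripped-down, single-threaded sketch of the index-queue idea follows. All names (`frame_t`, `idx_queue`, `frame_move`) and sizes are invented for illustration, and the semaphore signalling and interrupt protection described above are left out. Ownership is implied by which queue currently holds a frame's index, with the state field kept alongside as a cross-check.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_FRAMES  4u    /* assumed number of frames */
#define FRAME_BYTES 64u   /* assumed frame size */

/* The four usage states mentioned above fit in 2 bits. */
typedef enum { F_USER_POOL, F_RX_POOL, F_TX_DATA, F_RX_DATA } frame_state;

typedef struct {
    uint8_t state;                 /* one of frame_state */
    uint8_t bytes[FRAME_BYTES];
} frame_t;

/* Small ring of one-byte frame indices; one slot is kept free. */
typedef struct {
    uint8_t idx[NUM_FRAMES + 1u];
    uint8_t head, tail;
} idx_queue;

bool q_push(idx_queue *q, uint8_t i)
{
    uint8_t next = (uint8_t)((q->head + 1u) % (NUM_FRAMES + 1u));
    if (next == q->tail)
        return false;              /* queue full */
    q->idx[q->head] = i;
    q->head = next;
    return true;
}

bool q_pop(idx_queue *q, uint8_t *i)
{
    if (q->tail == q->head)
        return false;              /* queue empty */
    *i = q->idx[q->tail];
    q->tail = (uint8_t)((q->tail + 1u) % (NUM_FRAMES + 1u));
    return true;
}

/* Hand the oldest frame in `from` over to `to`, recording its new state. */
bool frame_move(frame_t *frames, idx_queue *from, idx_queue *to,
                frame_state new_state)
{
    uint8_t i;
    if (!q_pop(from, &i))
        return false;
    frames[i].state = (uint8_t)new_state;
    return q_push(to, i);
}
```

Only the one-byte index ever moves between queues; the frame data itself stays where it is.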

Data structure for storing serial port data in firmware

I am sending data from a linux application through serial port to an embedded device.
In the current implementation a byte circular buffer is used in the firmware. (Nothing but an array with a read and write pointer)
As the bytes come in, they are written to the circular buffer.
Now the PC application appears to be sending the data too fast for the firmware to handle. Bytes are missed, resulting in the firmware returning WRONG_INPUT too many times.
I think baud rate (115200) is not the issue. A more efficient data structure at the firmware side might help. Any suggestions on choice of data structure?
A circular buffer is the best answer. It is the easiest way to model a hardware FIFO in pure software.
The real issue is likely to be either the way you are collecting bytes from the UART to put in the buffer, or overflow of that buffer.
At 115200 baud with the usual 1 start bit, 1 stop bit and 8 data bits, you can see as many as 11520 bytes per second arrive at that port. That gives you an average of just about 86.8 µs per byte to work with. In a PC, that will seem like a lot of time, but in a small microprocessor, it might not be all that many total instructions or in some cases very many I/O register accesses. If you overfill your buffer because bytes are arriving on average faster than you can consume them, then you will have errors.
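The arithmetic behind those numbers is simple enough to check in code. The helper names are made up for this example: 10 bits per byte on the wire (1 start + 8 data + 1 stop) at 115200 baud gives 115200 / 10 = 11520 bytes per second, and 10 / 115200 s, about 86.8 µs, per byte.

```c
#include <stdint.h>

/* Maximum byte rate for a given baud rate and bits per character
 * on the wire (start + data + parity + stop). */
uint32_t bytes_per_second(uint32_t baud, uint32_t bits_per_char)
{
    return baud / bits_per_char;
}

/* Time available per byte, in microseconds. */
double byte_time_us(uint32_t baud, uint32_t bits_per_char)
{
    return 1e6 * (double)bits_per_char / (double)baud;
}
```

Multiplying the per-byte time by the number of bytes your handler can fall behind gives a first estimate of how large the ring buffer needs to be.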
Some general advice:
Don't do polled I/O.
Do use a Rx Ready interrupt.
Enable the receive FIFO, if available.
Empty the FIFO completely in the interrupt handler.
Make the ring buffer large enough.
Consider flow control.
Sizing your ring buffer large enough to hold a complete message is important. If your protocol has known limits on the message size, then you can use the higher levels of your protocol to do flow control and survive without the pains of getting XON/XOFF flow to work right in all of the edge cases, or RTS/CTS to work as expected in both ends of the wire which can be nearly as hairy.
If you can't make the ring buffer that large, then you will need some kind of flow control.
There is nothing better than a circular buffer.
You could use a slower baud rate or speed up the application in the firmware so that it can handle data coming at full speed.
If the output of the PC is in bursts it may help to make the buffer big enough to handle one burst.
The last option is to implement some form of flow control.
What do you mean by embedded device? I think most current DSPs and processors can easily handle this kind of load. The problem is not with the circular buffer, but with how you collect bytes from the serial port.
Does your UART have a hardware FIFO? If yes, then you should enable it. If you have an interrupt per byte, you can quickly get into trouble, especially if you are working with an OS or with virtual memory, where the IRQ cost can be quite high.
If your receiving firmware is very simple (no multitasking), and you don't have a hardware FIFO, polled mode can be a better solution than interrupt driven, because then your processor is doing only UART data reception, and you have no interrupt overhead.
Another problem might be with the transfer protocol. For example if you have long packet of data that you have to checksum, and you do the whole checksum at the end of the packet, then all the processing time of the packet is at the end of it, and that is why you may miss the beginning of the next packet.
So the circular buffer is fine, and you have two ways to improve:
- The way you interact with the hardware
- The protocol (packet length, acknowledgment, etc.)
Before trying to solve the problem, first you need to establish what the problem really is. Otherwise you might waste time trying to fix something that isn't actually broken.
Without knowing more about your set-up it's hard to give more specific advice. But you should investigate further to establish what exactly the hardware and software is currently doing when the bytes come in, and then what is the weak point where they're going missing.
A circular buffer with Interrupt driven IO will work on the smallest and slowest of embedded targets.
First try it at the lowest baud rate and only then try at high speeds.
Using a circular buffer in conjunction with IRQ is an excellent suggestion. If your processor generates an interrupt each time a byte is received, take that byte and store it in the buffer. How you decide to empty that buffer depends on whether you are processing a stream of data or data packets. If you are processing a stream, simply have your background process remove the bytes from the buffer and process them first-in-first-out. If you are processing packets, then just keep filling the buffer until you have a complete packet. I've used the packet method successfully many times in the past. I would implement some type of flow control as well, to signal to the PC if something went wrong (like a full buffer) or, if packet-processing time is long, to indicate to the PC when the device is ready for the next packet.
You could implement something like IP datagram which contains data length, id, and checksum.
Edit:
Then you could hard-code some fixed length for the packets, for example 1024 byte or whatever that makes sense for the device. PC side would then check if the queue is full at the device every time it writes in a packet. Firmware side would run checksum to see if all data is valid, and read up till the data length.

Resources