Why is there a timing difference in a periodic interrupt? - c

I am writing a low-level driver for a type of one-line communication protocol. This line is connected to both the Tx pin and the Rx pin on an STM32F0 micro running from the internal clock at 8 MHz. The Tx pin state is set in a timer interrupt, and the Rx pin is read in an external GPIO interrupt.
For testing, I toggle the Tx pin every 416 µs (auto-reload value is 3333 with no prescaler), and in the GPIO interrupt I read the time difference between two consecutive interrupts. The measured times are roughly 500 µs from the "High To Low" transition interrupt to the "Low To High" transition interrupt, and 300 µs from the "Low To High" transition interrupt to the "High To Low" transition interrupt. Why is there such a difference, and how do I get rid of it?
I have checked the signal on the scope and it's a perfect square wave with a pulse width of 416 µs. I also used htim->Instance->CNT = 0; and time = htim->Instance->CNT; to wrap different parts of the code to find where the difference comes from, but to no avail.
Here are the interrupt handlers; the measured time is saved in the tim3_value variable:
void TIM2_IRQHandler(void)
{
    HAL_GPIO_TogglePin(TX_GPIO_Port, TX_Pin);
    htim2.Instance->ARR = 3333;
}
void EXTI4_15_IRQHandler(void)
{
    if(__HAL_GPIO_EXTI_GET_IT(RX_Pin) != 0x00u)
    {
        tim3_value = htim3.Instance->CNT;
        htim3.Instance->CNT = 0;
    }
}

STM32 timers can have ARR preloaded (the ARPE bit in CR1). With preload enabled, a value written to ARR only replaces the internal (shadow) ARR register at the next update event. If you want the change to take effect at a particular moment, you need to generate this event yourself by writing 1 to the UG bit in the EGR register.
I strongly advise reading the STM32 Reference Manual carefully, as the magical HAL functions are not enough.
I would not do it in the interrupt anyway. STM32 timers have a mechanism called "direct transfer mode". It uses DMA to load the timer register value(s) on the chosen event. You just need to prepare the data for it, and on the update event ARR will be loaded from memory automatically.
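To make the preload behaviour described in this answer concrete, here is a minimal host-side sketch in C. It is a toy model, not HAL or CMSIS code: the timer_model_t struct and the timer_* helpers are hypothetical names, and the model assumes that preload is enabled. On real hardware, the forced update would be a register write such as TIM2->EGR = TIM_EGR_UG;.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a timer whose ARR is buffered (preload enabled).
 * Hypothetical names; this only illustrates the register semantics. */
typedef struct {
    uint32_t arr_preload; /* what software writes to ARR           */
    uint32_t arr_shadow;  /* what the counter actually compares to */
    uint32_t cnt;         /* up-counter                            */
} timer_model_t;

/* Software write to ARR: only the preload register changes. */
void timer_write_arr(timer_model_t *t, uint32_t value)
{
    assert(t != NULL);
    t->arr_preload = value;
}

/* Update event (counter overflow, or software setting UG in EGR):
 * the preload value is copied to the shadow register and CNT restarts. */
void timer_update_event(timer_model_t *t)
{
    assert(t != NULL);
    t->arr_shadow = t->arr_preload;
    t->cnt = 0;
}

/* Advance the counter by one tick. Returns 1 if an update event occurred. */
int timer_tick(timer_model_t *t)
{
    assert(t != NULL);
    if (t->cnt >= t->arr_shadow) {
        timer_update_event(t);
        return 1;
    }
    t->cnt++;
    return 0;
}
```

In this model, a call to timer_write_arr() leaves arr_shadow untouched until timer_update_event() runs, which mirrors why a new ARR value written in the ISR only applies from the following period unless an update is forced.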

There isn't enough information to pinpoint your issue.
In general, there are some causes that might affect your measurement results:
Interrupt latency: for STM32F0 it is ~16 cycles. Additionally, if there are other interrupts, they might add a delay of up to a few µs. However, it seems that this is not your case, since you mentioned that the deviations are roughly constant.
Physical properties of the input and output pins. These depend on the GPIO configuration as well as on the connection between the Tx pin and the Rx pin. In some cases, this affects the measurement result a lot. Please try to measure it with an oscilloscope.
For a better measurement, it is also recommended to use the timer module itself to get a stable result. The timer also provides an "Input Capture" feature. This approach is better because it removes the impact of interrupt latency/priority.
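With input capture, the hardware latches the counter value at each edge, so the interval is just the difference of two captured values. The following is a minimal sketch, assuming a free-running 16-bit counter clocked at 8 MHz; the function names are hypothetical, and on the target the two arguments would come from the timer's capture register rather than from constants.

```c
#include <assert.h>
#include <stdint.h>

/* Ticks elapsed between two captures of a free-running 16-bit counter.
 * Unsigned subtraction truncated to 16 bits handles a single wrap-around. */
uint16_t capture_delta_16(uint16_t previous, uint16_t current)
{
    return (uint16_t)(current - previous);
}

/* Convert ticks to whole microseconds for a given timer clock in Hz. */
uint32_t ticks_to_us(uint32_t ticks, uint32_t timer_clock_hz)
{
    assert(timer_clock_hz != 0);
    return (uint32_t)(((uint64_t)ticks * 1000000u) / timer_clock_hz);
}
```

Because the counter is never reset in software, this avoids the extra uncertainty introduced by writing CNT = 0 inside the ISR.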


Clearing or preventing pending interrupts in an ISR

An ISR necessarily causes its own trigger pin to toggle randomly multiple times. These toggles (during the ISR) should be ignored, but aren't, and result in another interrupt being set as pending and executed afterwards.
I have a serial bit-bang device that I read via an interrupt. Device has a data pin and a "clock" pin. Data pin is HIGH by default. When the device is ready to be read, it pulls this data pin LOW. After this falling edge, each pulse on the "clock" pin shifts one bit out to the data pin.
An interrupt triggers on the falling edge of the data pin, and the ISR bangs 24 bits of data out of the same data pin. Therefore, additional random falling edges on the data pin cause another interrupt to be set as pending, which triggers immediately after the actual ISR has returned, resulting in two consecutive interrupts being run per one "real" interrupt.
I have tried multiple ways to disable interrupts and/or clear pending interrupts, none of which seem to have any effect whatsoever. I suspect that this is because manipulating interrupt-related registers is not allowed or is ignored in an ISR.
The device is an Atmel ATSAMD21 (ARM Cortex-M0+). Code is built under Atmel Studio with optimisation level -Og. I am okay with using ASF and/or SAM libraries/definitions, ARM CMSIS or bare-metal register manipulation. Whichever happens to work.
Here is what I tried so far:
void interrupt_cb ( void )
{
    // Trying to disable interrupts
    // Executed at the beginning of the ISR
    ext_irq_disable( <pin> );
    // body
    // < code that results in same pin >
    // < that the interrupt is triggered >
    // < to be toggled randomly. >
    // Trying to clear pending interrupts
    // Executed just before the ISR returns.
    NVIC_ClearPendingIRQ( EIC_IRQn );
    NVIC->ICPR[0] |= 4; //probably same as the above
}
Or a combination of these commands.
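One detail of the attempt above can be checked on paper. NVIC->ICPR[0] is a write-1-to-clear bit mask indexed by IRQ number, so the mask for an interrupt is 1 shifted left by its IRQ number, not the IRQ number itself. The sketch below assumes EIC_IRQn is 4, as in the SAMD21 device headers; EXAMPLE_EIC_IRQN and nvic_irq_mask are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: EIC_IRQn is 4 on the SAMD21 (check the device header). */
#define EXAMPLE_EIC_IRQN 4u

/* Bit mask for one IRQ in a 32-bit NVIC set/clear-pending register. */
uint32_t nvic_irq_mask(uint32_t irqn)
{
    assert(irqn < 32u); /* Cortex-M0+ has at most 32 external interrupts */
    return (uint32_t)1u << irqn;
}
```

Under that assumption, the value 4 written in the snippet above corresponds to IRQ 2 rather than to the EIC, so the two "clear pending" lines are probably not equivalent. On the target, a plain assignment such as NVIC->ICPR[0] = nvic_irq_mask(EIC_IRQn); is enough, since writing 0 bits has no effect. This alone may not remove the second interrupt: the EIC also keeps its own per-line flag (INTFLAG), which likely needs clearing as well before the NVIC pending bit stays cleared.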

STM32 DMA from timer count to memory

I'm using an STM32H743. I have an external clock signal coming in on a GPIO pin, and I want to very accurately measure elapsed time between each rising (or falling) edge in the external clock signal. So I set things up so that TIM4 is triggered by the external clock, and TIM5 is triggered by the internal oscillator.
I wrote an IRQ so that whenever TIM4 triggers, an interrupt runs that captures TIM5's value. It seems to work OK, but I'm wondering if I can do it through DMA to avoid all the context switching and free up the CPU. Basically I want to set up a DMA so that each TIM4 event initiates a DMA transfer that copies the TIM5 counter value to a circular buffer somewhere.
I've searched through forums and the DMA documentation but I'm hazy on whether a timer register can be a valid DMA source. I was thinking maybe I could do something like this:
hDma->PAR = (uint32_t) &htim5.Instance->CNT;
hDma->M0AR = (uint32_t) myBufferPtr;
hDma->NDTR = myBufferSize;
hDma->CR |= (uint32_t)DMA_SxCR_EN;
But I'm not sure if this can work.
Short version: Can I use the timer's CNT register as a DMA transfer source? Would it be a peripheral-to-memory transfer? Or a memory-to-memory transfer? Are there other flags I need to make this work? Or is it not possible? Or is there another STM32 feature that would make it easier to count time between pulses?
I must confess that my practical experience with STM32, long as it is, has so far been limited to mainstream controller families like STM32F0, STM32F3, STM32F4 and STM32L4.
Therefore I'm answering based on what those controllers would offer you in your situation.
The STM32H7 series is much more powerful, and it additionally offers several DMA technologies such as DMA2D and MDMA, plus lots of other features that I'm not sure about.
But I think a simplified answer might also help you for now, so I'm daring to write it.
Can I use the timer's CNT register as a DMA transfer source? Would it be a peripheral-to-memory transfer? Or a memory-to-memory transfer? Are there other flags I need to make this work? Or is it not possible?
I would expect this to work.
I don't see a reason not to read the TIMx_CNT register in a DMA transfer.
The CNT register is definitely a peripheral address so you have to configure it as a peripheral-to-memory transfer.
I believe that the peripheral/memory separation refers to the bus from which the DMA controller fetches the data (or to which bus it delivers them) in the bus matrix implemented in every STM32.
Or is there another STM32 feature that would make it easier to count time between pulses?
Yes, there is:
Many of the TIM peripherals (not all are the same) offer you a feature called "Input Capture" that connects the channel (sub-)peripheral of the TIM instance to the input and has the main part of the (same!) TIM peripheral do the internal clocking.
A prerequisite for this is that the pin you'd like to measure has a TIMx_CHy alternate function, not "only" a TIMx_ETR one.
The TIM peripherals offer a rich range of different configuration options, which is a complicated mess as long as you haven't got used to it.
As an introduction and a good overview, I recommend two application notes from ST:
AN4013 Application note. "STM32 cross-series timer overview", Rev.8
Which timers you have on your µC, and which features are offered by which one.
AN4776 Application note. "General-purpose timer cookbook for STM32 microcontrollers", Rev.3
How to use the timers you have. Check out section 2.6, input capture is on page 27.
Looking up those two, I found a third one you might want to check out for better precision, related to HRTIM timers:
AN4539 Application note. "HRTIM cookbook", Rev.4
It is easily done using STM32CubeIDE configurator:
Configure the timer, enable the input capture channel, and enable DMA (mode circular, peripheral to memory, data width word/word). Enable
Prepare a buffer for storing the captured counter values.
Start IC in DMA mode before the main loop.
For high-speed operation you may copy data from timerCaptureBuffer to timerCaptureBufferSafe inside these callbacks. For example, use a DMA memory-to-memory transfer to minimize the time spent in the HAL_TIM_IC_CaptureHalfCpltCallback and HAL_TIM_IC_CaptureCallback interrupts. Process adjacent captured values stored in timerCaptureBufferSafe after the DMA memory-to-memory callback signals that the data is ready. You may use signaling flags so that timerCaptureBufferSafe is not overwritten.
Here is an example:
#define TIM_BUFFER_SIZE 128
uint32_t timerCaptureBuffer[TIM_BUFFER_SIZE];
uint32_t timerCaptureBufferSafe[TIM_BUFFER_SIZE];
// ...
// ...
HAL_TIM_IC_Start_DMA(&htim2, TIM_CHANNEL_1, (uint32_t *)timerCaptureBuffer, TIM_BUFFER_SIZE);
// ...
void HAL_TIM_IC_CaptureHalfCpltCallback(TIM_HandleTypeDef *htim)
// ...
void HAL_TIM_IC_CaptureCallback(TIM_HandleTypeDef *htim)
// ...
void myDMA_Callback22(DMA_HandleTypeDef *_hdma)
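The "process adjacent captured values" step can be sketched as below. This is a host-side illustration, not part of the answer's code: captures_to_periods is a hypothetical helper, and it assumes a free-running 32-bit counter so that unsigned subtraction absorbs a single wrap-around between two edges.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill periods[i] with the tick count between captures[i] and captures[i + 1].
 * Assumes a free-running 32-bit counter; unsigned subtraction absorbs one wrap.
 * Returns the number of periods written (count - 1, or 0 if count < 2). */
size_t captures_to_periods(const uint32_t *captures, size_t count, uint32_t *periods)
{
    assert(captures != NULL && periods != NULL);
    if (count < 2) {
        return 0;
    }
    for (size_t i = 0; i + 1 < count; i++) {
        periods[i] = captures[i + 1] - captures[i];
    }
    return count - 1;
}
```

It could be called on timerCaptureBufferSafe once a copy is complete. Note that this sketch does not pair the last capture of one buffer half with the first capture of the next; a real driver would need to carry that value over.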

How reliable is DMA to GPIO on STM32 MCUs?

ST has some application notes that talk about emulating a parallel bus using DMA to GPIO. I appreciate that, but it doesn't answer some important questions. I am looking through the reference manual, and I can't seem to find anything that clarifies the things that I am concerned about.
I am most concerned about the jitter. The reference manual repeatedly states, that when DMA is triggered (e.g., by a timer), the DMA controller will read the memory and transfer the value to the peripheral. That might be fine with peripherals that have their own FIFO. There, when space is available in the FIFO, DMA is triggered and fills the FIFO. That will probably happen before the FIFO runs empty.
But with GPIO, if the DMA channel doesn't have a FIFO itself, the data will not be ready when the timer triggers, and it needs to be fetched from SRAM. So between the timer triggering and the value actually arriving in the GPIO output register, some time may pass. This might be measurable when looking at the clock output by the timer and the GPIO pins. The DMA controller has to compete for access to the SRAM with the running program, so certain activities by the program may increase the jitter.
Maybe that is a colossal oversight on my part, but ST's reference manual doesn't seem to mention a FIFO as part of the DMA. If that is the case, that would result in jitter which may impact performance at higher frequencies.
I need to toggle 3 to 4 pins synchronously to a clock from 100 kHz to 1 MHz. I am considering DMA to GPIO and also abusing a QuadSPI controller. I am currently testing on an STM32L4 but I'm also considering STM32F4 or even F1.
DMA to/from GPIO is just a memory-to-memory transfer. Many STM32 µCs have built-in DMA FIFOs, but they are of no use here.
The core always has priority over the DMA, so if that turns out to be an issue (very unlikely), place the data that the core accesses while the DMA is active in a separate memory area, for example CCM (if your µC has one).
Answering the question:
Memory to/from GPIO is very reliable; I personally have not had any problems with it.
If your clock can be anything between 100 kHz and 1 MHz, I guess you're not worried about jitter in the clock itself, only jitter in the data versus the clock. If your clock need not be continuous, a novel idea then is to do some preprocessing of the data to include the clock signal as part of the GPIO data. Then you could trigger the DMA at regular intervals using a timer, and you'll get the data frequency on the bus at half that rate with perfect alignment between clock and data.
So if you want to send the four-bit data 5 6 B D with data valid on the positive clock edge, prepare the DMA buffer like so: 05 15 06 16 0B 1B 0D 1D, and connect GPIO pin 4 as the clock. Leave a final byte in the buffer to reset the clock/bus to the idle state, if you need to.
You can of course extend the idea and incorporate control signals such as chip selects and tri-state signals for external buffers, if needed.
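The preprocessing step described above can be sketched as follows. This is a minimal illustration, assuming data on GPIO pins 0 to 3 and the clock on pin 4 as in the example; build_clocked_buffer and CLOCK_BIT are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CLOCK_BIT 0x10u /* GPIO pin 4 carries the clock, pins 0..3 the data */

/* Expand each 4-bit value into two output bytes: data with clock low,
 * then the same data with clock high (data valid on the rising edge).
 * `out` must have room for 2 * count bytes. Returns bytes written. */
size_t build_clocked_buffer(const uint8_t *nibbles, size_t count, uint8_t *out)
{
    assert(nibbles != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        uint8_t data = (uint8_t)(nibbles[i] & 0x0Fu);
        out[2 * i]     = data;
        out[2 * i + 1] = (uint8_t)(data | CLOCK_BIT);
    }
    return 2 * count;
}
```

For the input 5 6 B D this produces 05 15 06 16 0B 1B 0D 1D. A trailing idle byte, chip selects or other control bits would be appended or OR-ed in by the caller.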
Also take note that not all DMA blocks may have access to the AHB bus which is holding the GPIO registers. For example on STM32F40x, only DMA2 can be used (this is what got me, until I read this answer https://stackoverflow.com/a/46619315/6552613).
I haven't fully explored this space yet, but disabling interrupts and polling for interrupt flags in my main loop made the jitter on my GPIO DMA basically disappear! Granted, it might just be the set of interrupts I have enabled, but everything down to the SysTick timer was killing me. Polling the interrupt flags in the main loop seems to have fixed my issue.
Note that this is on an STM32F042, and I never exceed 6 MHz for my period. When I try to, i.e. try to go to 8 MHz sampling out, everything falls apart. YMMV

Raspberry: how does the PWM via DMA work?

I read that the driver for "Software PWM" is running somehow on the PWM-HW and accessing all GPIOs without using the CPU. Can someone explain how that works? Is there a second processor in the Raspberry Pi used for the PWM and PCM modules (is there a diagram for the blocks)?
The question is related to this excellent driver which I used a lot in my robots.
Here is the explanation, which I unfortunately don't understand...
"The driver works by setting up a linked list of DMA control blocks with the
last one linked back to the first, so once initialised the DMA controller
cycles round continuously and the driver does not need to get involved except
when a pulse width needs to be changed. For a given period there are two DMA
control blocks; the first transfers a single word to the GPIO 'clear output'
register, while the second transfers some number of words to the PWM FIFO to
generate the required pulse width time. In addition, interspersed with these
control blocks is one for each configured servo which is used to set an output.
While the driver does use the PWM peripheral, it only uses it to pace the DMA
transfers, so as to generate accurate delays."
Is the following understanding right:
The DMA controller is like a second processor. You can run code on it. So it is used here to control the high/low states of all the Raspberry GPIO pins together with the PWM block. The DMA controller does this continuously. There is probably more than one DMA controller in the Raspberry, so the speed of the Linux OS is not influenced much by one missing DMA controller.
I don't understand how exactly DMA and PWM work together.
I recommend reading RPIO source code together with ServoBlaster's, as it's slightly simplified and can help understanding. Also very important: Broadcom's BCM2835 manual which contains all the tiny details.
is there a diagram for the blocks
The manual contains all the functionalities offered by the chip (not in a block diagram though, as far as I’ve seen).
Is the following understanding right:
The DMA controller is part of the main chip (Broadcom, although I think the same happens on desktop CPUs). It can't exactly run code, but it can copy memory across peripherals by itself, without consuming the main processor’s time. The DMA controller has different channels which can copy memory independently and runs independently of the CPU.
It is configurable via "control blocks" (BCM manual page 40): you can tell the DMA controller to first copy memory from A to B, then from C to D, and so on.
don't understand how exactly DMA and PWM work together
DMA is used to send data to the PWM controller ("Pulse Width Modulator", BCM manual page 138, chap. 9), which consumes the data and this creates a very precise delay. Interestingly, the PWM controller is... not used to generate any PWM pulse, but just to wait.
Can someone explain how that works?
Ultimately, you configure the value of the GPIO pins (or the settings of the PWM or PCM generator), by setting memory at a special address; the memory in that region represents the peripheral configuration (BCM manual page 89, chapter 6).
So the idea is: copy 1 onto the memory that controls the GPIO pin value, using the DMA controller; wait the pulse width; copy 0 onto the GPIO pin value; wait the remaining part of the period; loop. Since the DMA controller does it, it doesn't consume CPU cycles.
The key point here is being able to make the DMA controller "wait" an exact amount of time, and for this, RPIO and ServoBlaster use the PWM controller in FIFO mode (the PCM generator also has such functionality, but let's stick to PWM). This means that the PWM controller will "send" the data it reads from its so-called FIFO queue, and then stop. It doesn't matter how it's "sent" (BCM manual page 139, 9.4 MSENi=0), the key point is that it requires a fixed amount of time. As a matter of fact, it doesn't even matter which data is sent: the DMA controller is configured to write into the FIFO queue and then wait until the PWM controller has finished sending data, and this creates a very precise delay.
The resolution of the resulting pulse is given by the duration of the PWM transfer, which depends on the frequency at which the PWM controller is running.
As an example, suppose we have a maximum resolution of 1 ms (given by the PWM delay), and we want a pulse with a 25% duty cycle at a frequency of 125 Hz. The period of a pulse is thus 8 ms. The DMA operations performed will be:
1. Set the pin to 1 (DMA write to GPIO mem)
2. Wait 1ms (DMA write to PWM FIFO)
3. Wait 1ms (DMA write to PWM FIFO)
4. Set the pin to 0 (DMA write to GPIO mem)
5. Wait 1ms (DMA write to PWM FIFO)
6. ...repeat "Wait 1ms" 4 more times.
7. Wait 1ms (DMA write to PWM FIFO) and jump back to 1.
This will thus require at least 10 DMA control blocks (8 wait instructions, given by period / delay plus 2 write operations).
Note: in ServoBlaster and RPIO, it will consume exactly 16 DMA control blocks, because (for higher precision), they always perform a "memory copy" operation before a "wait operation". The "memory copy" operation is just a dummy unless it needs to change the pin value.
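The block counts quoted above follow from simple arithmetic, which the sketch below reproduces. It is only an illustration of the counting described in this answer, with hypothetical function names; it does not build real BCM2835 control blocks.

```c
#include <assert.h>
#include <stdint.h>

/* Number of fixed-length wait slots in one PWM period. */
uint32_t wait_slots(uint32_t period_us, uint32_t slot_us)
{
    assert(slot_us != 0 && period_us % slot_us == 0);
    return period_us / slot_us;
}

/* Minimal scheme: one block per wait slot plus one "set" and one "clear" write. */
uint32_t min_control_blocks(uint32_t period_us, uint32_t slot_us)
{
    return wait_slots(period_us, slot_us) + 2u;
}

/* Scheme described for ServoBlaster/RPIO: a (possibly dummy) memory copy
 * before every wait, so two blocks per slot. */
uint32_t paired_control_blocks(uint32_t period_us, uint32_t slot_us)
{
    return 2u * wait_slots(period_us, slot_us);
}

/* Wait slots during which the pin stays high, for a duty cycle in percent. */
uint32_t high_slots(uint32_t period_us, uint32_t slot_us, uint32_t duty_percent)
{
    assert(duty_percent <= 100u);
    return wait_slots(period_us, slot_us) * duty_percent / 100u;
}
```

For the example of an 8 ms period with 1 ms slots and a 25% duty cycle, this gives 8 wait slots, 2 of them with the pin high, 10 blocks for the minimal scheme and 16 for the paired scheme.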

I2C and timer interrupt (timer1)

I'm trying to read from multiple I2C slave devices using a dsPIC33 microcontroller.
I was hoping someone could advise me on the correct method to use a timer interrupt (in this case timer1) and collect the I2C data.
So far I can collect data fine from the I2C slave devices by looping in a while loop, but since attempting to add a timer interrupt (so I can apply my own sampling rate rather than 'collect as fast as you can') my I2C software driver is getting stuck.
I've tried with a very low timer speed (1 Hz at the moment) and I2C is at the standard 100 kHz speed. The PIC is running at 80 MHz.
What is the correct method to use timers and I2C modules? I've had a look online and it seems it could be a matter of interrupt priority as when using timer1 I have an interrupt (I2C) within an interrupt (timer1), although no luck so far.
It seems I managed to solve my own problem, and fairly quickly too.
It turned out to be an interrupt priority problem; I had previously set my timer1 to priority 7 (highest):
IPC0bits.T1IP = 0b111; // Timer1 Interrupt priority level=7
Changing this to priority 1 solved the problem:
IPC0bits.T1IP = 0b001; // Timer1 Interrupt priority level=1
Hope this helps someone else who comes across this issue. My guess is that the different priorities conflict with the I2C interrupt.
