PWM DMA to a whole GPIO

PWM DMA to a whole GPIO - arm

I've got an STM32F4, and I want to PWM a GPIO port that's been OR'd with a mask..
So, maybe we want to PWM 0b00100010 for awhile at 200khz, but then, 10khz later, we now want to PWM 0b00010001...then, 10kHz later, we want to PWM some other mask on the same GPIO.
My question is, how do you do this with DMA? I'm trying to trigger a DMA transfer that will set all the bits on a rising edge, and then another DMA transfer that will clear all the bits on a falling edge.
I haven't found a good way to do this, (at least with CubeMX and my limited experience with C & STM32's) as it looks like I only get a chance to do something on a rising edge.
One of my primary concerns is CPU time, because although I mention hundreds of kilohertz in the above example, I'd like to make this framework very robust in-so-far as it isn't going to be wasteful of CPU resources...That's why I like the DMA idea, since it's dedicated hardware doing the mindless lifting of a word here to a word there type of stuff, and the CPU can do other things like crunch numbers for a PID or something.
Edit
For clarity : I have a set of 6 values that I could write to a GPIO. These are stored in an array.
What I'm trying to do is set up a PWM timer to set the GPIO during the positive width of the PWM and then I want the GPIO to be set to 0b00000000 during the low period width if the pwm.
So, I need to see when the rising edge is, quickly write to the gpio, then see when the falling edge is, and write 0 to the gpio.

Limited solution without DMA
STM32F4 controllers have 12 timers with up to 4 PWM channels each, 32 in total. Some of them can be synchronized to start together, e.g. you can have TIM1 starting TIM2, TIM3, TIM4 and TIM8 simultaneously. That's 20 synchronized PWM outputs. If it's not enough, you can form chains where a slave timer is a master to another, but it'd be quite tricky to keep all of them perfectly synchronized. Not so tricky, if an offset of a few clock cycles is acceptable.
There are several examples in the STM32CubeF4 library example projects section, from which you can puzzle together your setup, look in Projects/*_EVAL/Examples/TIM/*Synchro*.
General solution
A general purpose or an advanced timer (that's all of them except TIM6 and TIM7) can trigger a DMA transfer when the counter reaches the reload value (update event) and when the counter equals any of the compare values (capture/compare event).
The idea is to let DMA write the desired bit pattern to the low (set) half of BSRR on a compare event, and the same bits to the high (reset) half of BSRR on an update event.
There is a problem though, that DMA1 cannot access the AHB bus at all (see Fig. 1 or 2 in the Reference Manual), to which the GPIO registers are connected. Therefore we must use DMA2, and that leaves us with the advanced timers TIM1 or TIM8. Things are further complicated because DMA requests caused by update and compare events from these timers end up on different DMA streams (see Table 43 in the RM). To make it somewhat simpler, we can use DMA 2, Stream 6 or Stream 2, Channel 0, which combine events from 3 timer channels. Instead of using the update event, we can set the compare register on one timer channel to 0.
Set up the DMA stream of the selected timer to
channel 0
single transfer (no burst)
memory data size 16 bit
peripheral data size 16 bit
no memory increment
peripheral address increment
circular mode
memory to peripheral
peripheral flow controller: I don't know, experiment
number of data items 2
peripheral address GPIOx->BSRR
memory address points to the output bit pattern
direct mode
at last, enable the channel.
Now, set up the timer
set the prescaler and generate an update event if required
set the auto reload value to achieve the required frequency
set the compare value of Channel 1 to 0
set the compare value of Channel 2 to the required duty cycle
enable DMA request for both channels
enable compare output on both channels
enable the counter
This way you can control 16 pins with each timer, 32 if using both of them in master-slave mode.
To control even more pins (up to 64) at once, configure the additional DMA streams for channel 4 compare and timer update events, set the number of data items to 1, and use ((uint32_t)&GPIOx->BSRR)+2 as the peripheral address for the update stream.
Channels 2 and 4 can be used as regular PWM outputs, giving you 4 more pins. Maybe Channel 3 too.
You can still use TIM2, TIM3, TIM4, and TIM5 (each can be slaved to TIM1 or TIM8) for 16 more PWM outputs as described in the first part of my post. Maybe TIM9 and TIM12 too, for 4 more, if you can find a suitable master-slave setup.
That's 90 pins toggling at once. Watch out for total current limits.

what PWM 0b00100010 means? PWM is a square wave with some duty ratio. it wil be very difficult to archive using DMA but you will need to have table with already calculated values. For example to have 2kHz PWM with 10% ratio you will need to have 10 samples one with bit set, nine with bit zeroed. You configure the timer to 20k / sec trigger mem-to-mem (GPIO has to be done this way) DMA transmission in the circular mode. On the pin you will have 2kHz 10% wave. The PWM resolution will be 10%. If you want to make it 0.5% you will need 200 samples table and DMA triggered 400k times per second.
IMO it is better to use timer and DMA to load new values to it (read about the burst DMA mode in the timer documentation in the Reference Manual)

Related

STM32: How to configure timer to trigger interrupt in every increments in quadrature encoder mode?

I have a rotary encoder with STM32F4 and configured TIM4 in "Encoder Mode TI1 and TI2". I want to have an interrupt every time the value of timer is incremented or decremented.
The counting works but I only can configure an interrupt on every update event, not every changes in TIM4->cnt. How can I do this?
In other words: My MCU+Encoder in quadrature mode could count from 0 to 99 in one revolution. I want to have 100 interrupts in the revolution but if I set TIM4->PSC=0 and TIM4->ARR=1, results 50 UPDATE_EVENTs, so I should set ARR=0 but it does not work. How can I sole that?

To get 100 interrupts per revolution keep PSC=0, ARR=1, setup the two timer channels in output compare mode with compare values 0 and 1 and interrupts on both channels.
Or even use ARR=3 and setup all four channels, with compare values of 0,1,2 and 3. This will allow to detect the direction.

Normally, the whole point of using the quadrature encoder mode is counting the pulses while avoiding interrupts. You can simply poll the counter register periodically to determine speed and position.
Getting interrupts on every encoder pulse is extremely inefficient, especially with high resolution encoders. Yours seems to be a low resolution one. If you still think you need them for some reason, you can connect A & B into external interrupts and implement the counting logic manually. In this case, you don't need quadrature encoder mode.

How reliable is DMA to GPIO on STM32 MCUs?

ST has some application notes that talk about emulating a parallel bus using DMA to GPIO. I appreciate that, but it doesn't answer important questions. I am looking through the reference manual, and I can't seem to find clarify the things that I am concerned about.
I am most concerned about the jitter. The reference manual repeatedly states, that when DMA is triggered (e.g., by a timer), the DMA controller will read the memory and transfer the value to the peripheral. That might be fine with peripherals that have their own FIFO. There, when space is available in the FIFO, DMA is triggered and fills the FIFO. That will probably happen before the FIFO runs empty.
But with GPIO, if the DMA channels doesn't have a FIFO itself, the data will not be ready when the timer triggers and it needs to be fetched from SRAM. So between the timer triggering and between the value actually arriving in the GPIO output register, some time may pass. This might be measurable when looking at the clock output by the timer and the GPIO pins. The DMA controller has to compete for access to the SRAM with the running program, so certain activities by the program may increase the jitter.
Maybe that is a colossal oversight on my part, but ST's reference manual doesn't seem mention a FIFO as part of the DMA. If that is the case, that would result in jitter which may impact performance at higher frequencies.
I need to toggle 3 to 4 pins synchronously to a clock from 100kHz to 1MHz. I am considering DMA to GPIO and also abusing a QuadSPI controller. I am currently testing on a STM32L4 but I'm also considering STM32F4 or even F1.

DMA to/from GPIOit is just memory-to-memory transfer. Many STM32 uCs have built in DMA FIFOs - but they will have not use here.
The core has always priority over the DMA so if it can be the issue (very unlikely) place the core accesible data (this data which uC will access when DMA is active in the separate memory area - for example CCM (if your uC has one)
Answering the question
memory to/FROM GPIO is very reliable - I personally did not have any problems with it.

If your clock can be anything between 100 kHz and 1 MHz, I guess you're not worried about jitter in the clock itself, only jitter in the data versus the clock. If your clock need not be continuous, a novel idea then is to do some preprocessing of the data to include the clock signal as part of the GPIO data. Then you could trigger the DMA at regular intervals using a timer, and you'll get the data frequency on the bus at half that rate with perfect alignment between clock and data.
So if you you want to send the four-bit data 5 6 B D with data valid on the positive clock edge, prepare the DMA buffer as so: 05 15 06 16 0B 1B 0D 1D and connect the GPIO pin 4 as the clock. Leave a final byte in the buffer to reset the clock/bus to idle state, if you need.
You can of course extend the idea and incorporate control signals such as chip selects and tri-state signals for external buffers, if needed.
Also take note that not all DMA blocks may have access to the AHB bus which is holding the GPIO registers. For example on STM32F40x, only DMA2 can be used (this is what got me, until I read this answer https://stackoverflow.com/a/46619315/6552613).

I haven't fully explored this space yet, but, by disabling interrupts and polling for interrupt flags in my main loop, it's made the jitter on my GPIO DMA basically disappear! Granted it might just be the set of interrupts have enabled, but everything down to the systick timer was killing me. By polling the interrupts in the main loop it seems to have fixed my issue.
Note that this is on an STM32F042, and I never exceed 6 MHz for my period. When I try to, i.e. try to go to 8 MHz sampling out, everything falls apart. YMMV

Raspberry: how does the PWM via DMA work?

I read that the driver for "Software PWM" is running somehow on the PWM-HW and acessing all GPIOs without using the CPU. Can someone explain how that works? Is there a second processor in the Raspberry Pi used for PWM and PCM module(is there a diagram for the blocks)?
The question is related to this excellent driver which I used a lot in my robots.
Here is the explanation, which I unfortunately don't understand...
The driver works by setting up a linked list of DMA control blocks with the
last one linked back to the first, so once initialised the DMA controller
cycles round continuously and the driver does not need to get involved except
when a pulse width needs to be changed. For a given period there are two DMA
control blocks; the first transfers a single word to the GPIO 'clear output'
register, while the second transfers some number of words to the PWM FIFO to
generate the required pulse width time. In addition, interspersed with these
control blocks is one for each configured servo which is used to set an output.
While the driver does use the PWM peripheral, it only uses it to pace the DMA
transfers, so as to generate accurate delays."
Is the following understanding right:
The DMA controller is like a second processor. You can run code on it. So it is used here to control all the Raspberry GPIO pins high/low states together with the PWM block. DMA Controller does this continously. There are probably more than one DMA controller in the Raspberry, so the speed of the OS Linux is not influenced much due to one missing DMA controller.
I don't understand how exactly DMA and PWM work together.

I recommend reading RPIO source code together with ServoBlaster's, as it's slightly simplified and can help understanding. Also very important: Broadcom's BCM2835 manual which contains all the tiny details.
is there a diagram for the blocks
The manual contains all the functionalities offered by the chip (not in a block diagram though, as far as I’ve seen).
Is the following understanding right:
The DMA controller is part of the main chip (Broadcom, although I think the same happens on desktop CPUs). It can't exactly run code, but it can copy memory across peripherals by itself, without consuming the main processor’s time. The DMA controller has different channels which can copy memory independently and runs independently of the CPU.
It is configurable via "control blocks" (BCM manual page 40, 4.2.1.1): you can tell the DMA controller to first copy memory from A to B, then from C to D and so on.
don't understand how exactly DMA and PWM work together
DMA is used to send data to the PWM controller ("Pulse Width Modulator", BCM manual page 138, chap. 9), which consumes the data and this creates a very precise delay. Interestingly, the PWM controller is... not used to generate any PWM pulse, but just to wait.
Can someone explain how that works?
Ultimately, you configure the value of the GPIO pins (or the settings of the PWM or PCM generator), by setting memory at a special address; the memory in that region represents the peripheral configuration (BCM manual page 89, chapter 6).
So the idea is: copy 1 onto the memory that controls the GPIO pin value, using the DMA controller; wait the pulse width; copy 0 onto the GPIO pin value; wait the remaining part of the period; loop. Since the DMA controller does it, it doesn't consume CPU cycles.
The key point here is being able to make the DMA controller "wait" an exact amount of time, and for this, RPIO and ServoBlaster use the PWM controller in FIFO mode (the PCM generator also has such functionality, but let's stick to PWM). This means that the PWM controller will "send" the data it reads from its so-called FIFO queue, and then stop. It doesn't matter how it's "sent" (BCM manual page 139, 9.4 MSENi=0), the key point is that it requires a fixed amount of time. As a matter of fact, it doesn't even matter which data is sent: the DMA controller is configured to write into the FIFO queue and then wait until the PWM controller has finished sending data, and this creates a very precise delay.
The resolution of the resulting pulse is given by the duration of the PWM transfer, which depends on the frequency at which the PWM controller is running.
Example
We have a maximum resolution of 1ms (given by the PWM delay), and we want to have a pulse of 25% duty cycle with frequency 125Hz. The period of a pulse is thus 8ms. The DMA operation performed will be
Set pin to 1 (DMA write to GPIO mem)
Wait 1ms (DMA write to PWM FIFO)
Wait 1ms (DMA write to PWM FIFO)
Set the pin to 0 (DMA write to GPIO mem)
Wait 1ms (DMA write to PWM FIFO)
...repeat "Wait 1ms" 4 more times.
Wait 1ms (DMA write to PWM FIFO) and jump back to 1.
This will thus require at least 10 DMA control blocks (8 wait instructions, given by period / delay plus 2 write operations).
Note: in ServoBlaster and RPIO, it will consume exactly 16 DMA control blocks, because (for higher precision), they always perform a "memory copy" operation before a "wait operation". The "memory copy" operation is just a dummy unless it needs to change the pin value.

STM32 - How to trigger interrupt after a certain PWM ON time?

I'm new to ARM MCUs (STM32F411), and I have been trying to find my way around the peripherals using STM's HAL library and STM32Cube.
I've already configured my board in order to use some peripherals:
Timer 2 for running an interrupt with a certain frequency
Timer 3 for running PWMs on 3 channels of it.
ADC with 4 channels, into DMA mode, for reading some analog input.
Let us suppose, now, that the PWM's whole period is 100 ms and its duty cycle is 50% (50 ms PWM on and 50 ms PWM off).
I would like to trigger an interrupt after a certain time of the PWM on level, let us say 50% of it.
Hence, I would like to run an interrupt at 25 ms in order to use the ADC for sampling it's analog inputs.
Do you have any suggestion on how could I implement such a kind of interrupt?
Thank you in advance for your help!

Since the ADC of the STM32F411 is used in Regular mode (not Injected mode) and only three channels out of four are used to generate PWM on Timer 3, the fourth channel can be used to trigger the ADC.
Hence Timer 3 is configured as follows:
CH1 used for Output Compare mode 0 (TIM3->CCMR1.OC1M = 0)
CH2, CH3, CH4 used for PWM outputs
Therefore TIM3->CCR1 is loaded to a value that gives 25% of duty, then it will generate TIM3_CH1 events that can be used to trigger ADC start-of-conversion at 25% of your TIM3 timebase.

Generate/ output clock pulse ( C code )

Im using Ethernut 2.1 B and I need a C program that outputs a clock signal at the timer 1 output B, with other words on output OCIB. The frequency of the clock signal should be at 1.0 kHz.
Anyone know how this could be done?

You need to look in COM bits for your timer. For instance, for Timer0 (8-bit), the COM bits are set in the TCCR0 register. Probably the setting you'd be interested in is
TCCR0 |= (0<<COM1)|1<<COM0); // Toggle OC0 on compare match
This will toggle the OC0 (pin14) line when timer reaches the specified value.
Which timer you use depends on the precision you need: obviosely the 16-bit timers can give you more precise time resolution then the 8-bit timers.
The setting of the registers for your specific frequency (1Khz) depends on the clock speed of your chip, and which timer you are using: the timers use a pre-scaled general clock signal (see table 56 of the datasheet for possible values). This means that the prescaler settings will depend on your clock speed, and how high you want to count. For most precision you will want to count as high as possible, which means the lowest possible prescaler setting compatible with your timer's maximum value.
As far as where to start, generally, reading the datasheet is a good place, but googling "AVR timer" can also be very helpful.

It seems to be based on the Atmel ATmega 128, so read that CPU's data sheet to figure out how to program the timer hardware.
Not sure if this microcontroller supports directly driving an output from a timer, if it doesn't you're going to have to do it in software from the interrupt service routine.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight