I am trying to emulate the clock control for an STM32 machine with a Cortex-M4 CPU. The STM32 reference manual states that the clock supplied to the core is HCLK.
The RCC feeds the external clock of the Cortex System Timer (SysTick) with the AHB clock (HCLK) divided by 8. The SysTick can work either with this clock or with the Cortex clock (HCLK), configurable in the SysTick control and status register.
The Cortex-M4 is already emulated by QEMU, and I am using that model for my STM32 emulation. My confusion is this: should the "HCLK" I have developed for the STM32 send clock pulses to the Cortex-M4, or does the Cortex-M4 model manage its own clock at the HCLK frequency of 168MHz? Or is the clock frequency different?
If I have to pass this frequency to the Cortex-M4, how do I do that?
QEMU's emulation does not generally try to emulate actual clock lines which send pulses at megahertz rates (this would be incredibly inefficient). Instead when the guest programs a timer device the model of the timer device sets up an internal QEMU timer to fire after the appropriate duration (and the handler for that then raises the interrupt line or does whatever is necessary for emulating the hardware behaviour). The duration is calculated from the values the guest has written to the device registers together with a value for what the clock frequency should be.
QEMU doesn't have any infrastructure for handling things like programmable clock dividers or a "clock tree" that routes clock signals around the SoC (one could be added, but nobody has got around to it yet). Instead timer devices are usually either written with a hard-coded frequency, or may be written to have a QOM property that allows the frequency to be set by the board or SoC model code that creates them.
In particular for the SysTick device in the Cortex-M models the current implementation will program the QEMU timer it uses with durations corresponding to a frequency of:
1MHz, if the guest has set the CLKSOURCE bit to 1 (processor clock)
something which the board model has configured via the 'system_clock_scale' global variable (eg 25MHz for the mps2 boards), if the guest has set CLKSOURCE to 0 (external reference clock)
(The system_clock_scale global should be set to NANOSECONDS_PER_SECOND / clk_frq_in_hz.)
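As a rough illustration of the arithmetic described above (this is not QEMU's actual code), the sketch below derives a nanoseconds-per-tick scale from a clock frequency and turns a guest-programmed reload value into a timer duration. The function names are hypothetical; only NANOSECONDS_PER_SECOND and the division come from the text above.

```c
#include <assert.h>
#include <stdint.h>

#define NANOSECONDS_PER_SECOND 1000000000LL

/* Hypothetical helper: nanoseconds per clock tick, following
 * system_clock_scale = NANOSECONDS_PER_SECOND / clk_frq_in_hz.
 * Integer division truncates, so frequencies that do not divide
 * 1e9 evenly (for example 168 MHz) lose precision. */
static int64_t clock_scale_from_hz(int64_t clk_frq_in_hz)
{
    assert(clk_frq_in_hz > 0);
    return NANOSECONDS_PER_SECOND / clk_frq_in_hz;
}

/* Hypothetical helper: time until a down-counter loaded with `reload`
 * expires. It counts reload, ..., 1, 0, which is reload + 1 ticks. */
static int64_t timer_duration_ns(uint32_t reload, int64_t scale_ns)
{
    return ((int64_t)reload + 1) * scale_ns;
}
```

For a 25MHz clock the scale is 40 ns per tick, so a reload value of 24999 gives a 1 ms period. At 168MHz the integer scale truncates to 5 ns, which shows how coarse a single integer scale can be.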
The 1MHz is just a silly hardcoded value that nobody has yet bothered to improve upon, because we haven't run into guest code that cares yet. The system_clock_scale global is clunky but works.
None of this affects the speed of the emulated QEMU CPU (ie how many instructions it executes in a given time period). By default QEMU CPUs will run "as fast as possible". You can use the -icount option to specify that you want the CPU to run at a particular rate relative to real time, which sort of implicitly sets the 'cpu frequency', but this will only sort of roughly set an average -- some instructions will run much faster than others, in a not very predictable way. In general QEMU's philosophy is "run guest code as fast as we can", and we don't make any attempt at anything approaching cycle-accurate or otherwise tightly timed emulation.
Update as of 2020: QEMU now has some API and infrastructure for modelling clock trees, which is documented in docs/devel/clocks.rst in the source tree. This is basically a formalized version of the concepts described above, to make it easier for one device to tell another "my clock rate is 20MHz now" without hacks like the "system_clock_scale" global variable or ad-hoc QOM properties.
SysTick is supplied via a multiplexer: you can choose either the AHB bus clock (HCLK) or the AHB clock divided by 8.
This is an old thread, but the question is asked often, so this should help some of you trying to emulate Cortex systems.
If you boot with a .dtb, you can add a line clock-frequency = <value>; to the 'timers' block of your .dts and recompile it. This will indeed increase the speed of Cortex processors. Clearly, value is some large number.
I am working with an EFM microcontroller (Silicon Labs).
I need to make a beep every x seconds while the device is in EM3 mode.
I have tried many approaches without success.
Please try to help me with a code example (I'm a HW man, not a SW man, haha).
Thanks,
Gal.
Refer to page 8 on the datasheet (thanks to #user694733)
EM3 mode description:
still full CPU and RAM retention, as well as Power-on Reset, Pin reset and Brown-out Detection, with a consumption of only 0.6 μA. The low-power ACMP, asynchronous external interrupt, PCNT, and I2C can wake-up the device.
So these are your options. One of these things can wake up the microcontroller. All of them are external inputs. So the microcontroller cannot wake itself up in this mode. This makes sense because all clocks are stopped.
If you had an outside clock connected to the PCNT you could use that to wake it up.
If you want the microcontroller to wake itself up, then you need EM2 or a lower-numbered energy mode:
In EM2 the high frequency oscillator is turned off, but with the 32.768 kHz
oscillator running, selected low energy peripherals (LCD, RTC, LETIMER,
PCNT, LEUART, I2C, WDOG and ACMP) are still available
In EM2 mode the microcontroller may wake itself up using the RTC (real-time clock), LETIMER (low-energy timer), WDOG (watchdog timer) or PCNT (pulse counter, which can be set to count pulses of the 32.768kHz clock).
The datasheet recommends using the Real-Time Clock or Low Energy Timer (RTC or LETIMER) modules.
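To make the "beep every x seconds" part concrete, here is a minimal sketch of the tick arithmetic for a timer clocked from the 32.768 kHz oscillator. The helper name and the `counter_bits` parameter are hypothetical; the counter width and available prescalers of the RTC or LETIMER must be taken from the reference manual for your part.

```c
#include <assert.h>
#include <stdint.h>

#define LFCLK_HZ 32768u  /* 32.768 kHz low-frequency oscillator */

/* Hypothetical helper: number of counter ticks for `seconds` when the
 * low-frequency clock is divided by `prescaler`. Returns 0 if the result
 * is zero or does not fit a counter that is `counter_bits` wide. */
static uint32_t wakeup_ticks(uint32_t seconds, uint32_t prescaler,
                             unsigned counter_bits)
{
    assert(prescaler != 0 && counter_bits >= 1 && counter_bits <= 32);
    uint64_t ticks = (uint64_t)seconds * LFCLK_HZ / prescaler;
    uint64_t max = (counter_bits == 32) ? 0xFFFFFFFFull
                                        : ((1ull << counter_bits) - 1);
    return (ticks >= 1 && ticks <= max) ? (uint32_t)ticks : 0;
}
```

For example, 5 seconds needs 163840 ticks with no prescaling, which does not fit a 16-bit counter, but dividing the clock by 32 brings it down to 5120 ticks.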
... however, if we pay attention, we see the datasheet mentions something called the ULFRCO, Ultra-Low-Frequency RC Oscillator, which runs at approximately 1000 Hz. By searching for the keyword ULFRCO, we see that it does still run in EM3 mode, and it can be used as input for the WDOG. On page 89 we see this listed as a feature of EM3 mode.
So, you may configure the WDOG to reset the system after a few seconds. When the microcontroller resets due to watchdog timeout, it wakes up. You should not be afraid of using a system reset. The RMU_RSTCAUSE allows you to see that the system was reset because of the watchdog timer (not because it was first turned on or the reset pin was used). Memory contents are probably still there, but all peripherals are reset. As long as you can deal with peripherals being reset, you can probably make this work. You might even be able to use a little bit of assembly programming to jump back to exactly the point where the program left off.
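As a sketch of how "a few seconds" might map to a watchdog setting: the code below assumes the watchdog timeout is (2^(3+PERSEL) + 1) watchdog clock cycles with PERSEL in the range 0 to 15. That relationship is my assumption about the EFM32 WDOG and should be checked against the reference manual for your part; the helper name is hypothetical, and the ULFRCO is only approximately 1000 Hz, so the real timeout will drift.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: smallest PERSEL whose timeout is at least
 * `wanted_ms`, ASSUMING timeout = 2^(3+PERSEL) + 1 watchdog clock cycles.
 * Returns -1 if no setting in 0..15 is long enough. */
static int wdog_persel_for_ms(uint32_t wanted_ms, uint32_t wdog_clk_hz)
{
    assert(wdog_clk_hz != 0);
    for (int persel = 0; persel <= 15; persel++) {
        uint64_t cycles = (1ull << (3 + persel)) + 1;
        uint64_t timeout_ms = cycles * 1000u / wdog_clk_hz;
        if (timeout_ms >= wanted_ms) {
            return persel;
        }
    }
    return -1;
}
```

Under that assumption and a 1000 Hz clock, a 4 second wake-up interval maps to PERSEL 9 (4097 cycles, roughly 4.1 s).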
We have an application which runs on a PIC24H, and we would like to port it to another MCU, preferably an ARM Cortex. The application is extremely time critical, meaning that we need extremely deterministic code behaviour. In short, pulses arrive at GPIO pins via special hardware and the data is analyzed right away. Processing the data is not complex (we don't need a beefy CPU/MCU to do it). After the data is analyzed, the GPIO output pins are written with their new values.
App in 3 short lines:
process input pins
determine pattern within processing of input pins
based on the received pattern write output pins
The PIC24H is running at 40MHz and we can toggle a pin in 25ns; we would be grateful for at least 2x that speed for future upgrades. So an MCU which can run deterministic code and toggle pins at 80MHz or more (12.5ns) would be just fine. We don't need to toggle the pins at a constant fast rate; we need an MCU which can toggle a pin in less than 25ns. We can't waste cycles while toggling: if one cycle is off, we lose synchronization. Everything must be done with one-cycle precision (or two cycles, but always exactly two), so the code should be 100% deterministic.
Please let me know if I'm missing something or if what we need can be done using some other method on Cortex-M. Just keep in mind that if one cycle is lost (due to a cache or similar) we lose signal sync and the app will not do its work right, or at all.
Thanks!
Br
According to this blog post, the interrupt latency for Cortex-M ranges from 12 to 16 cycles (assuming you are not using FPU registers) with best-case memories. M0 and M0+ are slower than M3/M4/M7. On top of this, you need to add the GPIO access times (and watch out for different clock frequencies between the core and the peripherals). Cortex-M7 will support higher clock speeds than M3/M4.
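To put those cycle counts next to the timing budget from the question, here is a small sketch that converts cycles to time at a given core clock. The helper names are hypothetical, and the figures only cover interrupt entry, not the GPIO access or the pattern matching.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a cycle count to picoseconds at the given core clock.
 * Limited to 18 million cycles so the 64-bit product cannot overflow. */
static uint64_t cycles_to_ps(uint32_t cycles, uint32_t core_hz)
{
    assert(core_hz != 0 && cycles <= 18000000u);
    return (uint64_t)cycles * 1000000000000ull / core_hz;
}

/* Nonzero if `cycles` at `core_hz` fits within `budget_ps`. */
static int fits_budget(uint32_t cycles, uint32_t core_hz, uint64_t budget_ps)
{
    return cycles_to_ps(cycles, core_hz) <= budget_ps;
}
```

At 80MHz, 12 cycles is 150 ns and 16 cycles is 200 ns, so interrupt entry alone is well over the 25 ns budget; only about two core cycles fit in 25 ns at that clock.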
It still isn't clear how many cycles are consumed in recognising a pattern, and how an interrupt is useful in doing this - generally a low latency interface function like this would be an obvious target for dedicated hardware, but since you have an existing software solution it seems the problem is mis-specified.
Providing you avoid accessing any 'slow' peripherals which might stall the bus, the interrupt latency should be deterministic - any specific device should have documentation which covers this.
NXP have an application note which describes some of the detail of how to measure what is going on.
I'm pretty new to ARM and am trying to get timing results for functions written in C for a Cortex-M4 processor. Would any of you be able to tell me what steps I need to take to get timing results?
I've been running my code in Keil uVision, but I'm unable to use the program's Performance Analyzer during a real-environment debug. From what I've read, it seems that the Performance Analyzer only works outside of simulated debug sessions if one is using a proprietary connector from Keil.
Set a pin high at the start of the function you wish to time, set it low at the end, and measure the pulse width with an oscilloscope.
Depending on which Cortex-M4 you're using, there may be a cycle count register, DWT->CYCCNT, but whether it is included is up to the vendor. Details can be found in the Cortex-M4 Technical Reference Manual. Your processor's datasheet, reference manual and programming manual should provide more information if required.
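A minimal sketch of the cycle-counter approach follows. The two helpers are plain C and hypothetical names of mine; the register accesses in the trailing comment use the CMSIS core header names and only apply if your part implements the counter.

```c
#include <assert.h>
#include <stdint.h>

/* Elapsed cycles between two reads of a free-running 32-bit up-counter
 * such as DWT->CYCCNT. Unsigned subtraction copes with one wrap-around. */
static uint32_t cycles_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Convert cycles to whole microseconds at the given core clock. */
static uint32_t cycles_to_us(uint32_t cycles, uint32_t core_hz)
{
    assert(core_hz != 0);
    return (uint32_t)((uint64_t)cycles * 1000000u / core_hz);
}

/* On a target with the CMSIS headers, usage would look roughly like this:
 *
 *   CoreDebug->DEMCR |= CoreDebug_DEMCR_TRCENA_Msk;  // enable the DWT unit
 *   DWT->CYCCNT = 0;
 *   DWT->CTRL |= DWT_CTRL_CYCCNTENA_Msk;             // start counting
 *   uint32_t t0 = DWT->CYCCNT;
 *   function_under_test();                           // hypothetical
 *   uint32_t dt = cycles_elapsed(t0, DWT->CYCCNT);
 */
```

The subtraction stays correct across a single counter overflow, so the counter does not have to be reset before every measurement.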
Alternatively, if you have a fast timer, such as the SysTick running from the processor clock, you could initialise the count to 0x00FFFFFF, start it counting down at the beginning of your function and stop it at the end. You can then work out the time taken as (0x00FFFFFF - SysTick->CVR) * (1 / SysTick frequency), where CVR is the current value register (named SysTick->VAL in the CMSIS headers).
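The SysTick formula can be sketched as below. The helper names are hypothetical, and the current counter value is passed in as a plain number so that the arithmetic can be checked off-target; on hardware it would be read from the SysTick current value register.

```c
#include <assert.h>
#include <stdint.h>

#define SYSTICK_MAX 0x00FFFFFFu  /* SysTick is a 24-bit down-counter */

/* Ticks elapsed since the counter was started from SYSTICK_MAX. */
static uint32_t systick_elapsed_ticks(uint32_t current_value)
{
    assert(current_value <= SYSTICK_MAX);
    return SYSTICK_MAX - current_value;
}

/* Elapsed time in nanoseconds: ticks * (1 / SysTick frequency). */
static uint64_t systick_elapsed_ns(uint32_t current_value, uint32_t systick_hz)
{
    assert(systick_hz != 0);
    return (uint64_t)systick_elapsed_ticks(current_value)
           * 1000000000ull / systick_hz;
}
```

This is only valid while the function under test finishes before the 24-bit counter reaches zero, which is a little under 100 ms at 168MHz.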
I'm trying to put a cortex m4 processor to sleep for a little less than a second. I want to be able to tell it to sleep, then a second later, or when a button is pressed, pick up right where I left off. I've looked in the reference manual and VLPS mode looks like it would fit my needs. I don't know how to begin to enter that mode or how to program the NVIC.
More Info:
I am doing this in C, on the bare metal.
You can download and inspect the code that implements this demo. Although the demo is for an RTOS the code used to place the CPU into a sleep mode is the same whether an RTOS is being used or the application is running on bare metal.
There are generic things you can do to place a Cortex-M3 core into a low power state (see the WFI instruction). To get extreme low power, you have to do chip-specific things as well. The above linked code performs some chip-specific pre-sleep processing (turn off peripherals, set the chip's own sleep mode, etc.) before calling WFI, then does some chip-specific things when it returns from the WFI instruction.
You don't need an RTOS in order to wake a Cortex-M4 from sleep; what you need is an interrupt (ISR). Refer to the manufacturer's manual: you may wake up with a timer (ISR) or a button (GPIO), depending on the sleep and hibernation modes of your particular chip. Here is an ARM document that goes into more depth about it.
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0553a/BABGGICD.html
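A bare-metal sketch of the generic part (sleep with WFI until an interrupt handler sets a flag) might look like the following. All names are hypothetical. The chip-specific steps for a mode such as VLPS, and the timer or GPIO interrupt setup, are not shown and must come from the manufacturer's manual. The non-ARM branch is only a stand-in so the logic can be exercised on a PC.

```c
#include <assert.h>
#include <stdbool.h>

/* Set by an interrupt handler (timer or button) to request wake-up. */
static volatile bool wake_flag;

#if defined(__arm__) && defined(__thumb__)
/* Cortex-M: WFI still wakes on a pending interrupt while interrupts are
 * masked; the handler then runs as soon as they are unmasked again. */
#define IRQ_DISABLE()  __asm volatile ("cpsid i" ::: "memory")
#define IRQ_ENABLE()   __asm volatile ("cpsie i" ::: "memory")
#define WAIT_FOR_IRQ() __asm volatile ("wfi" ::: "memory")
#else
/* Host stand-in: "sleeping" simulates the wake-up interrupt firing. */
static void host_wait_for_irq(void)
{
    assert(!wake_flag);  /* never go to sleep with a wake-up pending */
    wake_flag = true;
}
#define IRQ_DISABLE()  ((void)0)
#define IRQ_ENABLE()   ((void)0)
#define WAIT_FOR_IRQ() host_wait_for_irq()
#endif

/* Hypothetical ISR body: call this from the timer or GPIO handler. */
static void wake_isr(void)
{
    wake_flag = true;
}

/* Sleep until wake_isr() has run; returns how many times the core slept. */
static unsigned sleep_until_woken(void)
{
    unsigned sleeps = 0;
    for (;;) {
        IRQ_DISABLE();          /* close the check-then-sleep race */
        if (wake_flag) {
            wake_flag = false;
            IRQ_ENABLE();
            return sleeps;
        }
        WAIT_FOR_IRQ();
        sleeps++;
        IRQ_ENABLE();           /* a pending handler runs here */
    }
}
```

Execution continues right after the call to sleep_until_woken(), which matches the "pick up right where I left off" requirement. Checking the flag with interrupts masked avoids sleeping through a wake-up that arrived just before the WFI.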
I need to create a driver for a flash memory chip connected to an STM32 Cortex-M3 MCU. The chip is controlled via an SPI bus. I intended to use the integrated SPI peripheral of the MCU, but unfortunately it only supports 8- or 16-bit data packets, while the flash chip commands are 14 bits long. Thus, I have to implement the protocol from scratch using GPIOs. My question is: what is the right way to ensure correct timings of the signals? I currently think of inserting delays between asserting and deasserting GPIO lines with interrupts disabled, but it seems fairly unreliable to me. Are there any better methods?
Jeb's answer is the preferred method and you should use the hardware SPI if possible, and if DMA is an option that is nice as well.
If you for some reason find out that you cannot use the hardware SPI, but that you must implement it using "bit-banging" over GPIO, you should check what options there are available in the timer/PWM hardware on the MCU. You cannot and should not use blunt "hobbyist burn-away delays" as in the link you posted, the real-time performance will be crap and you will occupy the CPU 100%.
Most MCU timers come with a pin output feature, that would allow a pin to change state when the timer elapses. The pseudo code would then be:
Determine if the next bit to send is 1 or 0.
Set the MCU polarity register accordingly, so that it will switch the pin to a high or low level.
When the timer elapses, you need to set the polarity once again, likely through an interrupt. How to do this is very hardware-dependent.
At the same time as you bit-bang the data (MOSI), you also need to generate the clock and chip select. The clock can be generated in the same way as the data, or possibly through a PWM signal if that option is available. Chip select is the easiest part as you only need to pull a pin low during the data transmission.
Finally, there is most likely some application note or official example on how to write a software SPI for your particular MCU.
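The timer-driven approach above can be sketched as a small state machine that is advanced once per timer event (every half bit period). This is only an illustration with hypothetical names: it assumes SPI mode 0 (data set up while the clock is low, sampled on the rising edge), MSB-first transfer and an active-low chip select. On real hardware the handler would copy the `sck`, `mosi` and `cs_n` fields to the pins or to the timer output configuration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint16_t word;       /* value being sent, right-aligned */
    uint8_t  bits_left;  /* bits not yet clocked out */
    bool     sck, mosi, cs_n, busy;
} soft_spi_t;

/* Next bit to present, most significant first. */
static bool next_bit(const soft_spi_t *s)
{
    return ((s->word >> (s->bits_left - 1)) & 1u) != 0;
}

/* Begin a frame: select the chip and present the first bit. */
static void soft_spi_start(soft_spi_t *s, uint16_t word, uint8_t nbits)
{
    assert(nbits >= 1 && nbits <= 16);
    s->word = word;
    s->bits_left = nbits;
    s->sck = false;
    s->cs_n = false;
    s->busy = true;
    s->mosi = next_bit(s);
}

/* Call once per half bit period; returns true while the frame is active. */
static bool soft_spi_tick(soft_spi_t *s)
{
    if (!s->busy) {
        return false;
    }
    if (!s->sck) {            /* rising edge: the slave samples MOSI */
        s->sck = true;
        return true;
    }
    s->sck = false;           /* falling edge: move on to the next bit */
    s->bits_left--;
    if (s->bits_left == 0) {  /* frame complete: deselect the chip */
        s->cs_n = true;
        s->busy = false;
        return false;
    }
    s->mosi = next_bit(s);
    return true;
}
```

A 14-bit command takes 28 ticks, and the CPU is free between ticks, which is the point of using a timer rather than busy-wait delays.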
I would recommend using the built-in SPI and DMA if possible!
You could repack your data into an array of bytes whose total length in bits is a multiple of 14.
That means sending a multiple of 4 commands each time: 4 * 14 bits = 56 bits = 7 bytes.
Then you can use the standard SPI with 8-bit size.
This should be much faster with SPI/DMA than bit-banging the GPIOs.
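A minimal sketch of that repacking, assuming the commands are sent MSB-first and that the flash chip accepts several commands back to back in one continuous bit stream (an assumption to check against its datasheet). The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pack `count` 14-bit words MSB-first into `out` as one continuous bit
 * stream. `count` must be a multiple of 4 so the stream ends on a byte
 * boundary (4 * 14 bits = 56 bits = 7 bytes). Returns the bytes written. */
static size_t pack14(const uint16_t *words, size_t count, uint8_t *out)
{
    assert(count % 4 == 0);
    size_t nbytes = count * 14 / 8;
    size_t bitpos = 0;

    memset(out, 0, nbytes);
    for (size_t i = 0; i < count; i++) {
        for (int b = 13; b >= 0; b--, bitpos++) {
            if ((words[i] >> b) & 1u) {
                out[bitpos / 8] |= (uint8_t)(0x80u >> (bitpos % 8));
            }
        }
    }
    return nbytes;
}
```

The resulting byte array can then be handed to the hardware SPI in 8-bit mode, ideally through DMA.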
Some devices that use obscure data lengths are designed so that at the start of a transaction they will either ignore all "0" bits that are clocked in before the first "1", or all "1" bits that are clocked in before the first "0". If your device happens to be designed in such a fashion, you may be able to use 8- or 16-bit SPI mode by clocking out two "junk" bits along with the bits of interest.
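If the device does behave that way, the padding could be sketched as follows. The names are hypothetical, and which padding (if any) a given chip tolerates has to be confirmed in its datasheet.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { PAD_WITH_ZEROS, PAD_WITH_ONES } pad_kind_t;

/* Put a 14-bit command into a 16-bit SPI frame with two leading junk bits. */
static uint16_t pad14_to_16(uint16_t cmd14, pad_kind_t pad)
{
    assert(cmd14 <= 0x3FFFu);
    if (pad == PAD_WITH_ZEROS) {
        /* Device skips 0s until the first 1: the command must start with 1. */
        assert((cmd14 & 0x2000u) != 0);
        return cmd14;
    }
    /* Device skips 1s until the first 0: the command must start with 0. */
    assert((cmd14 & 0x2000u) == 0);
    return (uint16_t)(0xC000u | cmd14);
}
```

The result can be sent as a single 16-bit transfer with the hardware SPI peripheral.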