I keep reading about the ART accelerator on the STM32F407VGT6 and zero wait states at 168MHz and it is mentioned briefly in the data sheet but I can't find anywhere how to enable it.
There is a prefetch option in the FLASH_ACR register but enabling it causes my board to stop
working. Not sure if this is the same thing. Is ART even something that needs to be enabled or is it enabled by default? I notice in the errata for this chip that the ART was not working in A version but has been "fixed" in the current Z rev.
Thanks
This link has the method of configuring the ART for stm32F2 devices. The same registers are found in the STM32f4 discovery series. Check for the word ART configuration in the pdf. I'm having the same problem and this is the best answer I could find...
I read in that document: STMicroelectronics: AN3430 - Application Note of How to achieve the lowest current consumption with STM32F2xx
On page 17 (3.1.1) they say that enabling the buffer prefetch will consume additional 20mA (for 128-bit-line fetching).
Maybe at 168 MHz you power supply gets unstable??
Maybe you can try it with lower clock speed (120MHz), where the cpu is very power efficient.
Related
I am designing the controller and data acquisition unit for a rocket engine test stand. This system needs to control a number of actuators on the test stand and also be able to transmit collected data back to the host computer where the team will be watching live data/camera feeds from safety.
The overall design requirements are as follows:
Acquire data from ~15 analog sensors at 1KHz
Control the actuators on the test stand including valves and ignition switches
Transmit data back to the host computer in our shelter in real time
Accept control from the host computer for things like manual valve actuation, test sequence modification, sequence abortion, etc.
I am not exactly sure where to begin when laying out the software for this system. I am considering using an STM32 ARM Cortex-M4 processor running at 180 MHz. I am having trouble figuring how I should approach the problem. I have considered using an RTOS system but based on what I have seen those generate large overheads as you run them faster as the scheduler has to run each tick. The other idea I'm bouncing around is a state machine combined with some timer-based interrupts for reading and then sending data back out to the PC. Any advice as to how to approach this problem to minimize code complexity would be greatly appreciated. Thanks.
EDIT:
I have been told to clarify a number of things concerning the technical specs of the system.
My actuators consist of:
6 solenoids (controlled digitally through relays/MOSFET, and switched around once a second)
2 DC motors (driven with PWM outputs in a PID loop, need to be able to ramp position controllably)
One igniter, again controlled through a relay/MOSFET
My sensors consist of:
8 pressure transducers (analog voltages)
4 thermocouples (analog voltages)
2 motor encoders (quadrature encoders)
1 light sensor (analog voltage)
1 Load cell (analog voltage)
Ideally all of the collected data (all of the above sensors) plus some additional data (timestamps, motor set positions, solenoid positions) is streamed back to the host computer at in real time.
Given the motor control with PWM & PID, you need to specify a desired resolution, either in PWM timer ticks or ADC reads. This is the most critical part. It doesn't hurt if the ADC has greater resolution than your specified resolution either. The PCB has to be designed accordingly, with sufficient resolution on resistors etc.
After you've done this, find MCU with sufficiently accurate ADC. I would imagine that 12 bit resolution is enough for most applications, but I don't know your specific case.
Next, you need to decide how fast you want the PID to be. Should an output on the PWM result in a read on the ADC in the next cycle, or could you settle for slower response? The realtime bottleneck here will be the ADC conversion clock, not the CPU.
The rest of the system doesn't seem time critical at all - you just have to ensure that everything is read/set synchronously. The data transmission to/from the host should preferably be done over CAN since it comes with hard real-time characteristics. Doesn't seem that you need a whole lot of bandwidth.
I have designed systems very similar to this using bare metal 16 bit MCUs running on 16MHz. Processing speed is really not a big concern, but meeting real-time deadlines is. That means you can forget about using Linux toys like Rasp PI, it's completely out of the question. And a RTOS is likely overkill since it mostly adds additional complexity.
A bare metal Cortex M with sufficient ADC resolution and CAN seems like a good choice. If you can stay away from floating point, that's nice too - depends on how advanced math you need. If you need nothing more advanced than PID, it can be implemented with fixed point just fine. (Or PI rather, since that usually works best for fast motor control systems.)
We have an application which runs on PIC24H, we would like to port it to another MCU, preferably ARM Cortex. Application is extremely time critical, meaning that we need extremely deterministic code behaviour. In short, there are pulses which are obtained via special hardware to GPIO pins, data is analyzed right away. Processing of data is not complex(we don't need a beefy cpu/mcu to do it). After analyzing the data GPIO output pins are written to their values.
App in 3 short lines:
process input pins
determine pattern within processing of input pins
based on the received pattern write output pins
PIC24H is working at 40MHz, we can toggle the pin in 25ns, we would be grateful with at least 2x speed for future upgrades. So MCU which can run deterministic code and toggle pins with at least 80MHz (12.5ns) would be just fine. We don't need toggling of the pins at constant fast rate, we need a mcu which can toggle it in less than 25ns. We can't waste cycles while toggling, if one cycle is off we loose synchronization. Everything must be done in one cycle precision(or two but constant two cycles), so code should be 100% deterministic.
Please let me know if I'm missing something or if what we need can be done using some other methods on Cortex-M. Just keep in mind that if one cycle is lost(due cache or similar) we loose signal sync and app will not do it's work right or at all.
Thanks!
Br
According to this blog post, the interrupt latency for Cortex-M ranges from 12 to 16 cycles (assuming you are not using FPU registers) with best-case memories. M0 and M0+ are slower than M3/M4/M7. On top of this, you need to add the GPIO access times (and watch out for different clock frequencies between the core and the peripherals. Cortex-M7 will suppport higher clock speeds than M3/M4.
It still isn't clear how many cycles are consumed in recognising a pattern, and how an interrupt is useful in doing this - generally a low latency interface function like this would be an obvious target for dedicated hardware, but since you have an existing software solution it seems the problem is mis-specified.
Providing you avoid accessing any 'slow' peripherals which might stall the bus, the interrupt latency should be deterministic - any specific device should have documentation which covers this.
NXP have an application note which describes some of the detail of how to measure what is going on.
I'm pretty new to ARM and am trying to get timing results for functions written in C for a Cortex-M4 processor. Would any of you be able to tell me what steps I need to take to get timing results?
I've been running my code on Keil uVision, but I'm unable to use the program's Performance Analyzer during a real-environment debug. From what I've read it seems that the Performance Analyzer only works outside of simulated debug sessions if one is using proprietary connector from Keil.
Set a pin high at the start of the function you wish to time, set it low at the end, and measure the pulse width with an oscilloscope.
Dending on which Cortex M4 you're using there may be a cycle count register DWT->CYCCNT, but the inclusion of such is vendor defined. Details can be found in the Cortex M4 Technical Reference Manual. Your processors datasheet, reference manual and programming manual should provide more information if required.
Alternatively, if you have a fast timer, such as the SysTick running from the processor clock, you could initialise the count to 0x00FFFFFF, start it downcounting at the beginning of your function and stop it at the end, you can then work out the time taken as (0x00FFFFFF - SysTick->CVR) * (1 / SysTick Frequency) .
I built the code from the STM32CubeF4 for the USB CDC example. I added the missing receive code for CDC_Receive_FS() in usbd_cdc_if.c.
I loaded this into my STM32F4 Discovery and it works. A character typed on Tera Term returns and is displayed on Tera Term.
I am hoping that someone here, could give me some knowledge about how this USB CDC firmware works, specifically, is this being driven by an interrupt that is generated when there is a level shift in voltage on the USB -D and +D pins, or is there an infinite while loop that was launched somewhere, and it's just polling waiting for some data to appear?
What prompted my question is that I see that one can blink the LEDs on this board by toggling the state of the GPIO pins within an infinite while loop in main.c. However, there is nothing within this while loop at all within main.c for USB. So how does this USB CDC firmware get and send a character from/to Tera Term.
I will take the 2 minutes to answer you instead of lecturing you. Receive is done through interrupts. Very, very simply, the hardware sees the voltage change on the D+/D- and flags an interrupt based on the intialization functions. The interrupt calls HAL_PCD_IRQHandler, which calls USBD_LL_DataInStage in the usbd_conf.c file. That ends up calling the function USBD_CDC_DataIn in the usbd_cdc.c file. There is your starting point, but it is not simple. To do what you want you might have to stop the output to UART and just handle it in the main loop.
This question is to broad for this forum and not an actual question for a specific problem. However, as some hints, you might
Read the USB-specs, at least some basic overview (just start at wikipedia). USB does not work by toogling a GPIO in software (see next point)
Read the STM32F4xx reference manual. This is quite comprehensive.
Read the source code of the demo. This should answer all questions.
To track execution paths, you should remember that C always starts with the main() function, so this is a good start to see what's going on. (disclaimer: I know pretty well, it starts with startup, but this might confuse a beginner even more).
If you want to work with USB, you will have to do this all anyway, so you might start with it as well right now. Yes, this will take some time; no surprise, engineers have learned all this for years before they start with larger projects.
All information is available legal and for free on the web.
And, yes, USB is most likely interrupt-driven and might also use DMA to transfer data.
I work in Code Composer Studio Version: 6.0.1.00040 with the card LCDK C6748.
In this card there is LINE_OUT for sampling out audio into speakers.
My question arises, because I encountered some phenomena that look like I reached a limit value when I assigened a value to LINE_OUT:
codec_data.channel[LEFT]= (uint16_t)outputLeft_referenceSignal;
// this union is where I have to "place" the audio sample I create,
// but I suspect outputLeft_referenceSignal exceed the limit value
When it happens it sounds, like a cracked "PACK" in the speakers and then the expected audio signal is not played
The T.I. has complete code examples on how to handle each of the built-in peripherals of the C6847 DSP.
I strongly suggest you start searching/reading the T.I. web site for info on the C6748 DSP
amongst other things, like initializing the DSP, you need to understand the usage of the McASP and the AIC31 peripherals.
It is not a simple write to a I/O address.
If you have setup the above peripherals, please post the relevant code so we can determine the underlying problem.