I am currently developing an embedded application on the Atmel SAML21J microcontroller, which has 256 KB of Flash memory and 40 KB of SRAM. When I program my app on the MCU, I get the following message:
Program Memory Usage 66428 bytes 24,6 % Full
Data Memory Usage 29112 bytes 71,1 % Full
It seems to mean that even before I start to run my code, I already have a 71% full RAM.
I would like to know the following things:
What is defined in the RAM, and what is defined in the Flash?
Can I do something to use more of my Flash (which is only 24% full) to save space in the SRAM, and how?
I saw a ".ld" file that specifies the size of my stack: will it leave me more space in the RAM if I make it larger?
In this .ld file, is the memory (Flash + SRAM) considered as one unique memory entity (meaning that the addresses of the SRAM start at the end of the Flash, for example)?
Even though I have read a lot on this subject, it is still unclear to me, and I would really appreciate it if you could enlighten me on that.
Thanks.
What is placed (defined) where:
The stack (which holds local variables), all non-const global variables, and functions marked with a special keyword to run from RAM (for example __ramfunc for IAR) are placed in RAM.
All functions (regardless of where they will run), all constants, and the initialization values of variables are placed in Flash. It is worth mentioning that on AVR you need the PROGMEM keyword to place a constant in Flash (functions don't need it), while on ARM the const keyword is enough.
To save RAM space you can (in order of effectiveness):
place big tables and text constants (debug messages too) in Flash
merge global buffers (with unions) and use them for different tasks at different times
reduce the size of the stack; this risks stack overflow, so you must also reduce function nesting
use bitmasks for global flags instead of bytes
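A minimal sketch of three of the techniques listed above, in C. All names here (sine_table, shared_buf_t, FLAG_*) are hypothetical, and whether a const object really ends up in Flash depends on the toolchain and linker script (with GCC on ARM it normally goes to .rodata).

```c
#include <assert.h>
#include <stdint.h>

/* const data: on ARM the linker normally places this in Flash (.rodata). */
static const uint8_t sine_table[8] = { 0, 49, 90, 117, 127, 117, 90, 49 };

/* Two buffers that are never needed at the same time share one RAM block. */
typedef union {
    uint8_t rx_frame[256];   /* used while receiving */
    uint8_t log_line[128];   /* used while formatting a log message */
} shared_buf_t;

/* The union costs only as much RAM as its largest member. */
static_assert(sizeof(shared_buf_t) == 256, "union should not exceed 256 bytes");

static shared_buf_t shared_buf;

/* One byte holds up to eight flags instead of eight separate bytes. */
enum {
    FLAG_RX_READY = 1u << 0,
    FLAG_TX_BUSY  = 1u << 1,
    FLAG_ERROR    = 1u << 2
};

static uint8_t flags;

void flag_set(uint8_t mask)   { flags |= mask; }
void flag_clear(uint8_t mask) { flags &= (uint8_t)~mask; }
int  flag_test(uint8_t mask)  { return (flags & mask) != 0; }

/* Accessors so the table and the shared buffer are actually referenced. */
uint8_t  sine_at(unsigned i)  { return sine_table[i % 8u]; }
uint8_t *rx_buffer(void)      { return shared_buf.rx_frame; }
```

The union only saves RAM if the two users really never overlap in time, so this is a design decision rather than a free optimization.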
If you increase the stack size: since the stack is placed in RAM, you increase RAM usage.
Flash and RAM have different address ranges, so from the .ld file
you can tell where the linker places each variable or function:
/* Memories definition in *.ld file */
MEMORY
{
RAM (xrw) : ORIGIN = 0x20000000, LENGTH = 128K
ROM (rx) : ORIGIN = 0x8000000, LENGTH = 1024K
}
/* Sections */
SECTIONS
{
/* The program code and other data into ROM memory */
.text :
{
...
} >ROM
}
Here we have:
128 KB of RAM, address range [0x20000000, 0x2001FFFF]
1 MB of Flash, address range [0x08000000, 0x080FFFFF]
and an example of how the .text section is placed in Flash memory.
Then, after the project compiles successfully, you can open the file ./[Release|Debug]/output.map to see where each function and variable is placed:
.text.main 0x08000500 0xa4 src/main.o
0x08000500 main
...
.data 0x20000024 0x124 src/main.o
0x20000024 io_buffer
Function main is placed in Flash memory, global variable io_buffer is placed in RAM memory.
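The address check can also be written down as a few lines of C. This is only a sketch using the ORIGIN and LENGTH values from the example MEMORY block above; the names region_last and region_of are hypothetical.

```c
#include <stdint.h>

/* Values copied from the example MEMORY block above. */
#define RAM_ORIGIN  0x20000000u
#define RAM_LENGTH  (128u * 1024u)
#define ROM_ORIGIN  0x08000000u
#define ROM_LENGTH  (1024u * 1024u)

typedef enum { REGION_NONE, REGION_ROM, REGION_RAM } region_t;

/* Last valid address of a region: ORIGIN + LENGTH - 1. */
uint32_t region_last(uint32_t origin, uint32_t length)
{
    return origin + length - 1u;
}

/* Tell which region an address taken from the map file belongs to. */
region_t region_of(uint32_t addr)
{
    if (addr >= ROM_ORIGIN && addr <= region_last(ROM_ORIGIN, ROM_LENGTH))
        return REGION_ROM;
    if (addr >= RAM_ORIGIN && addr <= region_last(RAM_ORIGIN, RAM_LENGTH))
        return REGION_RAM;
    return REGION_NONE;
}
```

With the map file excerpt above, region_of(0x08000500) gives REGION_ROM for main and region_of(0x20000024) gives REGION_RAM for io_buffer.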
I'm working on an embedded project with FreeRTOS, where I only use static memory allocation.
Looking at my linker script, I find that the following are taking up RAM space:
.data
.bss
._user_heap_stack
To my knowledge, ._user_heap_stack is used during the linking process to see if there is enough RAM space for the user-specified minimum MSP stack size. Here is a relevant snippet in my linker script:
/* User_heap_stack section, used to check that there is enough RAM left */
._user_heap_stack :
{
. = ALIGN(8);
PROVIDE ( end = . );
PROVIDE ( _end = . );
. = . + _Min_Heap_Size;
. = . + _Min_Stack_Size;
. = ALIGN(8);
} >RAM
I believe that MSP will always be initialized to point to the end of RAM regardless of _Min_Stack_Size, and decrement from there as data is pushed onto the stack. I see that my startup .S file configures sp as follows:
_estack = 0x20004000; /* end of RAM */
Reset_Handler:
ldr sp, =_estack /* Atollic update: set stack pointer */
As for FreeRTOS tasks, they each have stack space that is statically allocated, so it has nothing to do with _user_heap_stack I think?
My question is, with the RAM allocated to .data, .bss, and _user_heap_stack, I still have some unallocated RAM, so what happens to that RAM? Is it used by anything? Is it ever useful to reserve some free RAM (i.e. non-statically allocated RAM) or is it just wasted? Or perhaps it is just extra space for MSP to use if the main stack ever grows larger in size than what's specified in _Min_Stack_Size?
TL;DR - The remaining RAM is used by the stack.
Or perhaps it is just extra space for MSP to use if the main stack ever grows larger in size than what's specified in _Min_Stack_Size?
Yes, this seems correct. See the last paragraph for more; it is not just the stack that is bigger.
See: this part,
_estack = 0x20004000; /* end of RAM */
Reset_Handler:
ldr sp, =_estack /* Atollic update: set stack pointer */
So at least the BOOT sp will be at the end of RAM.
The part with . = . + _Min_Stack_Size; just makes sure you have a MINIMUM stack; otherwise a linker error occurs. Your stack is actually bigger, and it is used at least at boot. I know nothing about FreeRTOS, but I suspect (see Note 1) it is the system stack and you have user stacks. Each mode on the ARM has a separate stack. If FreeRTOS has any memory protection or privilege levels, then you will have multiple stacks, so one task crashing (due to stack overflow, etc.) won't crash the entire system: only that task's (see Note 2) stack is corrupted, and not the one that manages the entire system.
It is a common idiom to have the stack and heap together, with the heap growing up and the stack growing down. In this way, the MIN heap size and MIN stack size are imaginary. Eventually the two will collide when their combined size reaches the total size. But things may be okay if the stack goes into the heap's logical space, or the heap goes into the stack's logical space, AS long as that space is not in use by the other. By space, I mean the constants in your linker file and not the actual in-use values.
Note 1: It would kind of be insane to both waste memory and have your RTOS code using the same stack as all tasks. At least it would not be a robust OS.
Note 2: By task I mean a schedulable entity. Maybe a process, task, thread, fiber, etc.
Lately I've been studying the linker scripts used in auto-generated STM32 projects, and I'm a little bit confused about how the stack and heap memory segments are defined.
As an example, I've been looking at the files provided in ST's "CubeMX" firmware package for their F0 lines of chips, which have ARM Cortex-M0 cores. I'd paste a whole script if the files' licenses allowed it, but you can download the whole package from ST for free if you're curious. Anyways, here are the parts relevant to my question:
/* Highest address of the user mode stack */
_estack = 0x20001000; /* end of RAM */
/* Generate a link error if heap and stack don't fit into RAM */
_Min_Heap_Size = 0x200; /* required amount of heap */
_Min_Stack_Size = 0x400; /* required amount of stack */
<...>
SECTIONS {
<...>
.bss :
{
<...>
} >RAM
/* User_heap_stack section, used to check that there is enough RAM left */
._user_heap_stack :
{
. = ALIGN(8);
PROVIDE ( end = . );
PROVIDE ( _end = . );
. = . + _Min_Heap_Size;
. = . + _Min_Stack_Size;
. = ALIGN(8);
} >RAM
<...>
}
So here's my probably-incorrect understanding of the linker's behavior:
The '_estack' value is set to the end of RAM - this script is for an 'STM32F031K6' chip which has 4KB of RAM starting at 0x20000000. It is used in ST's example vector tables to define the starting stack pointer, so it seems like this is supposed to mark one end of the 'Stack' memory block.
The '_Min_Heap_Size' and '_Min_Stack_Size' values seem like they are supposed to define the minimum amount of space that should be dedicated to the stack and heap for the program to use. Programs that allocate a lot of dynamic memory may need more 'Heap' space, and programs that call deeply-nested functions may need more 'Stack' space.
My question is, how is this supposed to work? Are '_Min_x_Space' special labels, or are those names maybe slightly confusing? Because it looks like the linker script just appends memory segments of those exact sizes to the RAM without consideration for the program's actual usage.
Also, the space defined for the Stack does not appear to necessarily define a contiguous segment between its start and the '_estack' value defined above. If there is no other RAM used, nm shows that the '_user_heap_stack' section ends at 0x20000600, which leaves a bunch of empty RAM before '_estack'.
The only explanation I can think of is that the 'Heap' and 'Stack' segments might have no actual meaning, and are only defined as a compile-time safeguard so that the linker throws an error when there is significantly less dynamic memory available than expected. If that's the case, should I think of it as more of a minimum 'Combined Heap/Stack' size?
Or honestly, should I just drop the 'Heap' segment if my application won't use malloc or its ilk? It seems like good practice to avoid dynamic memory allocation in embedded systems when you can, anyways.
You ask where to place the stack and the heap. On a uC the answer is not as obvious as @a2f stated, for many reasons.
the stack
First of all, many ARM uCs have two stacks. One is called the Main Stack (MSP) and the second the Process Stack (PSP). Of course, you do not need to enable this option.
Another problem is that a Cortex uC may have many SRAM blocks (for example STM32F3, many F4, F7, H7). It is up to the developer to decide where to place the stack and the heap.
Where to place the stack?
I would suggest placing the MSP at the beginning of the chosen RAM. Why?
If the stack is placed at the end, you do not have any control over the stack usage. When the stack overflows, it may silently overwrite your variables, and the behavior of the program becomes unpredictable. That is not an issue for an LED blinker, but imagine a large machine controller or a car brake computer.
When you place the stack at the beginning of the RAM (by beginning I mean RAM start address + stack size), a hardware exception is generated as soon as the stack overflows, because the stack pointer then runs below the start of RAM into an address range that is not mapped. You are in full control of the uC: you can see what caused the problem (for example, a damaged sensor flooding the uC with data) and start the emergency routine (for example, stop the machine, put the car into service mode, etc.). The stack overflow will not go undetected.
the Heap.
Dynamic allocation has to be used with caution on uCs. The first problem is possible fragmentation of the available memory, as uCs have very limited resources. Use of dynamically allocated memory has to be considered very carefully; otherwise it can be a source of serious problems. Some time ago the USB HAL library used dynamic allocation in the interrupt routine: a fraction of a second was sometimes enough to fragment the heap so badly that no further allocation was possible.
Another problem is the wrong implementation of sbrk in most of the available toolchains. The only one I know of with a correct implementation is the BleedingEdge toolchain maintained by our colleague from this forum, @Freddie Chopin.
The problem is that these implementations assume that the heap and the stack grow towards each other and can eventually meet, which is of course wrong. Another problem is improper use and initialization of the static variables holding the addresses of the heap start and end.
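One way to avoid relying on the heap and stack meeting is to give the heap an explicit upper limit. The sketch below is host-runnable and uses an array as a stand-in for the heap region; on a real target the bounds would come from linker symbols instead. The names bounded_sbrk, heap_area and HEAP_SIZE are hypothetical, and this is not the BleedingEdge implementation, just an illustration of the idea.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a heap region; on a target these bounds would be linker symbols. */
#define HEAP_SIZE 512u
static uint8_t heap_area[HEAP_SIZE];
static uint8_t *heap_brk = heap_area;   /* current end of the heap */

/* Grow (or shrink) the heap by incr bytes. Returns the previous break, or NULL
   when the request would leave the region. A conventional sbrk returns
   (void *)-1 and sets errno to ENOMEM instead. */
void *bounded_sbrk(ptrdiff_t incr)
{
    ptrdiff_t used = heap_brk - heap_area;

    /* Reject growth past the fixed limit and shrinking below the start. */
    if (incr > (ptrdiff_t)HEAP_SIZE - used || incr < -used)
        return NULL;

    uint8_t *prev = heap_brk;
    heap_brk += incr;
    return prev;
}
```

Because the limit is fixed, an allocation fails cleanly when the heap is exhausted instead of depending on where the stack pointer happens to be at that moment.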
The '_estack' value is set to the end of RAM - this script is for an 'STM32F031K6' chip which has 4KB of RAM starting at 0x20000000. It is used in ST's example vector tables to define the starting stack pointer, so it seems like this is supposed to mark one end of the 'Stack' memory block.
As the stack here would grow downwards (from high to low addresses), it's actually the start of the stack memory region.
Are '_Min_x_Space' special labels, or are those names maybe slightly confusing?
The thing special about them is that symbols starting with an underscore followed by an uppercase letter are reserved for the implementation. e.g. min_stack_space could clash with user-defined symbols.
Because it looks like the linker script just appends memory segments of those exact sizes to the RAM without consideration for the program's actual usage.
That's the minimum size. Both the stack and the heap break may grow.
If there is no other RAM used, nm shows that the '_user_heap_stack' section ends at 0x20000600, which leaves a bunch of empty RAM before '_estack'
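The section itself reserves 0x200 + 0x400 = 0x600 bytes, and the gap between its end and _estack is 0x20001000 - 0x20000600 = 0xA00 bytes. That gap is not wasted: remember that the stack grows downwards here (and often elsewhere as well), so it starts at _estack and uses the gap first. _Min_Stack_Size only guarantees that at least 0x400 bytes are available.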
seems like good practice to avoid dynamic memory allocation in embedded systems when you can, anyways.
Not everything is safety-critical. You're free to not use the heap if you don't want/need/are allowed to. (Ok, not that free in the latter)
There are different memory segments such as .bss, .text, .data,.rodata,....
I have not been able to work out which of them are located in RAM and which in Flash memory; many sources mention them under both RAM and ROM.
Please provide a fair explanation of the memory segments of RAM and flash.
ATMEL studio compiler
ATMEGA 32 platform
Hopefully you understand the typical uses of those section names. .text being code, .rodata read only data, .data being non-zero read/write data (global variables for example that have been initialized at compile time), .bss read/write data assumed to be zero, uninitialized. (global variables that were not initialized).
So .text and .rodata are read-only, so they can be in flash or RAM and be used there. .data and .bss are read/write, so they need to be USED in RAM, but in order to put that information in RAM it has to be in a non-volatile place when the power is off and then copied over to RAM. So in a microcontroller the .data information will live in flash, and the bootstrap code needs to copy that data to its home in RAM, where the code expects to find it. For .bss you don't need all those zeros; you just need the starting address and number of bytes, and the bootstrap can zero that memory.
So all of them can/do live in both, but the typical use case is that the read-only ones are USED in flash and the read/write ones are USED in RAM.
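The bootstrap step described above can be sketched in C as two small loops. This is an illustration, not any particular vendor's startup code: on GCC-based ARM toolchains the bounds are often linker symbols with names such as _sidata, _sdata, _edata, _sbss and _ebss, but those names are an assumption here, so the sketch takes the bounds as parameters instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy the initial values of .data from their load address (Flash) to RAM. */
void copy_data(const uint32_t *load, uint32_t *start, const uint32_t *end)
{
    while (start < end)
        *start++ = *load++;
}

/* Zero the .bss range; only its start and end are stored in the image. */
void zero_bss(uint32_t *start, const uint32_t *end)
{
    while (start < end)
        *start++ = 0u;
}
```

Both functions would run before main, which is why an initialized global already holds its value and an uninitialized one reads as zero when main starts.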
They are located wherever your project's linker script defines them to be located.
Some targets locate and execute code in ROM, while others may copy code from ROM to RAM on start-up and execute from RAM - usually for performance reasons on faster processors. As such .text and .rodata may be located in R/W or R/O memory. However .bss and .data cannot by definition be located in R/O memory.
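ROM cannot be written to by the running program, but RAM can be written to.
On a PC, ROM holds the BIOS (Basic Input/Output System), while RAM holds the running programs and the data they use.
On a PC, ROM is much smaller than RAM; on a microcontroller such as the ATmega32 it is the other way round (32 KB of Flash versus 2 KB of SRAM).
ROM is non-volatile (permanent), but RAM is volatile.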
I'm developing an application on an ARM Cortex-M microcontroller which has two RAM banks of 64 kB each. The first bank is directly followed by the second bank in the memory map.
The memory banks are currently split into two regions in my linker script. The first region contains the sections .bss and .data. The second bank is used for .heap and .stack, which only take 1 kB each (I'm using a different stack in FreeRTOS, which also manages its own heap).
My problem is that .bss is too large for the first bank. Therefore I'd like to move some of its content to the second bank.
One way to accomplish this would be to create a new section, let's call it .secondbss, which is linked to the second bank. Single variables could then be added to this section using __attribute__((section(".secondbss"))).
The reasons why I am not using this solution are
I really want to maintain portability of my source code
There might be a whole lot of variables that would require this attribute and I don't want to choose the section for every single variable
Is there a better solution for this? I already thought of treating both memories as one region, but I don't know how to prevent the linker from misaligning data across the boundary between the two banks.
How can I solve my problem without using __attribute__ flags?
Thank you!
For example, say you have 2 banks at 0x20000000 and 0x20010000, and you want to use Bank2 for the heap and (main) stack. I assume that your .bss is large because of configTOTAL_HEAP_SIZE in FreeRTOSConfig.h. Now look at the heap sources in FreeRTOS/Source/portable/MemMang/. There are 5 implementations of pvPortMalloc(), which does the memory allocation.
Look at these lines in the heap_X.c that you use:
/* Allocate the memory for the heap. */
#if( configAPPLICATION_ALLOCATED_HEAP == 1 )
/* The application writer has already defined the array used for the RTOS
heap - probably so it can be placed in a special segment or address. */
extern uint8_t ucHeap[ configTOTAL_HEAP_SIZE ];
#else
static uint8_t ucHeap[ configTOTAL_HEAP_SIZE ];
#endif /* configAPPLICATION_ALLOCATED_HEAP */
So you can set configAPPLICATION_ALLOCATED_HEAP to 1 and tell your linker to place ucHeap at 0x20010000.
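On the application side, this first option could look roughly like the sketch below. The section name .bank2_heap and the 32 KB size are hypothetical, the attribute syntax is specific to GCC/Clang, and the linker script still needs a matching output section mapped to the second bank (that part is not shown here).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical value; normally this comes from FreeRTOSConfig.h. */
#ifndef configTOTAL_HEAP_SIZE
#define configTOTAL_HEAP_SIZE ((size_t)(32 * 1024))
#endif

/* GCC/Clang on ELF targets: put the array in a named section (name assumed). */
#if defined(__GNUC__) && defined(__ELF__)
#define BANK2_HEAP __attribute__((section(".bank2_heap"), aligned(8)))
#else
#define BANK2_HEAP
#endif

/* Definition expected by heap_X.c when configAPPLICATION_ALLOCATED_HEAP is 1. */
BANK2_HEAP uint8_t ucHeap[configTOTAL_HEAP_SIZE];
```

Only this one array carries the attribute, so the rest of the source stays free of section annotations.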
Another way is to write a header for each device that contains the addresses of the heap and stack, and to edit the sources.
For heap_1.c we can make the following changes:
// somewhere in devconfig.h
#define HEAP_ADDR 0x20010000
// in heap_1.c
// remove code related ucHeap
//
// remove static uint8_t *pucAlignedHeap = NULL;
// and paste:
static uint8_t *pucAlignedHeap = (uint8_t *)HEAP_ADDR;
For heap_2.c and heap_4.c, edit the function prvHeapInit() as well.
Pay attention to heap_5.c, which includes vPortDefineHeapRegions().
Now pvPortMalloc() will return pointers to memory in Bank2. pvPortMalloc() is used to allocate task stacks, TCBs, and user variables. Read the sources. The location of the main stack depends on your device/architecture. For stm32 (ARM), see the vector table or how to change the MSP register.
I'm searching for a way to see the RAM usage of my application running on an at32uc3b0512.
avr32-size.exe foo.elf tells me:
text data bss dec hex filename
263498 11780 86524 361802 5854a foo.elf
According to 'google', RAM usage is .data + .bss. But .data + .bss is already (11780+86524)/1024 = 96 kB, which would mean that my RAM is full (at32uc3b0512 -> 96 kB SRAM). But the application works as desired. Am I wrong?
The chip you are using has 96kB of RAM and that is also the sum of your .bss and .data sections. This does not mean that all of your RAM is being used up, rather it is merely showing how the RAM is being allocated.
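The arithmetic behind the size output can be written as two small helper functions. This is a sketch with hypothetical function names; it only adds up what the linker reserved and says nothing about how much of that RAM is touched at runtime.

```c
#include <stdint.h>

/* Bytes of RAM reserved at link time: initialized plus zero-initialized data. */
uint32_t static_ram_bytes(uint32_t data, uint32_t bss)
{
    return data + bss;
}

/* Bytes of Flash used: code plus the stored initial values of .data. */
uint32_t flash_bytes(uint32_t text, uint32_t data)
{
    return text + data;
}
```

With the numbers from the question, static_ram_bytes(11780, 86524) is 98304, which is exactly 96 * 1024. Hitting the RAM size to the byte suggests (this is an assumption, not something the output proves) that the linker script reserves all remaining RAM, for example for heap and stack, in a section that size counts under bss.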
The program on MCU is usually located in FLASH
this is not true if you have some OS present
and load program to memory on runtime from somewhere like SD card
not all MCU's can do that
I suspect that is not your case
The program Flash is 512 KByte big (I guess from your IC's number)
The SRAM is used for the C engine/OS, stack and heap
your chip has 96 KByte
the C engine is something like OS handling
dynamic allocations,heap,stack,subroutine calls
and including RTL used during compilation
and of course dummy interrupt subroutines for unused interrupts...
When you compile the program to ELF/HEX or whatever
the compiler/linker tells you only
how big the program code and data is (located in program FLASH memory)
how big static variables you have
the rest is unknown until runtime itself
So if you need to know how big chunk of memory you take
then you need to extract it from runtime
by some RTL call to get memory status
or by estimating it yourself based on knowledge of
what your program does
how much of dynamic memory is used
heap/stack trashing/usage
recursions level, etc...
Or you can try to allocate more and more memory until you run out of memory
and count how big a chunk you allocated altogether
then release it, of course
the used memory is then ~ 96KB - altogether_allocated_memory
(+/-) granularity ...
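For the stack part of the estimate, a common alternative that the list above does not mention is stack painting: fill the stack region with a known pattern at startup and later count how much of the pattern is still intact. The sketch below is host-runnable and works on any byte buffer; the names stack_paint, stack_unused and the 0xA5 pattern are assumptions of this example, and on a target the region bounds would come from the linker script.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xA5u   /* arbitrary pattern, an assumption of this sketch */

/* Fill a stack region with the pattern before it is used. */
void stack_paint(uint8_t *low, size_t size)
{
    for (size_t i = 0; i < size; i++)
        low[i] = STACK_FILL;
}

/* For a stack that grows downwards: count untouched bytes from the low end.
   The result is the remaining headroom; size minus that is the peak usage. */
size_t stack_unused(const uint8_t *low, size_t size)
{
    size_t n = 0;
    while (n < size && low[n] == STACK_FILL)
        n++;
    return n;
}
```

The result is only a lower bound on usage, since a stack frame that happens to contain the fill value at its deepest point would be counted as untouched.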