Is there a case when const data should be loaded into RAM rather than accessed directly from flash? - C

I've used a few types of microcontrollers.
When I write code like this:
const int var = 5;
usually var is kept in flash. I understand that const variables are not always kept only in flash. Sometimes (depending on the compiler, processor, and options such as PIC) they are copied from flash to RAM before main runs. Is there a case when it is better to load var into RAM?

Many microcontrollers have a unified address space, but some (such as AVRs) do not. When there are multiple address spaces, then you may need to use different instructions to access data in those spaces. (For AVR, the LPM (Load Program Memory) instruction must be used to access data in Flash, instead of one of the ordinary LD (Load) instructions.)
If you imagine a pointer to your const variable, a function receiving the pointer would not be able to access the data unless it knows which address space the pointer points into. In a case like that, you either have to copy the data to RAM, even though it's const, or the user of the pointer has to know which instruction to use to access the data.

Assuming a microcontroller architecture like ARM Cortex-M or Microchip MIPS (and many others), RAM and flash are mapped into different parts of a single internal address space, like one huge array. So the assembly instructions reading from RAM are the same as those reading from flash. No difference here.
Access times of RAM and flash shouldn't be too different, so no waiting is needed on any of the controllers I've worked with.
The only case I can imagine where storing const variables in flash could cause problems is some sort of bootloader application, when the flash itself is being written. Of course, writing to a flash range you are currently executing from is a bad idea and will cause much heavier problems than overwritten const values.

Why do MCU compilers for chips like AVR or ESP (used widely by Arduino) keep all strings in SRAM heap by default? [closed]

There is a common technique in the Arduino world where you use the PROGMEM macros to keep strings and other similar data in flash memory instead of SRAM, reducing RAM usage while sacrificing some performance - https://www.arduino.cc/reference/en/language/variables/utilities/progmem/
Basically, instead of storing these in SRAM, there is just a reference to a flash address where the string is stored and loaded from on the fly, in order to save RAM.
But I can't understand why MCU compilers put all strings, including local strings from functions, into heap memory and keep them there all the time in the first place. I also don't understand how a compiler can "store anything in RAM instead of flash" - RAM is volatile, so the compiler can hardly "store" anything there, as it's cleared on every reset. These strings must still be present in the program image stored in flash, so why does it copy them from flash to RAM on each launch of the MCU? I was thinking that maybe the whole program image must be loaded into RAM for execution, but that doesn't make sense, as these chips use a Harvard architecture and the program is executed from flash already (and most of these chips have much bigger flash than RAM anyway, so the whole image would never fit into RAM).
While I understand how to use workarounds that prevent this behaviour, I can't understand why this behaviour exists in the first place. Can someone shed some light on it? Why are all strings loaded into the heap on start of the program by default? Is that for performance reasons?
The AVR architecture is different from many other common architectures in that the code and data exist in completely different memory spaces (though the program memory can be accessed as data, as shown on the PROGMEM documentation page you linked to). This is one type of modified Harvard architecture.
Most other architectures that you're likely to use present themselves to the user as having code and data exist in the same memory space. While this is often also done with a modified Harvard architecture, they present themselves to the user as a von Neumann architecture, having a unified code and data memory space.
On AVR, to make initialized global or static data available to use as any other in-memory data, part of the program startup code copies the initialization data from program memory into RAM. This is generally done to program segments with names like .data or .rodata, depending on whether or not the variables in question are const.
Note that, contrary to what you say in your question, this data is not copied to the heap, it's stored in some portion of RAM chosen during program linking.
Using PROGMEM and the associated functions, you can directly access the data stored in the flash memory of the AVR device. This constant data is placed in a segment that won't be copied to RAM on startup, like .progmem.data, and so doesn't have space in RAM reserved for it.
The case with the Xtensa architecture, used by the ESP8266 and some members of the ESP32 family, is completely different. Contrary to what you state in your question, I don't believe that static or global objects which are const are copied into RAM by default, only those which can be modified (the .data segment would be copied as initialization to RAM, while the .rodata segment would not be).

C loading a const

const arrays get placed in flash; otherwise they end up in RAM.
How can I load a large const array, apart from typing in thousands of numbers by hand?
I am using the IAR compiler with an STM32F303 (Cortex M4)
You can always write a helper program that generates the array for you, and then just include the generated file in your source.
What are the numbers? Typically you can use off-line tools to generate C code that holds the numbers in a const array of the suitable type. This is often done for e.g. look-up tables in embedded software, and so on.
You cannot do this at run-time, since it's the linker's job to arrange the segments of the program into the various available memory-blocks.
Also, flash memory is not generally "easy" to write, i.e. you can't typically expect to be able to have a regular C pointer into flash, and just write to it and have it "stick". Programming flash generally requires dancing with the flash memory controller, and keeping in mind things like block erasure, erasing time, minimum programming page size, programming time per page, and so on. Flash memory is not so much RAM, as it is ROM that happens to be reprogrammable in software if you know how.

Force Variable to be Stored in FLASH in C Using ARM Processor

I know I can force an array into FLASH in ARM by declaring it "const". But this is not truly an array of consts: I want to be able to write to it regularly. I have three large arrays that take up ~50k of the 128kB of SRAM I have available, but I have an order of magnitude more FLASH than I need. How can I force these three arrays into FLASH without declaring them const? Using IAR, BTW.
I tried using the __no_init keyword; according to the linker map files, this had no effect.
To answer the original question, you can write a linker script to force any variable to reside in a predetermined area of memory (declaring variables const does not force the compiler to put them in FLASH; it is merely a strong suggestion).
On the other hand, an overabundance of FLASH is not in itself a good reason to keep a non-const array in flash. Among the reasons are: 1) depending on the chip, access to FLASH memory can be much slower than RAM access (especially for writing); 2) FLASH can only be rewritten a limited number of times: this is not a problem for an occasional rewrite, but if your code constantly rewrites FLASH memory, you can ruin it rather quickly; 3) there are special procedures for writing to FLASH (ARM makes it easy, but it is still not as simple as writing to RAM).
The C language, compilers, etc. are not able to generate chip/board-specific flash routines. If you wish to use flash pages to store read/write data, you are going to have to have at least a page's worth of RAM and some read/write routines. You would need very many of these variables to overcome the cost in RAM and execution time of keeping the master copy in flash. In general, every time you write a value to flash, you will need to read out the whole page, erase the page, then write back the whole page with the one item changed. Now, if you know how your flash works (generally it erases to ones and writes zeros), you could read the prior version, compare the differences, and if an erase is not needed, write just that one item.
If you don't have dozens of variables you wish to do this with, don't bother. You would want to declare a const offset for each variable in this flash and have read/write routines:
const unsigned int off_items=0x000;
const unsigned int off_dollars=0x004;
...
unsigned int flash_read_nv ( unsigned int offset );
void flash_write_nv ( unsigned int offset, unsigned int data);
So this code, which uses .data:
items++;
dollars=items*5;
Using your desire to keep the variables in flash becomes:
unsigned int ra,rb;
ra= flash_read_nv(off_items);
rb= flash_read_nv(off_dollars);
ra++;
rb=ra*5;
flash_write_nv(off_items,ra);
flash_write_nv(off_dollars,rb);
And of course the flash writes take hundreds to thousands of clock cycles or more to execute. Plus require 64, 128, or 256 bytes of ram (or more) depending on the flash page size.
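To make the read/modify/erase/write cycle concrete, here is a hedged sketch of flash_read_nv/flash_write_nv. The flash is simulated by a RAM array with erase-to-ones/program-zeros semantics so the sketch runs anywhere; on real hardware the erase and program steps would be vendor-specific flash controller calls, and the 256-byte page size is an assumption.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 256u  /* assumed: one flash page holds all NV variables */

/* Simulated flash page; erased flash reads as all ones. */
static uint8_t sim_flash[PAGE_SIZE];

static void sim_flash_erase(void) { memset(sim_flash, 0xFF, PAGE_SIZE); }

/* Programming can only clear bits, as on real NOR flash. */
static void sim_flash_program(const uint8_t *src) {
    for (unsigned i = 0; i < PAGE_SIZE; i++) sim_flash[i] &= src[i];
}

unsigned int flash_read_nv(unsigned int offset) {
    unsigned int v;
    memcpy(&v, &sim_flash[offset], sizeof v);
    return v;
}

void flash_write_nv(unsigned int offset, unsigned int data) {
    uint8_t page[PAGE_SIZE];
    memcpy(page, sim_flash, PAGE_SIZE);        /* read out the whole page */
    memcpy(&page[offset], &data, sizeof data); /* change the one item     */
    sim_flash_erase();                         /* erase the page...       */
    sim_flash_program(page);                   /* ...and write it back    */
}
```

With the off_items/off_dollars offsets above, flash_read_nv(off_items) and flash_write_nv(off_items, ra) behave as in the example, at the cost of one full page erase per write.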
Believe it or not, this page is the best thing that pops up when searching for how to store data in flash with the IAR ARM compiler.
Page 359 of the manual (the one that comes with v7.50 of the compiler) shows this as a way to put data in flash:
#define FLASH _Pragma("location=\"FLASH\"")
On page 332 of the manual, however, it says this:
"Static and global objects declared const are allocated in ROM."
I tested it and it seems the pragma is unnecessary with the IAR compiler, as specifying const puts it in flash already (as stated in the original question).
From the other answers, it seems like the OP didn't really want to use flash. If anyone like me comes to this page to figure out how to store data in flash with the IAR ARM compiler, however, I hope my answer saves them some time.
In IAR, you can declare your array as follows:
__root __no_init const uint8_t myVect[50000] @ 0x12345678;
Where, of course, 0x12345678 is the address in FLASH.

How to use external memory on a microcontroller

In the past, I've worked a lot with 8-bit AVRs and MSP430s, where both the RAM and flash are on the chip directly. When you compile and download your program, it sort of "just works" and you don't need to worry about where and how variables are actually stored.
Now I'm starting a project where I'd like to be able to add some external memory to a microcontroller (a TI Stellaris LM3S9D92, if that matters), but I'm not entirely sure how you get your code to use the external RAM. I can see how you configure the external bus, pretty much like any other peripheral, but what confuses me is how the processor keeps track of when to talk to the external memory and when to talk to the internal one.
From what I can tell, the external RAM is mapped to the same address space as the internal SRAM (internal starts at 0x20000000 and external starts at 0x60000000). Does that mean if I wrote something like this:
int* x = (int*)0x20000000;
int* y = (int*)0x60000000;
Would x and y point to the first 4 bytes (assuming 32-bit ints) of internal and external RAM respectively? If so, what if I did something like this:
int x[999999999999]; //some super big array that uses all the internal ram
int y[999999999999]; //this would have to be in external ram or it wouldn't fit
I imagine that I'd need to tell the toolchain about the boundaries of each type of memory - or do I have it all wrong, and the hardware figures it out on its own? Do linker scripts deal with this? I know they have something to do with memory mapping, but I don't know what exactly. After reading about how to set up an ARM cross compiler, I get the feeling that something like WinAVR (avr-gcc) was doing a lot of this for me behind the scenes so I wouldn't have to deal with it.
Sorry for rambling a bit but I'd really appreciate it if someone could tell me if I'm on the right track with this stuff.
Update
For any future readers: I found this after another few hours of googling: http://www.bravegnu.org/gnu-eprog/index.html. Combined with the answers here, it helped me a lot.
Generally that is exactly how it works. You have to properly setup the hardware and/or the hardware may already have things hardcoded at fixed addresses.
You could ask the same question: how does the hardware know that when I write a byte to address 0x21000010 (I just made that up), that address is the UART transmit holding register, and that the write means I want to send a byte out the UART? The answer: because it is hardcoded in the logic that way. Or the logic might have an offset - the UART might be movable, sitting at some control register's contents plus 0x10. Change that control register (which itself has some hardcoded address) from 0x21000000 to 0x90000000, then write to 0x90000010, and another byte goes out the UART.
I would have to look at that particular part, but if it does support external memory, then in theory that is all you have to do: know which addresses in the processor's address space are mapped to that external memory, and reads and writes there will cause external memory accesses.
Intel-based computers (PCs) tend to like one big flat address space. Use the lspci command on your Linux box (if you have one), or some other command on Windows or a Mac, and you will find that your video card has been given a chunk of address space. If you got through the protection of the CPU/operating system and were to write to an address in that space, the write would go right out of the processor, through the PCIe controllers, and into the video card - either causing havoc or maybe just changing the color of a pixel. You have already dealt with this on your AVRs and MSP430s: some addresses in the address space are flash, and some are RAM; some logic outside the CPU core looks at the CPU core's address bus and decides where to send each access. So far that flash bank, RAM bank, and logic are all self-contained within the boundaries of the chip. It is not too far a stretch beyond that for the logic to respond to an address by creating an external memory cycle; when it is done, or the result comes back on a read, it completes the internal memory cycle and you go on to the next thing.
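To answer the "do linker scripts deal with this?" part: yes. Here is a hedged sketch of a GNU ld MEMORY/SECTIONS layout for a part like this (the region names and sizes are invented; the origins are the ones from the question):

```
MEMORY
{
    FLASH  (rx)  : ORIGIN = 0x00000000, LENGTH = 256K
    SRAM   (rwx) : ORIGIN = 0x20000000, LENGTH = 96K
    EXTRAM (rwx) : ORIGIN = 0x60000000, LENGTH = 8M
}

SECTIONS
{
    .text : { *(.text*) *(.rodata*) } > FLASH
    .data : { *(.data*) } > SRAM AT > FLASH   /* initializers stored in flash */
    .bss  : { *(.bss*)  } > SRAM
    /* anything marked __attribute__((section(".extram"))) in C
       is placed in the external RAM by the linker */
    .extram (NOLOAD) : { *(.extram*) } > EXTRAM
}
```

The oversized array from the question could then be tagged with that section attribute, and the linker, not the hardware, decides which region each object lands in.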
Does that make any sense or am I making it worse?
You can use the reserved word register to suggest to the compiler that it put that variable into an internal memory location:
register int iInside;
Use caution; the compiler knows how many bytes of register storage are available, and when all available space is gone it won't matter.
Use register variables only for things that are going to be used very, very frequently, such as counters.

Fixed address variable in C

For embedded applications, it is often necessary to access fixed memory locations for peripheral registers. The standard way I have found to do this is something like the following:
// access register 'foo_reg', which is located at address 0x100
#define foo_reg *(int *)0x100
foo_reg = 1; // write to foo_reg
int x = foo_reg; // read from foo_reg
I understand how that works, but what I don't understand is how the space for foo_reg is allocated (i.e. what keeps the linker from putting another variable at 0x100?). Can the space be reserved at the C level, or does there have to be a linker option that specifies that nothing should be located at 0x100. I'm using the GNU tools (gcc, ld, etc.), so am mostly interested in the specifics of that toolset at the moment.
Some additional information about my architecture to clarify the question:
My processor interfaces to an FPGA via a set of registers mapped into the regular data space (where variables live) of the processor. So I need to point to those registers and block off the associated address space. In the past, I have used a compiler that had an extension for locating variables from C code. I would group the registers into a struct, then place the struct at the appropriate location:
typedef struct
{
BYTE reg1;
BYTE reg2;
...
} Registers;
Registers regs _at_ 0x100;
regs.reg1 = 0;
Actually creating a 'Registers' struct reserves the space in the compiler/linker's eyes.
Now, using the GNU tools, I obviously don't have the at extension. Using the pointer method:
#define reg1 (*(BYTE *)0x100)
#define reg2 (*(BYTE *)0x101)
reg1 = 0;
// or
#define regs ((Registers *)0x100)
regs->reg1 = 0;
This is a simple application with no OS and no advanced memory management. Essentially:
void main()
{
while(1){
do_stuff();
}
}
Your linker and compiler don't know about that (unless you tell them, of course). It's up to the designer of the ABI of your platform to specify that objects are not allocated at those addresses.
So there is sometimes (the platform I worked on had this) a range in the virtual address space that is mapped directly to physical addresses, and another range that can be used by user-space processes to grow the stack or to allocate heap memory.
You can use the defsym option with GNU ld to allocate some symbol at a fixed address:
--defsym symbol=expression
Or if the expression is more complicated than simple arithmetic, use a custom linker script. That is the place where you can define regions of memory and tell the linker what regions should be given to what sections/objects. See here for an explanation. Though that is usually exactly the job of the writer of the tool-chain you use. They take the spec of the ABI and then write linker scripts and assembler/compiler back-ends that fulfill the requirements of your platform.
Incidentally, GCC has an attribute section that you can use to place your struct into a specific section. You could then tell the linker to place that section into the region where your registers live.
Registers regs __attribute__((section("REGS")));
A linker would typically use a linker script to determine where variables would be allocated. This is called the "data" section and of course should point to a RAM location. Therefore it is impossible for a variable to be allocated at an address not in RAM.
You can read more about linker scripts in GCC here.
Your linker handles the placement of data and variables. It knows about your target system through a linker script. The linker script defines regions in a memory layout such as .text (for constant data and code) and .bss (for your global variables and the heap), and also creates a correlation between a virtual and physical address (if one is needed). It is the job of the linker script's maintainer to make sure that the sections usable by the linker do not override your IO addresses.
When the embedded operating system loads the application into memory, it will usually load it at some specified location, let's say 0x5000. All the local memory you are using will be relative to that address; that is, int x will be somewhere like 0x5000 + code size + 4, assuming it is a global variable. If it is a local variable, it's located on the stack. When you reference 0x100, you are referencing system memory space - the same space the operating system is responsible for managing, and probably a very specific place that it monitors.
The linker won't place code at specific memory locations, it works in 'relative to where my program code is' memory space.
This breaks down a little bit when you get into virtual memory, but for embedded systems, this tends to hold true.
Cheers!
Getting the GCC toolchain to give you an image suitable for use directly on the hardware without an OS to load it is possible, but involves a couple of steps that aren't normally needed for normal programs.
You will almost certainly need to customize the C run time startup module. This is an assembly module (often named something like crt0.s) that is responsible for initializing the initialized data, clearing the BSS, calling constructors for global objects if C++ modules with global objects are included, etc. Typical customizations include setting up the hardware to actually address the RAM (possibly including setting up the DRAM controller as well) so that there is a place to put data and the stack. Some CPUs need to have these things done in a specific sequence: e.g. the ColdFire MCF5307 has one chip select that responds to every address after boot, which eventually must be configured to cover just the area of the memory map planned for the attached chip.
Your hardware team (or you with another hat on, possibly) should have a memory map documenting what is at various addresses. ROM at 0x00000000, RAM at 0x10000000, device registers at 0xD0000000, etc. In some processors, the hardware team might only have connected a chip select from the CPU to a device, and leave it up to you to decide what address triggers that select pin.
GNU ld supports a very flexible linker script language that allows the various sections of the executable image to be placed in specific address spaces. For normal programming, you never see the linker script since a stock one is supplied by gcc that is tuned to your OS's assumptions for a normal application.
The output of the linker is in a relocatable format that is intended to be loaded into RAM by an OS. It probably has relocation fixups that need to be completed, and may even dynamically load some libraries. In a ROM system, dynamic loading is (usually) not supported, so you won't be doing that. But you still need a raw binary image (often in a HEX format suitable for a PROM programmer of some form), so you will need to use the objcopy utility from binutil to transform the linker output to a suitable format.
So, to answer the actual question you asked...
You use a linker script to specify the target addresses of each section of your program's image. In that script, you have several options for dealing with device registers, but all of them involve putting the text, data, bss stack, and heap segments in address ranges that avoid the hardware registers. There are also mechanisms available that can make sure that ld throws an error if you overfill your ROM or RAM, and you should use those as well.
Actually getting the device addresses into your C code can be done with #define as in your example, or by declaring a symbol directly in the linker script that is resolved to the base address of the registers, with a matching extern declaration in a C header file.
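A hedged sketch of that linker-symbol technique (the symbol name, register layout, and address are all invented). On the target, the definition of _fpga_regs would not exist in any C file; instead the linker script would contain _fpga_regs = 0xD0000000; (or you would pass -Wl,--defsym=_fpga_regs=0xD0000000) and the extern declaration would resolve to that fixed address. For this host-runnable sketch, a real object stands in for the hardware:

```c
typedef struct {
    volatile unsigned char reg1;
    volatile unsigned char reg2;
} Registers;

/* Host stand-in for the FPGA register block. On the target, delete
 * this definition and let the linker script supply the address. */
Registers _fpga_regs;

/* This part is identical on host and target: the compiler never sees
 * an address, only an external symbol for the linker to resolve. */
extern Registers _fpga_regs;
#define regs (&_fpga_regs)

int regs_demo(void) {   /* hypothetical usage */
    regs->reg1 = 0x55;
    return regs->reg1;
}
```

The C code carries no hard-coded address at all, which is exactly what keeps the register map maintenance in one place: the linker script.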
Although it is possible to use GCC's section attribute to define an instance of an uninitialized struct as being located in a specific section (such as FPGA_REGS), I have found that not to work well in real systems. It can create maintenance issues, and it becomes an expensive way to describe the full register map of the on-chip devices. If you use that technique, the linker script would then be responsible for mapping FPGA_REGS to its correct address.
In any case, you are going to need to get a good understanding of object file concepts such as "sections" (specifically the text, data, and bss sections at minimum), and may need to chase down details that bridge the gap between hardware and software such as the interrupt vector table, interrupt priorities, supervisor vs. user modes (or rings 0 to 3 on x86 variants) and the like.
Typically these addresses are beyond the reach of your process. So, your linker wouldn't dare put stuff there.
If the memory location has a special meaning on your architecture, the compiler should know that and not put any variables there. That would be similar to the IO mapped space on most architectures. It has no knowledge that you're using it to store values, it just knows that normal variables shouldn't go there. Many embedded compilers support language extensions that allow you to declare variables and functions at specific locations, usually using #pragma. Also, generally the way I've seen people implement the sort of memory mapping you're trying to do is to declare an int at the desired memory location, then just treat it as a global variable. Alternately, you could declare a pointer to an int and initialize it to that address. Both of these provide more type safety than a macro.
To expand on litb's answer, you can also use the --just-symbols={symbolfile} option to define several symbols, in case you have more than a couple of memory-mapped devices. The symbol file needs to be in the format
symbolname1 = address;
symbolname2 = address;
...
(The spaces around the equals sign seem to be required.)
Often, for embedded software, you can define within the linker file one area of RAM for linker-assigned variables, and a separate area for variables at absolute locations, which the linker won't touch.
Failing to separate them should cause a linker error, as the linker should spot that it's trying to place a variable at a location already used by a variable with an absolute address.
This depends a bit on what OS you are using. I'm guessing you are using something like DOS or VxWorks. Generally the system will have certain areas of the memory space reserved for hardware, and compilers for that platform will always be smart enough to avoid those areas for their own allocations. Otherwise you'd be continually writing random garbage to disk or line printers when you meant to be accessing variables.
In case something else was confusing you, I should also point out that #define is a preprocessor directive. No code gets generated for that. It just tells the compiler to textually replace any foo_reg it sees in your source file with *(int *)0x100. It is no different than just typing *(int *)0x100 in yourself everywhere you had foo_reg, other than it may look cleaner.
What I'd probably do instead (in a modern C compiler) is:
// access register 'foo_reg', which is located at address 0x100
volatile int * const foo_reg = (volatile int *)0x100;
*foo_reg = 1; // write to foo_reg
int x = *foo_reg; // read from foo_reg
