How ARM CPU loading bootloader?

How ARM CPU loading bootloader? - arm

As I know, CPU can access RAM directly. Device RAM is empty on start and CPU don't know from where to load bootloader into the RAM for executing it. Even it can do nothing because call stack should be empty too as I think.
Yet how is bootloader program copied into the RAM for further execution?
This should happening with embedded devices such as smartphones. On a x86 PCs BIOS is responsible for loading MBR section from disk to RAM as I know.

A bootloader in RAM is a secondary bootloader; invariably there is code in a ROM of some kind containing a primary bootstrap that loads the secondary bootstrap. Often that ROM is mask-ROM on the chip it self.
Typically on an ARM application processor such as a Cortex-A the primary bootstrap will load code to RAM from NAND flash or SD card. ARM Cortex-M often run code directly from ROM in any case.

you have the same problem with an x86. and the same solution in general (as with any other processor as well), you put a rom/flash in the address space where the processor boots and/or where its vector table is.
there are some other solutions like having other logic that reads from some non volatile storage and places it in the boot/vector space, or other logic that provides an interface for some other processor/computer to download into the board/chip and then release reset.

Related

ARM Linux reboot process [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 5 years ago.
Improve this question
How reboot procedure works on ARM SOCs running Linux, e.g do boot loaders reinitialize DDR memory? can anybody please explain me rebooting process in detail.

How reboot procedure works on ARM SOCs running Linux, ... ?
The typical ARM processor in use today is integrated with peripherals on a single IC called a SoC, system on a chip. Typically the reboot procedure is nearly identical to a power-on boot procedure. On a reset the ARM processor typically jumps to address 0.
Main memory, e.g. DRAM, and non-volatile storage, e.g. NAND flash, are typically external to the SoC (that is Linux capable) for maximum design flexibility.
But typically there is a small (perhaps 128KB) embedded ROM (read-only memory) to initialize the minimal system components (e.g. clocks, external memories) to begin bootstrap operations. A processor reset will cause execution of this boot ROM. (This ROM is truly read-only, and cannot be modified. The code is masked into the silicon during chip fabrication.)
The SoC may have a strapping option to instead execute an external boot memory, such as NOR flash or EEPROM, which can be directly executed (i.e. XIP, execute in place).
The salient characteristic of any ROM, flash, and SRAM that the first-stage boot program uses is that these memories must be accessible immediately after a reset.
One of the problems of bootstrapping a system that uses DRAM for main memory is its hardware initialization. The DRAM memory controller has to be initialized with board-specific parameters before code can be loaded into DRAM and executed. So from where does this board-specific initialization code execute, since it can't be in main memory?
Each vendor has their own solution.
Some require memory configuration data to be stored in nonvolatile memory for the boot ROM to access.
Some SoCs have integrated SRAM (which does not require initialization like DRAM) to execute a small second-stage bootstrap program.
Some SoCs use NOR flash to hold a XIP (execute in place) bootstrap program (e.g. the SPL program of U-Boot).
Each SoC vendor has its own bootstrap method to get the OS loaded and executing.
Some use hardware strapping read through GPIO pins to determine the source of the next stage of the bootstrap sequence.
Another vendor may use an ordered list of memories and devices to probe for a bootstrap program.
Another technique, is to branch to firmware in NOR flash, which can be directly executed (i.e. XIP, execute in place).
Once the bootstrap program has initialized the DRAM, then this main memory can be used to load the next stage of booting. That could be a sophisticated boot utility such as U-Boot, or (if the bootstrap program is capable) the Linux kernel. A ROM boot program could do everything to load an ARM Linux kernel (e.g. ETRAX), but more common is that there will be several bootstrap programs or stages that have be performed between processor reset to execution of the OS.
The requirements of booting the Linux ARM kernel are spelled out in the following document: Booting ARM Linux
Older versions of Linux ARM used the ATAGs list to pass basic configuration information to the kernel. Modern versions provide a complete board configuration using a compiled binary of a Device Tree.
... e.g do boot loaders reinitialize DDR memory?
Of the few examples that I have seen, the boot programs unconditionally configure the dynamic RAM controller.
PCs have a BIOS and Power On Self Tests, aka POST. The execution of POST is the primary difference between a power-on reset (aka cold boot) versus a software reset (aka warm boot or reboot). ARM systems typically do not perform POST, so you typically will see minimal to no difference between types of reset.

This is way too broad. It's not only SoC vendor dependent, but also hardware and software dependent.
However, the most typical setup is:
CPU executes first-stage bootloader (FSB).
FSB is located on the chip itself in ROM or EEPROM and is very small (AT91RM9200 FSB is 10kB max, AFAIR). FSB then initializes minimum set of peripherals (clocks, RAM, flash), transfers second-stage bootloader (U-Boot) to RAM, and executes it.
U-Boot starts.
U-Boot initializes some other hardware (serial, ethernet, etc), transfers Linux kernel to RAM, prepares the pointer to kernel input parameters and jumps into it's entry point.
Linux kernel starts.
Magic happens here. The system now able to serve you cookies via SSH console and/or executes whatever needs to be executed.
A bit more in-depth info about warm start:
Warm start is a software reset, while cold start is power-on or hardware reset. Some (most?) SoC's are able to pass the info to FSB/SSB about warm start. This way bootloaders are able to minimize the overall boot time by skipping re-initializion of already initialized peripherals.
Again, this is most typical setup from my 15+ years experience in embedded world.

It varies a lot depending on the SoC. I'll describe something like a "typical" one (Freescale iMX6)...
Typically an on-chip Watchdog Timer is used to reset the SoC cleanly. Sometimes, an external Power Management IC can be provoked to perform a board-wide reset (this method may be better, as it avoids the risk of external chips getting "stuck" in an unexpected state, but not all board designs support it).
Upon reset, the SoC will start its normal boot process: checking option pins, fuse settings and initializing clocks and the boot device (e.g. eMMC). This is typically controlled by CPU code executing from a small on-chip ROM.
Either the internal boot ROM will initialize DDR SDRAM (using settings taken from fuses or read from a file on the boot device), or the bootloader gets loaded into internal RAM then it takes care of DDR initialization (and other things). The U-Boot bootloader can be configured to work either way.
Finally, the kernel and DTB are loaded into memory and started.

note uboot, etc are not required they are GROSS overkill, they are operating systems in their own right. to load and run linux you need memory up and running copy the kernel branch to it with some registers set to point at tables that you setup or copied from flash along with the kernel.
What you do on a cold reset or warm is up to you, same chip and board no reason necessarily why any two solutions have to do the exact same thing unless it is driven by hardware (if you do a wdt reset to start over and that reset wipes out the whole chip including the ddr controller). You just have to put the system in the same state that linux expects.

is there any MCU based on cortex-A who's on-chip mask primary bootloader can be changed?

I want study cortex-A inside. but AM335X and S5PV210's inside flash can not be changed, so I want to know if there is there any MCU based on cortex-A who's on-chip mask primary bootloader can be changed?
please recommend some for me, if there has.
please forgive my pool English, thank you!

There is usually no flash in a Cortex-A. The ROM code is usually, well.., in a Read Only Memory. When there is a bug in this code, you need to produce a new wafer mask to fix it, but, as millions of parts are produced, the cost reduction is significant, and ROM avoid data retention issues.

Flash memory is by definition re-writable; the parts you mentioned simply have no on-chip flash.
Parts that have on-chip flash typically execute code directly from it, so because of the relatively low speed of flash memory it is usually used on lower frequency processors sub-200KHz.
Fast "applications" processors do not normally have on-chip flash because it takes up a large amount of die space and and has insufficient capacity to support the for the kind of applications and OS (such as Linux, Android or Windows) typically used on such processors. Instead they often have an on-chip mask ROM primary bootloader than loads a secondary bootloader from external media such as NOR flash, NAND flash, SD card, eMMC etc. The secondary bootloader then boots the OS and/or application code.
Code on such processors is loaded to and executed from SDRAM which is much faster than flash. Also the boot media is not always memory mapped so cannot be executed directly in any case.

Boot process in android devices

I was reading this informative blog post here about the booting process on x64. In what ways is the booting process on ARM different? I had a look at Raspberry pi and it seems that the GPU is what executes before control is handed over to ARM processor. Are there any similar resources you have come across for ARM processors?

Just like the x86 boot process is documented in documents available at intel.com arm boot process is documented at arm.com as is any other processor documented.
The full sized (non-cortex-m) arm cores start by executing at address zero for reset. There is one instruction location for reset, one for data abort, undefined instruction, etc. Similar to an interrupt vector table but instead of an address there is an instruction there ideally a branch.
processors historically have some non-volatile ram mapped into the boot space or vector table or whatever, then volatile ram if any elsewhere. x86 historically near the top of ram, arm at the bottom.
ARM does not make chips like intel, it designs processor cores which other folks that make chips include in their chip designs instead of having to design their own cores and maintain compilers, etc. So the chip vendor can solve the boot process in a number of ways, some have something non-volatile mapped low then after booting you can swap in ram to that address space, whatever. In the case of the Raspberry Pi chips which are made by broadcom, they have their own gpu which actually boots the chip, it eventually reads a file assumed to be the linux kernel and root file system, but doesnt have to be. It places that file in ram (by default) in the place where a linux kernel would be loaded by a bootloader like redboot or uboot, in this case by the gpu's arm loader. the gpu then places a few breadcrumbs including the reset instruction(s) required to branch into the linux kernel (generally a very trivial thing), then release reset on the arm core allowing it to boot. so basically the arm sees only ram which is kind of nice, but that is somewhat atypical for arm or other processors to do that. Normally the main processor boots from non-volatile storage (eeprom, flash, etc) and then itself loads linux or whatever and branches to it.
A number of other processor types will have an interrupt vector table, including the arm cortex-m series which are thumb instruction set only cores. they are designed to be microcontrollers and not carry as much overhead, so the first address slot is actually meant to be filled with the init value for the stack pointer, then the second is the address for reset, and then a zillion others. The hardware is designed to preserve registers for you so that you can have the address to C functions right in the table and not have to have "some assembly required" wrappers written by you or the folks that ported the toolchain to this platform. Other processor types will just have a vector table and some assembly is required to be wrapped around interrupts and such.

Memory space of ARM microprocessors

In ARM microprocessors, is the only available memory space the 37 or so general and status registers, or is there a separate accessible memory space within the microprocessor chip?
For example, in the Atmel AVR microcontroller, to my understanding, the memory is mapped internally within the same chip, with data memory, program memory (containing program memory) and EEPROM memory. Does the same apply to ARM microprocessors, or does a microcontroller with an ARM microprocessor require separate external memory?

Your interpretation of the Atmel AVR architecture is not quite correct.
Of course it's possible to integrate memory of virtually any kind on the same die as the CPU core. However, that doesn't mean you can compare flash memory available on one such integrated system to registers on another.
A CPU core needs a memory interface and that's all that counts: Flash is slower than registers. So if you connect Flash to an ARM processor it will behave similar (in the same order of magniture regarding speed) as the on-board Flash of the AVR.
Besides, ARM is solely an IP (design concept) and licenced by numerous companies which build efficient peripherals and sometimes also memory around the core. So you will find chips with an ARM core and on-board memory on the market.
(I simplified things a bit in the above description but I was focusing on trying to point out where I think you misunderstand how the two processors compare.)

Below link talks a lot about how memory management is done in ARM processor. Hope it helps
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0471c/CHDDJIFI.html

Can I run a program from a SD memory instead of flash on an evaluation board (embedded programming)?

I have an evaluation board (Olimex STM32-P103) which supports a SD-card connector. I want to put my program in to a SD memory instead of internal flash of the micro-controler; and run it from there.
I don't know if it is possible to do that according to boot-loader issue!
P.S my goal is running linux on this board and then port my application over it.

To run programs from SD-Card in general you should know that you can't run them "right away". This means, you have to load it in a executable memory somewhere in your address space which is done by a (more or less) simple bootloader. In the simplest instance, the bootloader is capable to read from a SD-Card a specific binary and copy it into the memory.
That being said you should think about this considering you only got 20k of RAM and 128k of Flash on your board. So where should your program go? Or better: Why not flashing the program in the 128k of Flash from the very beginning? Especially you should know that Linux is a bit "hungry" in terms of memory.
If your goal is to run a "normal" Linux on this board, I'm afraid you're screwed. This because from what I know Linux needs a MMU to run and the chip on this board does not provide one (as far as researchable without access to datasheets from ST).
If you're lucky you can go with uCLinux. I'm not sure if a finished port exists for the STM32 but it seems there are some resources based on a short google search for "STM32 uCLinux". But even if you manage to run uCLinux I'm afraid there's not much left in your system for your application, so the result might be a bit disappointing.
Depending on why you are looking for Linux running on this MCU, there are maybe other solutions like a FreeRTOS in combination with a lwIP-stack (if networking is needed) or a FAT library like FullFAT if you are looking for reading SD-Cards and stuff.
Edit: One thing i'd like to add is that booting from the SD-Card is typically something you do with "bigger" (not much but slightly) systems where you have enough RAM to keep the whole image you'd like to run in it and still have some space left for the data you want to process.

You're going to have to have some code in the STM's onboard flash (typically called a "boot loader") that implements this since the "bare metal" very likely can't boot from SD card.
You're going to have to build that code, which figures out how to use the STM's onboard peripherals to talk to the SD card, finds the file you want to run in the file system (which you also have to implement), and loads it.
I wanted to include a link to the STM standard peripheral library, but it seems to be down (being moved). :/

The data on the SD card is not memory mapped, so cannot be executed directly.
It is possible to dynamically load the data from the card into RAM for execution. WindRiver's VxWorks RTOS supports loading and linking object modules dynamically, I know of no other OS that would scale to a Cortex-M that directly supports that but it would be possible to write your own.
However, I would suggest that in the case of the microcontroller you are using the idea is ill-advised; optimal performance on Cortex-M is achieved when the code is in on-chip flash and data in RAM allowing the data and instruction to be fetch to occur simultaneously on the separate buses (Harvard architecture). If you execute the code from RAM the performance will be severely hit since then data and instructions must be fetched sequentially over the same bus.
The board is entirely unsuited to running Linux, with only 128K Bytes Program Flash, and 20K Bytes RAM is is not at all feasible. Even the smallest Linux distribution requires 600Kb RAM plus whatever is needed by application code. uClinux can just about run on higher-end STM32 with external RAM and Flash, but that would suffer from the same bus contention performance hit and Linux without an MMU is rather missing the one major benefit of using Linux at all. The part on your board lacks an external memory interface, so cannot be expanded to support Linux.
If you need an OS consider a RTOS such as uC/OS-II, FreeRTOS, or emBOS for example.

AS other says you cannot directly execute your code directly from the SD CARD.
But like those "linux board", you can load the stored kernel/programm into an external SDRAM that can be mapped and execute it from there.
You'll still need to write that "bootloader" and store it in the internal flash.
That'is a lot work to my opinion, for limited application.
If you want to write your application in a linux environnement then port it suck small target, I would rather design my application using dependency injection, or even use an emulator.