I am new to the ARM architecture. I want to know if the process memory layout of ARM Linux(Commonly used version) is any different from the x86 Linux. Does the rules of .text, .bss, .data, .stack, .heap are any different in ARM. Does it support virtual memory and paging, if yes from where the swap space is used as flash memory is read-only i guess. ARM Linux kernel has MMU enabled. Appreciate any help.
Related
I would like to understand how the MMIO works on ARM architecture.
I realized that ARM provides 1:1 mapping from physical address to specific peripheral.
For example, to manage the GPIOX on arm, for example in Raspberry Pi, the processor accesses the specific physical addresses (seems that preconfigured by the manufacturer?) without configuring some registers beforehand.
I thought that there are some specific BAR register that arbitrates the read/write request to specific physical addresses to peripherals. However, when I check the spec for BCM2835 (Raspberry pi 3), the physical addresses are translated by another MMU called VC/ARM MMU. Is it a common design to have another MMU that translates the physical addresses to bus addresses in ARM architecture?
Also, I was wondering how the SMMU (IOMMU in x86) is utilized in this concept. I found one article mentioning that the VC/ARM MMU is an example of SMMU but I think that is not true? When I check some monitor code & kernel driver code implementation (not raspberry pi), it seems that the SMMU is also mapped to specific physical addresses and the monitor/kernel uses those addresses to initialize and communicate with SMMU. If the arm architecture utilize VC/ARM MMU as an SMMU, how that physical address mapping for SMMU itself can be accessed to initialize the SMMU..?
Lastly, I thought that all peripherals are managed by the SMMU. If some peripherals are always mapped to fixed physical addresses, what is the role of SMMU? Why some peripherals are communicated with PE (CPU) through fixed physical addresses, and some are communicated with SMMU..? How exactly the peripherals and PE (Processors) can communicate in ARM architecture..?
I am just answering your questions. There are many things wrong with some assumptions you have, which Old Timer tries to clarify.
Is it a common design to have another MMU that translates the physical addresses to bus addresses in ARM architecture?
No. Broadcom has an architecture license. They designed there own HDL code for the ARM CPU and system details can be different. In fact, it is quite common for even direct license (using ARM HDL) that the systems differ from vendor to vendor.
If the arm architecture utilize VC/ARM MMU as an SMMU, how that physical address mapping for SMMU itself can be accessed to initialize the SMMU?
If some peripherals are always mapped to fixed physical addresses, what is the role of SMMU? Why some peripherals are communicated with PE (CPU) through fixed physical addresses, and some are communicated with SMMU..? How exactly the peripherals and PE (Processors) can communicate in ARM architecture..?
The typical solution to this is 'TrustZone'. Here, the access is defined as having a 'secure' or 'normal' access. The master (CPU) tags the access and either the peripheral or a bus access controller permits or prohibits access to the peripheral. These are best statically mapped at boot time and locked.
By defining permitted use cases, the peripheral/master access patterns can be defined for the system. The complication is 'dynamic' peripherals and masters. The ARM TrustZone CPU is dynamic. A 'world switch' changes the CPU access and there are attacks on the communication interface between 'normal' and 'secure' worlds.
The ARM AXI/AHB buses were originally designed for embedded devices. PCs in contrast have dynamic buses ISA->EISA->PCMIA->PCI (etc.) These addresses are typically dynamic. Note However, this bus structure has an expense. So some of your question is like asking why isn't ARM just like an x86 PC. They had different goals and they are different. You can't put a new grahpics card into your cell phone.
Reference:
Handling ARM Trustzones
Explanation of arm Bus architecture
ARM Differences from PC
Note: It is from the modular nature of the PC system which was part of why the PC dominated Apple, Commodore, Atari, etc. in my opinion (the other aspect was piracy), contrary to others who think the answer is Microsoft. The hardware matters.
I am trying to understand the differences in the C/C++ compilation process (Compiler/linker/locator etc..) between a microcontroller and a microprocessor.
For example, for a microcontroller, we can provide the linker script to specify the actual physical memory location the program should be executed. However, in a microprocessor where there are multiple programs running, we are unable to provide the actual addresses to load the program.
I would like to know how this compilation handles in a microprocessor and a microcontroller.
Thanks a lot!
I am trying to understand the differences in the C/C++ compilation process (Compiler/linker/locator etc..) between a microcontroller and a microprocessor.
None by itself.
The difference might be that on a microcontroller you typically don't have an operating system that supports any run-time loading of shared libraries, but that's not necessarily the case (NuttX and others).
For example, for a microcontroller, we can provide the linker script to specify the actual physical memory location the program should be executed.
You can do the same with a microprocessor.
You're trying to make a distinction where there is none: a microcontroller is just a microprocessor with an embedded target market and typically, integrated memory in-package. That's it.
However, in a microprocessor where there are multiple programs running,
this can (and doesn't have to) be the case on both microcontrollers and microprocessors.
You mean "on microprocessors, we typically use a multitasking operating system. Can we, on such an operating system..."
we are unable to provide the actual addresses to load the program.
That's not true. Often, such operating systems offer address space randomization and you can compile relocateable code – but the same can (and is!) done for microcontrollers.
The terms microcontroller and microprocessor have no reliable definition, they are used rather randomly and intermixed by different engineers and manufacturers. The consensus is that microcontrollers are simpler, have less resources and are meant more for "real-time-y" embedded tasks. Microprocessor are slightly more complex, have more resources and are meant more for "general" tasks. Various terms like MMU, embedded Flash/RAM, external Flash/RAM are thrown around. If this sounds vague - it is. Don't rely on those terms.
You need to look at specific features on a micro which enable your abilities as a software engineer. The most basic one is MMU - this defines if you can have virtual memory or not. This in turn defines whether it supports an OS which runs processes in separate memory regions or it's all one big continuous pile of memory with addressing hardwired (in which case you still get an OS but it has much less to do). Linking depends a lot on that distinction.
Then a system which runs processes in isolated memory regions typically needs to load the process code into RAM before executing it, which requires (much) more RAM, which is typically solved by an external RAM chip, which requires an MMU.
But the classic definition of a microcontroller is: no MMU, embedded Flash and/or RAM. Classic microprocessor is: MMU, external storage and RAM. There are more exceptions than rules.
As I know, CPU can access RAM directly. Device RAM is empty on start and CPU don't know from where to load bootloader into the RAM for executing it. Even it can do nothing because call stack should be empty too as I think.
Yet how is bootloader program copied into the RAM for further execution?
This should happening with embedded devices such as smartphones. On a x86 PCs BIOS is responsible for loading MBR section from disk to RAM as I know.
A bootloader in RAM is a secondary bootloader; invariably there is code in a ROM of some kind containing a primary bootstrap that loads the secondary bootstrap. Often that ROM is mask-ROM on the chip it self.
Typically on an ARM application processor such as a Cortex-A the primary bootstrap will load code to RAM from NAND flash or SD card. ARM Cortex-M often run code directly from ROM in any case.
you have the same problem with an x86. and the same solution in general (as with any other processor as well), you put a rom/flash in the address space where the processor boots and/or where its vector table is.
there are some other solutions like having other logic that reads from some non volatile storage and places it in the boot/vector space, or other logic that provides an interface for some other processor/computer to download into the board/chip and then release reset.
I was reading this informative blog post here about the booting process on x64. In what ways is the booting process on ARM different? I had a look at Raspberry pi and it seems that the GPU is what executes before control is handed over to ARM processor. Are there any similar resources you have come across for ARM processors?
Just like the x86 boot process is documented in documents available at intel.com arm boot process is documented at arm.com as is any other processor documented.
The full sized (non-cortex-m) arm cores start by executing at address zero for reset. There is one instruction location for reset, one for data abort, undefined instruction, etc. Similar to an interrupt vector table but instead of an address there is an instruction there ideally a branch.
processors historically have some non-volatile ram mapped into the boot space or vector table or whatever, then volatile ram if any elsewhere. x86 historically near the top of ram, arm at the bottom.
ARM does not make chips like intel, it designs processor cores which other folks that make chips include in their chip designs instead of having to design their own cores and maintain compilers, etc. So the chip vendor can solve the boot process in a number of ways, some have something non-volatile mapped low then after booting you can swap in ram to that address space, whatever. In the case of the Raspberry Pi chips which are made by broadcom, they have their own gpu which actually boots the chip, it eventually reads a file assumed to be the linux kernel and root file system, but doesnt have to be. It places that file in ram (by default) in the place where a linux kernel would be loaded by a bootloader like redboot or uboot, in this case by the gpu's arm loader. the gpu then places a few breadcrumbs including the reset instruction(s) required to branch into the linux kernel (generally a very trivial thing), then release reset on the arm core allowing it to boot. so basically the arm sees only ram which is kind of nice, but that is somewhat atypical for arm or other processors to do that. Normally the main processor boots from non-volatile storage (eeprom, flash, etc) and then itself loads linux or whatever and branches to it.
A number of other processor types will have an interrupt vector table, including the arm cortex-m series which are thumb instruction set only cores. they are designed to be microcontrollers and not carry as much overhead, so the first address slot is actually meant to be filled with the init value for the stack pointer, then the second is the address for reset, and then a zillion others. The hardware is designed to preserve registers for you so that you can have the address to C functions right in the table and not have to have "some assembly required" wrappers written by you or the folks that ported the toolchain to this platform. Other processor types will just have a vector table and some assembly is required to be wrapped around interrupts and such.
In ARM microprocessors, is the only available memory space the 37 or so general and status registers, or is there a separate accessible memory space within the microprocessor chip?
For example, in the Atmel AVR microcontroller, to my understanding, the memory is mapped internally within the same chip, with data memory, program memory (containing program memory) and EEPROM memory. Does the same apply to ARM microprocessors, or does a microcontroller with an ARM microprocessor require separate external memory?
Your interpretation of the Atmel AVR architecture is not quite correct.
Of course it's possible to integrate memory of virtually any kind on the same die as the CPU core. However, that doesn't mean you can compare flash memory available on one such integrated system to registers on another.
A CPU core needs a memory interface and that's all that counts: Flash is slower than registers. So if you connect Flash to an ARM processor it will behave similar (in the same order of magniture regarding speed) as the on-board Flash of the AVR.
Besides, ARM is solely an IP (design concept) and licenced by numerous companies which build efficient peripherals and sometimes also memory around the core. So you will find chips with an ARM core and on-board memory on the market.
(I simplified things a bit in the above description but I was focusing on trying to point out where I think you misunderstand how the two processors compare.)
Below link talks a lot about how memory management is done in ARM processor. Hope it helps
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0471c/CHDDJIFI.html