Introduction to the ARM Cortex-A7 MMU

I'm trying to understand the ARMv7 MMU of the BCM2836 (Raspberry Pi 2).
As a starting point I have already consulted the ARM ARM and the TRM.
But there is too much information there, and that is confusing.
Does anyone know a good tutorial about the MMU?
In particular, I want to map the lowest 1MB of virtual addresses (0x00000000 - 0x000FFFFF) to the highest 1MB of physical addresses (0xFFF00000 - 0xFFFFFFFF).
Thank you for your help!
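A minimal sketch of the table entry such a mapping needs, assuming the ARMv7-A short-descriptor format with 1MB sections, TTBCR.N == 0, domain 0, and SCTLR.AFE == 0. The names l1_table, section_entry, map_section and map_low_to_high are made up for illustration. The sketch only fills in the first-level table; on real hardware you would still have to map the memory your own code runs from, write the physical address of the table to TTBR0, grant access to domain 0 in DACR, and set SCTLR.M.

```c
#include <assert.h>
#include <stdint.h>

/* First-level table: 4096 entries, each covering 1 MB of virtual space.
   With TTBCR.N == 0 the table must be 16 KB aligned. */
static _Alignas(16384) uint32_t l1_table[4096];

#define SECTION_TYPE   0x2u          /* bits[1:0] = 0b10: section entry */
#define SECTION_AP_RW  (0x3u << 10)  /* AP[1:0] = 0b11: read/write, any privilege
                                        (assumes AP[2] = 0 and SCTLR.AFE = 0) */

/* Build one section descriptor: physical base in bits[31:20], attributes below.
   TEX, C and B are left at 0 here, which selects Strongly-ordered memory. */
static uint32_t section_entry(uint32_t pa, uint32_t attrs)
{
    assert((pa & 0x000FFFFFu) == 0);   /* sections are 1 MB aligned */
    return pa | attrs | SECTION_TYPE;
}

/* Map the 1 MB section containing va onto the 1 MB section at pa. */
static void map_section(uint32_t va, uint32_t pa, uint32_t attrs)
{
    l1_table[va >> 20] = section_entry(pa, attrs);  /* index = VA[31:20] */
}

/* The mapping asked about: VA 0x00000000-0x000FFFFF -> PA 0xFFF00000-0xFFFFFFFF. */
static void map_low_to_high(void)
{
    map_section(0x00000000u, 0xFFF00000u, SECTION_AP_RW);
}
```

After map_low_to_high(), entry 0 of the table holds 0xFFF00C02: the physical base 0xFFF00000, AP = 0b11 and the section type bits. All other entries stay 0, which the MMU treats as translation faults.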

Related

ARM - Memory map leakages

Let's assume that we are using an MCU with an ARM Cortex-M4, 256KB of FLASH and 64KB of RAM. This CPU has a memory map like the one shown below:
As I understand it, the memory map tells us the maximum sizes of the memories, which constrains the MCU vendor, and where the CPU will look for them. For example, we cannot use a Cortex-M4 with more than 512MB of FLASH, right?
In this situation we have 64KB of RAM, and the limit is 512MB. My question is: does the CPU know about that? Does it have any safety mechanism that prevents an access beyond those 64KB of RAM (a stack overflow, for example), such as halting? Or does the CPU work along the lines of "I have these boundaries and I will move around within them if necessary"? I know that compilers may provide some information that can warn the programmer.
As I understand it correctly, the memory map tells us what are the maximum sizes of memories, that limits MCU vendor and where that CPU will look for it.
Yes.
For example, we cannot use Cortex-M4 with FLASH memory above 512MB, right?
Normally the flash is the part between address 0x0 and 0x1FFFFFFF. Meaning 512MB indeed (1024*1024*512=0x20000000). Which is a ridiculously large size for a Cortex M.
My question is - does CPU know about that?
Yes and no. The physical memory will exist where the vendor placed it. To some extent this can be remapped through the linker script.
The Cortex-M does not have an advanced MMU/MPU that supports virtual memory, meaning all memory addresses are physical addresses. It does, however, keep track of various invalid accesses through hardware exceptions. From ARM/Keil AN209 Using Cortex-M3/M4/M7 Fault Exceptions:
Fault exception handlers
Fault exceptions trap illegal memory accesses and illegal program behavior. The following conditions are detected by fault exception handlers:
HardFault: is the default exception and can be triggered because of an error during exception processing, or because an exception cannot be managed by any other exception mechanism.
MemManage: detects memory access violations to regions that are defined in the Memory Management Unit (MPU); for example, code execution from a memory region with read/write access only.
BusFault: detects memory access errors on instruction fetch, data read/write, interrupt vector fetch, and register stacking (save/restore) on interrupt (entry/exit).
UsageFault: detects execution of undefined instructions, unaligned memory access for load/store multiple. When enabled, divide-by-zero and other unaligned memory accesses are detected.
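As a rough illustration of how a fault handler can tell these cases apart, here is a minimal sketch that decodes a value read from the Configurable Fault Status Register (CFSR, at 0xE000ED28 on Cortex-M3/M4/M7). The struct and function names are made up for this example, and the decoder takes the register value as a parameter so it can also be exercised off-target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CFSR packs three status fields into one 32-bit register. */
#define CFSR_MMFSR_MASK 0x000000FFu  /* MemManage fault status, bits [7:0]   */
#define CFSR_BFSR_MASK  0x0000FF00u  /* BusFault status, bits [15:8]         */
#define CFSR_UFSR_MASK  0xFFFF0000u  /* UsageFault status, bits [31:16]      */
#define CFSR_UNALIGNED  (1u << 24)   /* UFSR bit 8: unaligned access trapped */
#define CFSR_DIVBYZERO  (1u << 25)   /* UFSR bit 9: divide by zero trapped   */

/* Hypothetical summary type: one flag per fault class described above. */
struct fault_summary {
    int mem_manage;
    int bus_fault;
    int usage_fault;
    int div_by_zero;
    int unaligned;
};

/* Split a raw CFSR value into per-class flags. */
static void decode_cfsr(uint32_t cfsr, struct fault_summary *out)
{
    assert(out != NULL);
    out->mem_manage  = (cfsr & CFSR_MMFSR_MASK) != 0;
    out->bus_fault   = (cfsr & CFSR_BFSR_MASK) != 0;
    out->usage_fault = (cfsr & CFSR_UFSR_MASK) != 0;
    out->div_by_zero = (cfsr & CFSR_DIVBYZERO) != 0;
    out->unaligned   = (cfsr & CFSR_UNALIGNED) != 0;
}
```

On a target, a HardFault handler could pass in the value read from 0xE000ED28. The divide-by-zero and general unaligned-access traps only fire when they have been enabled in the Configuration and Control Register.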
No the CPU does not know - you specify the memory map in the linker script, and the link will fail if your code and/or data cannot be located in the stated available memory.
If you specify the memory map incorrectly, the linker may locate code/data in non-existent memory and when you load it, parts will be missing. For the flash programming very likely the programming tool will fail if it is set to read-back verify the code.
Also if you dynamically load code to non existent memory, or access memory not allocated by the linker at run-time, the results are non-deterministic, other than it won't do anything useful.
The CPU cannot know, as everyone has said. The MCU vendor buys the processor IP from ARM, buys IP from other vendors, and creates some of its own, if nothing else the glue that holds the modules together. The flash itself is likely from some third party.
Some chip designers wrap around; this is not uncommon in hardware or software. For example, the part may have 16Kbytes starting at 0x08000000. This is the CHIP company's decision. ARM has little to do with it, other than what you have found: that they define wide ranges (likely for caching and other options within their domain). 16K is 16384 bytes or 0x4000, so 14 bits of address. There is likely an address decoder that sees some number of upper bits, 0x08..., and sends that request to the flash logic. At the flash logic it would not surprise me to see only the lower 14 address bits used and the rest ignored, meaning that if you were to address 0x08000000 and 0x08008000 you may get the same 0x0000 offset/address in the flash.
Some engineers may choose to look at those upper bits and declare a fault.
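The two behaviours described above can be sketched in a few lines. This is only a software model of what an address decoder might do, under the assumption of a 16K flash at 0x08000000; the function names are invented for the example, and a real chip may do either, or something else entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Wrap-around decoder: keep only the low address bits, ignore the rest.
   flash_size must be a power of two (0x4000 for 16K, i.e. 14 bits). */
static uint32_t flash_decode_wrap(uint32_t addr, uint32_t flash_size)
{
    assert(flash_size != 0 && (flash_size & (flash_size - 1)) == 0);
    return addr & (flash_size - 1);
}

/* Strict decoder: reject anything outside [base, base + size).
   Returns false where the hardware would raise a fault. */
static bool flash_decode_strict(uint32_t addr, uint32_t base,
                                uint32_t size, uint32_t *offset)
{
    assert(offset != NULL);
    if (addr < base || addr - base >= size) {
        return false;
    }
    *offset = addr - base;
    return true;
}
```

With the wrap-around model, 0x08000000 and 0x08008000 both decode to offset 0x0000. With the strict model, 0x08008000 is rejected because it lies beyond the 16K window.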
You have to examine this on a case-by-case basis: not just per vendor family such as the stm32, but for each family of stm32, basically for every datasheet. (And there is no reason to expect that this level of detail is documented by the chip vendor.)
The ARM Cortex-M, like all processors, is very, very stupid: it does what the bits tell it to do. It is our responsibility to feed it a sequential trail of working instructions, just like laying track in front of a train. You can lay a lot of track in the wrong place, with gaps, etc. If the track is not per the rules of the train, then the train will crash or fail in some way.
The others have mentioned the linker script. To be clear, the linker script does not magically know what chip you have. Ultimately you, the programmer, are responsible for telling the toolchain to build programs that follow the rules of the CPU AND the CHIP. That means the right architecture instructions, or a subset of them (Cortex-M0 instructions, ARMv6-M, will run on a Cortex-M4, ARMv7-M). And the linker script needs to define addresses for the read-only and read/write areas that match the chip (not the core, the chip, as the chip vendor is in charge of that definition). Then, barring 100 other ways you can fail, it will run.
You are ultimately responsible, but most folks grab an SDK or sandbox of some sort and hope for the best, with blind faith in others. The GNU and LLVM tools can be used by you directly without these third parties, but then you are fully responsible for getting everything right.

General purpose registers of ARM microcontroller Doubts

I am a newcomer to the field of ARM microcontrollers. While studying them I came across the following doubt:
Are the general purpose registers of an ARM core part of its SRAM or not?
No, they are not. They are registers within the processor itself (which may be implemented in SRAM), but they have no addresses within the memory map.

Working with GPIO on bcm2836

I am writing a GPIO driver for my RPi2 OS. I have been searching for a long time, but I have only found Linux-specific material. How should I implement functions such as
void gpio_set(int pin);
void gpio_clr(int pin);
in C for the driver? Or can it perhaps be done using inline assembly?
As explained here
The underlying architecture in BCM2836 is identical to BCM2835. The only significant difference is the removal of the ARM1176JZF-S processor and replacement with a quad-core Cortex-A7 cluster.
The available documentation for the BCM2836 does not detail the peripheral hardware, only the A7. Instead you need the documentation for the BCM2835. Section 6 of the peripheral specification deals with the GPIO. The registers are memory mapped, so you can write to them directly in C.
It is very simple to implement in C. Keep in mind that the peripheral base address on the RPi2 is 0x3F000000 instead of 0x20000000 (RPi). The available documentation is for the RPi (BCM2835), but it applies to the RPi2 as well, apart from some memory address changes and the processor change (Cortex-A7). For a quick start you can see valver's blog on bare-metal development.
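A minimal sketch of such functions, assuming the GPIO block sits at offset 0x200000 from the peripheral base (so 0x3F200000 on the RPi2) and using the GPFSEL, GPSET and GPCLR register offsets from section 6 of the BCM2835 peripheral document. Unlike the signatures in the question, these versions take the register base as a parameter; that is a choice made here so the same code can be tried against a plain array before running it on hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed RPi2 GPIO base: peripheral base 0x3F000000 + GPIO offset 0x200000. */
#define RPI2_GPIO_BASE 0x3F200000u

/* Register indices in 32-bit words (byte offset / 4). */
#define GPFSEL0 (0x00 / 4)  /* function select, 10 pins per register, 3 bits each */
#define GPSET0  (0x1C / 4)  /* write 1 to drive a pin high, 32 pins per register  */
#define GPCLR0  (0x28 / 4)  /* write 1 to drive a pin low, 32 pins per register   */

/* Configure a pin as an output (function select value 0b001).
   On bare metal without a C library, replace assert with your own check. */
static void gpio_make_output(volatile uint32_t *gpio, int pin)
{
    assert(pin >= 0 && pin <= 53);
    int reg = GPFSEL0 + pin / 10;
    int shift = (pin % 10) * 3;
    uint32_t v = gpio[reg];
    v &= ~(7u << shift);   /* clear the 3-bit field for this pin */
    v |= 1u << shift;      /* 0b001 = output */
    gpio[reg] = v;
}

/* Drive a pin high. Only bits written as 1 have an effect, so no
   read-modify-write is needed. */
static void gpio_set(volatile uint32_t *gpio, int pin)
{
    assert(pin >= 0 && pin <= 53);
    gpio[GPSET0 + pin / 32] = 1u << (pin % 32);
}

/* Drive a pin low. */
static void gpio_clr(volatile uint32_t *gpio, int pin)
{
    assert(pin >= 0 && pin <= 53);
    gpio[GPCLR0 + pin / 32] = 1u << (pin % 32);
}
```

On the target, and before the MMU remaps anything, the base would be passed as (volatile uint32_t *)RPI2_GPIO_BASE, for example gpio_make_output(base, 47) followed by gpio_set(base, 47). No inline assembly is required.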

In ARMv7, is the address used in TTBR0 and TTBR1 physical or virtual

I've been looking in the ARM Architecture Reference Manual for v7-A and v7-R in Section B3 and I can't figure out if the address used in the TTBR0 and TTBR1 registers is supposed to be a virtual or physical address.
Physical would make the most sense, but I'd like to know definitively.
So, is this address supposed to be physical or virtual?
Is it required to keep the page table location mapped as an identity address (PA == VA)?
Imagine it were a virtual address...
The CPU issues a transaction to a virtual address. In order to translate it, the MMU needs to do a table walk. For that it needs to know what bit of RAM to address on the bus, so it looks in the base register. Great, now it has the virtual base address, it just needs to translate that to a physical address to know what bit of RAM to address on the bus, so it needs to do a table walk. For that it needs... etc. etc.
In short, yes, they're definitely physical addresses. The fact that TTBRn are 64-bit on LPAE implementations is also a bit of a clue.*
Once the page tables are set up and the MMU is on, it's not required to keep them mapped at all, let alone in any particular relationship - if the data's physically there in RAM, the MMU is quite happy. The CPU only needs to map that RAM into its address space if it's updating the tables - the rest of the time they'd just be a waste of address space.
* ...and this is of course a complete lie when the Virtualisation Extensions are involved ;) In that case, they're intermediate physical addresses, and entirely subject to the whims of stage 2 translation. For which the above applies. Fun.
As per the code, Physical address of pgd is written to TTBR.
http://lxr.free-electrons.com/source/arch/arm/include/asm/proc-fns.h#L116
#define cpu_switch_mm(pgd,mm) cpu_do_switch_mm(virt_to_phys(pgd),mm)
Physical. In this regard it is unchanged from the ARMv5.

ARM bare-metal with MMU: write to non-cachable,non-bufferable mapped area fail

I am using an ARM Cortex-A9 CPU with 2 cores. I only use 1 core; the other just sits in a busy loop. I set up the MMU table using sections (1MB per entry) like this:
0x00000000-0x14ffffff => 0x00000000-0x14ffffff (non-cachable, non-bufferable)
0x15000000-0x24ffffff => 0x15000000-0x24ffffff (cachable, bufferable)
0x25000000-0x94ffffff => 0x25000000-0x94ffffff (non-cachable, non-bufferable)
0x15000000-0x24ffffff => 0x95000000-0xa4ffffff (non-cachable, non-bufferable)
0xa5000000-0xffffffff => 0xa5000000-0xffffffff (non-cachable, non-bufferable)
It is rather simple. I just want a mirror of 256MB of memory for non-cachable access. However, when I do several writes to the non-cachable memory section at 0x95000000-0xa4ffffff, I find that the writes are not actually performed until I explicitly flush the cache.
Am I doing something wrong, or is this kind of mapping not valid? If the latter is the case, I don't understand how Linux's ioremap works on ARM. It would be good if anyone could give me some explanation. Thanks very much.
First of all: the Cortex-A9 is an ARMv7-A processor. The terms non-cacheable/non-bufferable/cacheable/bufferable are no longer correct descriptions of the mappings.
The actual mapping type is determined by TEX[2:0], C and B bits.
So I am actually having to guess a bit here as to what your mappings actually are.
And my guess is that you have the majority of your mappings set as Strongly-ordered, and the mirrored region as Normal Write-Back cacheable.
Having multiple virtual mappings with different memory types pointing to the same physical location is generally not a good idea in the ARM architecture. It used to be explicitly banned, but the latest version of the ARMv7-AR Architecture reference manual (DDI 0406C.b) has a (fairly long) section dedicated to the implications of "Mismatched memory attributes".
I would recommend finding a different way of achieving your goal.
Simply changing the mapping of the uncached regions to Normal Non-cacheable would be a good start. There is no valid reason for using Strongly-ordered mappings for RAM.
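To make the memory types concrete, here is a small sketch of how TEX[2:0], C and B are placed in a first-level section descriptor, assuming the short-descriptor format with TEX remap disabled (SCTLR.TRE == 0). The helper and macro names are invented for the example; the encodings listed are the standard ARMv7-A ones for that configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Place TEX[2:0], C and B into their section descriptor positions:
   TEX in bits [14:12], C in bit 3, B in bit 2. */
static uint32_t section_mem_attr(uint32_t tex, uint32_t c, uint32_t b)
{
    assert(tex <= 7u && c <= 1u && b <= 1u);
    return (tex << 12) | (c << 3) | (b << 2);
}

/* Common memory types, valid only while TEX remap is disabled. */
#define ATTR_STRONGLY_ORDERED   section_mem_attr(0, 0, 0)
#define ATTR_DEVICE_SHAREABLE   section_mem_attr(0, 0, 1)
#define ATTR_NORMAL_NONCACHED   section_mem_attr(1, 0, 0)
#define ATTR_NORMAL_WRITEBACK   section_mem_attr(1, 1, 1)  /* write-back, write-allocate */
```

Under this encoding, "non-cachable, non-bufferable" (C = 0, B = 0 with TEX = 0) is Strongly-ordered, while the Normal Non-cacheable type suggested above needs TEX = 0b001, which sets bit 12 of the descriptor.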
