How does QEMU's pcie_host convert a physical address to a PCIe address?

I am learning QEMU's implementation and have a question. On real hardware, when the CPU accesses an address that belongs to a PCI device, the PCI host bridge is responsible for converting it into a PCI address. QEMU provides pcie_host.c to emulate a PCIe host. In this file, pcie_mmcfg_data_write is implemented, but there is nothing about converting a physical address to a PCI address.
I did a test in QEMU using gdb:
First, I added the edu device, which is a very simple PCI device, to QEMU.
When I enable Memory Space (Mem- to Mem+) with setpci -s 00:02.0 04.b=2, QEMU stops in the function pcie_mmcfg_data_write.
static void pcie_mmcfg_data_write(void *opaque, hwaddr mmcfg_addr,
                                  uint64_t val, unsigned len)
{
    PCIExpressHost *e = opaque;
    PCIBus *s = e->pci.bus;
    PCIDevice *pci_dev = pcie_dev_find_by_mmcfg_addr(s, mmcfg_addr);
    uint32_t addr;
    uint32_t limit;

    if (!pci_dev) {
        return;
    }
    addr = PCIE_MMCFG_CONFOFFSET(mmcfg_addr);
    limit = pci_config_size(pci_dev);
    pci_host_config_write_common(pci_dev, addr, limit, val, len);
}
It is clear that the PCIe host uses this function to find the device and perform the config-space write.
Using bt I get:
#0 pcie_mmcfg_data_write
(opaque=0xaaaaac573f10, mmcfg_addr=65540, val=2, len=1)
at hw/pci/pcie_host.c:39
#1 0x0000aaaaaae4e8a8 in memory_region_write_accessor
(mr=0xaaaaac574520, addr=65540, value=0xffffe14703e8, size=1, shift=0, mask=255, attrs=...)
at /home/mrzleo/Desktop/qemu/memory.c:483
#2 0x0000aaaaaae4eb14 in access_with_adjusted_size
(addr=65540, value=0xffffe14703e8, size=1, access_size_min=1, access_size_max=4, access_fn=
0xaaaaaae4e7c0 <memory_region_write_accessor>, mr=0xaaaaac574520, attrs=...) at /home/mrzleo/Desktop/qemu/memory.c:544
#3 0x0000aaaaaae51898 in memory_region_dispatch_write
(mr=0xaaaaac574520, addr=65540, data=2, op=MO_8, attrs=...)
at /home/mrzleo/Desktop/qemu/memory.c:1465
#4 0x0000aaaaaae72410 in io_writex
(env=0xaaaaac6924e0, iotlbentry=0xffff000e9b00, mmu_idx=2, val=2,
addr=18446603336758132740, retaddr=281473269319356, op=MO_8)
at /home/mrzleo/Desktop/qemu/accel/tcg/cputlb.c:1084
#5 0x0000aaaaaae74854 in store_helper
(env=0xaaaaac6924e0, addr=18446603336758132740, val=2, oi=2, retaddr=281473269319356, op=MO_8)
at /home/mrzleo/Desktop/qemu/accel/tcg/cputlb.c:1954
#6 0x0000aaaaaae74d78 in helper_ret_stb_mmu
(env=0xaaaaac6924e0, addr=18446603336758132740, val=2 '\002', oi=2, retaddr=281473269319356)
at /home/mrzleo/Desktop/qemu/accel/tcg/cputlb.c:2056
#7 0x0000ffff9a3b47cc in code_gen_buffer ()
#8 0x0000aaaaaae8d484 in cpu_tb_exec
(cpu=0xaaaaac688c00, itb=0xffff945691c0 <code_gen_buffer+5673332>)
at /home/mrzleo/Desktop/qemu/accel/tcg/cpu-exec.c:172
#9 0x0000aaaaaae8e4ec in cpu_loop_exec_tb
(cpu=0xaaaaac688c00, tb=0xffff945691c0 <code_gen_buffer+5673332>,
last_tb=0xffffe1470b78, tb_exit=0xffffe1470b70)
at /home/mrzleo/Desktop/qemu/accel/tcg/cpu-exec.c:619
#10 0x0000aaaaaae8e830 in cpu_exec (cpu=0xaaaaac688c00)
at /home/mrzleo/Desktop/qemu/accel/tcg/cpu-exec.c:732
#11 0x0000aaaaaae3d43c in tcg_cpu_exec (cpu=0xaaaaac688c00)
at /home/mrzleo/Desktop/qemu/cpus.c:1405
#12 0x0000aaaaaae3dd4c in qemu_tcg_cpu_thread_fn (arg=0xaaaaac688c00)
at /home/mrzleo/Desktop/qemu/cpus.c:1713
#13 0x0000aaaaab722c70 in qemu_thread_start (args=0xaaaaac715be0)
at util/qemu-thread-posix.c:519
#14 0x0000fffff5af84fc in start_thread (arg=0xffffffffe3ff)
at pthread_create.c:477
#15 0x0000fffff5a5167c in thread_start ()
at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
Then I tried to access the edu device's address with devmem 0x10000000.
QEMU stops in edu_mmio_read. Using bt:
(gdb) bt
#0 edu_mmio_read
(opaque=0xaaaaae71c560, addr=0, size=4)
at hw/misc/edu.c:187
#1 0x0000aaaaaae4e5b4 in memory_region_read_accessor
(mr=0xaaaaae71ce50, addr=0, value=0xffffe2472438, size=4, shift=0, mask=4294967295, attrs=...)
at /home/mrzleo/Desktop/qemu/memory.c:434
#2 0x0000aaaaaae4eb14 in access_with_adjusted_size
(addr=0, value=0xffffe2472438, size=4, access_size_min=4, access_size_max=8, access_fn=
0xaaaaaae4e570 <memory_region_read_accessor>, mr=0xaaaaae71ce50, attrs=...)
at /home/mrzleo/Desktop/qemu/memory.c:544
#3 0x0000aaaaaae51524 in memory_region_dispatch_read1
(mr=0xaaaaae71ce50, addr=0, pval=0xffffe2472438, size=4, attrs=...)
at /home/mrzleo/Desktop/qemu/memory.c:1385
#4 0x0000aaaaaae51600 in memory_region_dispatch_read
(mr=0xaaaaae71ce50, addr=0, pval=0xffffe2472438, op=MO_32, attrs=...)
at /home/mrzleo/Desktop/qemu/memory.c:1413
#5 0x0000aaaaaae72218 in io_readx
(env=0xaaaaac6be0f0, iotlbentry=0xffff04282ec0, mmu_idx=0,
addr=281472901758976, retaddr=281473196263360, access_type=MMU_DATA_LOAD, op=MO_32)
at /home/mrzleo/Desktop/qemu/accel/tcg/cputlb.c:1045
#6 0x0000aaaaaae738b0 in load_helper
(env=0xaaaaac6be0f0, addr=281472901758976, oi=32, retaddr=281473196263360,
op=MO_32, code_read=false, full_load=0xaaaaaae73c68 <full_le_ldul_mmu>)
at /home/mrzleo/Desktop/qemu/accel/tcg/cputlb.c:1566
#7 0x0000aaaaaae73ca4 in full_le_ldul_mmu
(env=0xaaaaac6be0f0, addr=281472901758976, oi=32, retaddr=281473196263360)
at /home/mrzleo/Desktop/qemu/accel/tcg/cputlb.c:1662
#8 0x0000aaaaaae73cd8 in helper_le_ldul_mmu
(env=0xaaaaac6be0f0, addr=281472901758976, oi=32, retaddr=281473196263360)
at /home/mrzleo/Desktop/qemu/accel/tcg/cputlb.c:1669
#9 0x0000ffff95e08824 in code_gen_buffer
()
#10 0x0000aaaaaae8d484 in cpu_tb_exec
(cpu=0xaaaaac6b4810, itb=0xffff95e086c0 <code_gen_buffer+31491700>)
at /home/mrzleo/Desktop/qemu/accel/tcg/cpu-exec.c:172
#11 0x0000aaaaaae8e4ec in cpu_loop_exec_tb
(cpu=0xaaaaac6b4810, tb=0xffff95e086c0 <code_gen_buffer+31491700>,
last_tb=0xffffe2472b78, tb_exit=0xffffe2472b70)
at /home/mrzleo/Desktop/qemu/accel/tcg/cpu-exec.c:619
#12 0x0000aaaaaae8e830 in cpu_exec
(cpu=0xaaaaac6b4810) at /home/mrzleo/Desktop/qemu/accel/tcg/cpu-exec.c:732
#13 0x0000aaaaaae3d43c in tcg_cpu_exec
(cpu=0xaaaaac6b4810) at /home/mrzleo/Desktop/qemu/cpus.c:1405
#14 0x0000aaaaaae3dd4c in qemu_tcg_cpu_thread_fn
(arg=0xaaaaac6b4810)
at /home/mrzleo/Desktop/qemu/cpus.c:1713
#15 0x0000aaaaab722c70 in qemu_thread_start (args=0xaaaaac541610) at util/qemu-thread-posix.c:519
#16 0x0000fffff5af84fc in start_thread (arg=0xffffffffe36f) at pthread_create.c:477
#17 0x0000fffff5a5167c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
It seems that QEMU locates the edu device directly, and the PCIe host does nothing in this procedure. I wonder whether QEMU simply does not implement the conversion here and instead uses MemoryRegions to achieve polymorphism. If not, what does QEMU's PCIe host do in this procedure?

QEMU uses a set of data structures called MemoryRegions to model the address space that a CPU sees (the detailed API is documented in part in the developer docs).
MemoryRegions can be built up into a tree, where at the "root" there is one 'container' MR which covers the whole 64-bit address space the guest CPU can see, and then MRs for blocks of RAM, devices, etc are placed into that root MR at appropriate offsets. Child MRs can also be containers which in turn contain further MRs. You can then find the MR corresponding to a given guest physical address by walking through the tree of MRs.
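As a rough illustration of how that tree is built up, here is a minimal sketch using the public memory_region_* API (the device name, sizes, addresses and the MyDevState type are all made up for this example; the API calls themselves are the ones declared in exec/memory.h):

#include "qemu/osdep.h"
#include "qemu/units.h"
#include "qapi/error.h"
#include "exec/memory.h"

typedef struct MyDevState {
    MemoryRegion ram;
    MemoryRegion mmio;
} MyDevState;

static uint64_t mydev_read(void *opaque, hwaddr addr, unsigned size)
{
    return 0;                       /* a real device would decode addr here */
}

static void mydev_write(void *opaque, hwaddr addr, uint64_t val, unsigned size)
{
    /* a real device would update its state here */
}

static const MemoryRegionOps mydev_ops = {
    .read = mydev_read,
    .write = mydev_write,
    .endianness = DEVICE_LITTLE_ENDIAN,
};

/* Place RAM and one MMIO region into the root (system memory) container. */
static void board_map_example(MemoryRegion *system_memory, MyDevState *s)
{
    memory_region_init_ram(&s->ram, NULL, "board.ram", 128 * MiB, &error_fatal);
    memory_region_add_subregion(system_memory, 0x00000000, &s->ram);

    memory_region_init_io(&s->mmio, NULL, &mydev_ops, s, "mydev-mmio", 0x1000);
    memory_region_add_subregion(system_memory, 0x10000000, &s->mmio);
}

Once regions are added like this, any guest physical access that lands in 0x10000000..0x10000fff is routed to mydev_read/mydev_write.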
The tree of MemoryRegions is largely built up statically when QEMU starts (because most devices don't move around), but it can also be changed dynamically in response to guest software actions. In particular, PCI works this way. When the guest OS writes to a PCI device BAR (which is in PCI config space) this causes QEMU's PCI host controller emulation code to place the MR corresponding to the device's registers into the MemoryRegion hierarchy at the correct place and offset (depending on what address the guest wrote to the BAR, ie where it asked for it to be mapped). Once this is done, the MR for the PCI device is like any other in the tree, and the PCI host controller code doesn't need to be involved in guest accesses to it.
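For a concrete example, this is roughly what the edu device's realize function in hw/misc/edu.c does (paraphrased and abbreviated; check the source for the full version): it creates the MMIO MemoryRegion for its registers and hands it to the PCI core as BAR 0, and the PCI host/core code later maps that region wherever the guest programs the BAR.

static void pci_edu_realize(PCIDevice *pdev, Error **errp)
{
    EduState *edu = EDU(pdev);

    /* The region whose read callback is edu_mmio_read in your backtrace. */
    memory_region_init_io(&edu->mmio, OBJECT(edu), &edu_mmio_ops, edu,
                          "edu-mmio", 1 * MiB);

    /* Hand the region to the PCI core as BAR 0.  The device itself never
     * maps it; the generic PCI code adds/removes it from the MemoryRegion
     * tree in response to the guest's config-space writes to the BAR and
     * to the Memory Space Enable bit you toggled with setpci. */
    pci_register_bar(pdev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY, &edu->mmio);
}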
As a performance optimisation, QEMU doesn't actually walk down a tree of MRs for every access. Instead, we first "flatten" the tree into a data structure (a FlatView) that directly says "for this range of addresses, it will be this MR; for this range; this MR", and so on. Secondly, QEMU's TLB structure can directly cache mappings from "guest virtual address" to "specific memory region". On first access it will do an emulated guest MMU page table walk to get from the guest virtual address to the guest physical address, and then it will look that physical address up in the FlatView to find either the real host RAM or the MemoryRegion that is mapped there, and it will add the "guest VA -> this MR" mapping to the TLB cache. Future accesses will hit in the TLB and need not repeat the work of converting to a physaddr and then finding the MR in the flatmap. This is what is happening in your backtrace -- the io_readx() function is passed the guest virtual address and also the relevant part of the TLB data structure, and it can then directly find the target MR and the offset within it, so it can call memory_region_dispatch_read() to dispatch the read request to that MR's read callback function. (If this was the first access, the initial "MMU walk + FlatView lookup" work will have just been done in load_helper() before it calls io_readx().)
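Conceptually (this is not QEMU's actual FlatView or TLB code, just a sketch of the idea), the flattened view behaves like an ordered list of ranges, and resolving a guest physical address is a simple lookup rather than a tree walk:

/* Conceptual sketch only -- QEMU's real FlatView/TLB machinery is
 * considerably more involved. */
typedef struct FlatRange {
    uint64_t start;
    uint64_t size;
    void    *mr;               /* the MemoryRegion mapped over this range */
} FlatRange;

static void *lookup_mr(const FlatRange *view, size_t n,
                       uint64_t physaddr, uint64_t *offset_in_mr)
{
    for (size_t i = 0; i < n; i++) {
        if (physaddr >= view[i].start &&
            physaddr - view[i].start < view[i].size) {
            *offset_in_mr = physaddr - view[i].start;
            return view[i].mr; /* this result is what gets cached in the TLB */
        }
    }
    return NULL;               /* unassigned address */
}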
Obviously, all this caching also implies that QEMU tracks events which mean the cached data is no longer valid so we can throw it away (eg if the guest writes to the BAR again to unmap it or to map it somewhere else; or if the MMU settings or page tables are changed to alter the guest virtual-to-physical mapping).

Related

Hard fault RP2040 pico Zephyr

I'm using the RP2040 under Zephyr with MCUboot. The end goal is to be able to update the firmware using MCUMGR over a UART bus. MCUboot uses an A/B (dual-slot) method to provide a safe update algorithm: when the device reboots, MCUboot checks whether new firmware is available and, if so, boots into it. To do this, a swap algorithm places the target firmware in slot 0. Since this algorithm manipulates flash, some functions have to be mapped into SRAM to make sure a function doesn't erase its own code. Normally, code is executed directly from flash thanks to Direct-XIP on the RP2040.
The problem is that SRAM seems not to be executable. When the program enters the function located in SRAM and executes the very first instruction, it causes a hard fault:
0x2000c144 <flash_range_erase>: push {r4, r5, r6, r7, lr}
Fortunately, Zephyr's crash handler gives some information:
E: ***** HARD FAULT *****
E: r0/a1: 0x0003d000 r1/a2: 0x00002000 r2/a3: 0x00002000
E: r3/a4: 0x00000000 r12/ip: 0x2000c145 r14/lr: 0x100022e5
E: xpsr: 0x21000000
E: Faulting instruction address (r15/pc): 0x2000c144
E: >>> ZEPHYR FATAL ERROR 0: CPU exception on CPU 0
E: Current thread: 0x2000c3d0 (unknown)
E: Halting system
Everything seems normal and the address in the PC is correct. I strongly suspect an MPU misconfiguration that crashes the program when it executes code located in SRAM.
My questions are:
Can the MPU cause a hard fault? How can I configure SRAM in Zephyr so that code can execute from SRAM?
First, I checked whether the same function is executable from flash: I removed the macro that keeps it out of flash.
Before:
void __no_inline_not_in_flash_func(flash_range_erase)(uint32_t flash_offs, size_t count) {
...
After:
void flash_range_erase(uint32_t flash_offs, size_t count) {
...
And ... it works! The function executes as expected. I'm now quite sure the MPU is refusing to let me execute code in SRAM.
I searched for information on how to configure the MPU to let me execute code in SRAM and found this page: https://developer.nordicsemi.com/nRF_Connect_SDK/doc/latest/zephyr/hardware/arch/arm_cortex_m.html
It explains how to configure fixed regions. I added the following lines to my device tree overlay:
&sram0 {
    /delete-property/ compatible;
    /delete-property/ reg;
    compatible = "zephyr,memory-region", "mmio-sram";
    zephyr,memory-region = "RAM_EXECUTABLE";
    zephyr,memory-region-mpu = "RAM";
    reg = <0x20000000 0x10000>; // Configure SRAM for MCUboot, fixed for the MPU.
                                // RAM size has to match BOOTLOADER_SRAM_SIZE (see menuconfig).
};
But this didn't solve the problem.
It was an issue in the RP2040 flash controller driver in Zephyr.
The flash controller has to disable XiP to run flash operations (read/write). During this process the controller tried to run a function that was not linked into RAM (it was in flash).
In short, the controller tried to call a function that was still in flash after disabling execution from flash.
I'll probably post a patch soon.
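To illustrate the shape of the fix (a sketch only; the helper name and body are hypothetical, and whether you use the pico-sdk macro from the question or Zephyr's __ramfunc attribute depends on the driver): any helper called while XiP is disabled has to be linked into RAM, for example:

/* Hypothetical helper: anything executed while XiP is off must live in RAM.
 * __ramfunc is Zephyr's attribute for placing a function in RAM (it needs
 * CONFIG_ARCH_HAS_RAMFUNC_SUPPORT); the pico-sdk equivalent is the
 * __no_inline_not_in_flash_func() wrapper shown earlier. */
#include <zephyr/kernel.h>

__ramfunc static void wait_flash_idle(void)
{
    /* poll the QSPI/SSI controller status registers here */
}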
More info on Zephyr's Discord here:
https://discord.com/channels/720317445772017664/938474761405726800/1060917537405157446
Note: in my question I tried to use the MPU, but I had to disable it in the bootloader configuration.

How can I trace the cause of an invalid PC fault on Cortex M3?

I have an STM32 Cortex M3 that is experiencing an intermittent invalid PC (INVPC) fault. Unfortunately it takes a day or more to manifest and I don't know the cause.
I have the device paused in the debugger after the fault happened. The INVPC flag is set. The stacked registers are as follows:
0x08003555 xPSR
0x08006824 PC
0x08006824 LR
0x00000000 R12
0x08003341 R3
0x08006824 R2
0xFFFFFFFD R1
0x0000FFFF R0
Unfortunately the return address 0x08006824 is just past the end of the firmware image. The disassembly of that region is as follows:
Region$$Table$$Base
0x08006804: 08006824 $h.. DCD 134244388
0x08006808: 20000000 ... DCD 536870912
0x0800680c: 000000bc .... DCD 188
0x08006810: 08005b30 0[.. DCD 134241072
0x08006814: 080068e0 .h.. DCD 134244576
0x08006818: 200000bc ... DCD 536871100
0x0800681c: 00001a34 4... DCD 6708
0x08006820: 08005b40 #[.. DCD 134241088
Region$$Table$$Limit
** Section #2 'RW_IRAM1' (SHT_PROGBITS) [SHF_ALLOC + SHF_WRITE]
Size : 188 bytes (alignment 4)
Address: 0x20000000
I'm not sure this address is valid. The disassembly of that address in the debugger looks like nonsense, maybe data interpreted as code or something.
Is there any way I can trace this back to see where the exception happened? If necessary I can add some additional code to capture more information.
I'm not sure how it works on the Cortex-M3, but on some other ARMs the PSR register holds processor mode bits that can help you find out in which mode it happened (user mode, IRQ, FIQ, etc.). Each mode generally has its own stack.
For user mode, if you use an RTOS with multi-tasking, you probably have a separate stack for each task, but you can try to find out which task was current (i.e. running before the crash).
When you find the crashed task (or IRQ), you can look through its stack for the addresses of routines and work out what was called before the accident, provided the stack was not unrecoverably corrupted.
This is where I'd start the investigation. If you find the crashed task or even the function but still have no idea what is happening, you can add a small circular history buffer to which you write a code at each step of your program, so you can see what it did last even if the stack was destroyed.
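A minimal sketch of such a history buffer (names, depth and the event codes are arbitrary): record a code at interesting points in the program, then dump the buffer from the debugger after the fault.

#include <stdint.h>

#define TRACE_DEPTH 64u

static volatile uint32_t trace_buf[TRACE_DEPTH];
static volatile uint32_t trace_idx;

static inline void trace(uint32_t code)
{
    trace_buf[trace_idx % TRACE_DEPTH] = code;   /* circular overwrite */
    trace_idx++;
}

/* Usage: sprinkle trace(0x1001), trace(0x1002), ... at the start of suspect
 * functions or ISRs; after the fault, inspect trace_buf and trace_idx in the
 * debugger to see the most recent events even if the stack is destroyed.
 * Placing the buffer in a section that is not zeroed at reset also lets it
 * survive a watchdog restart. */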

u-boot debug using BDI2000 PowerPC4xx

I'm trying to figure out what's going on while debugging a U-boot port. I've got U-boot loaded on my board and my BDI2000 set up for debugging. As I step through start.S I keep running into this error:
(gdb) si
314 mtspr SPRN_SRR0,r0
(gdb) si
315 mtspr SPRN_SRR1,r0
(gdb) si
316 mtspr SPRN_CSRR0,r0
(gdb) si
317 mtspr SPRN_CSRR1,r0
(gdb) si
320 mtspr SPRN_MCSRR0,r0
(gdb) si
321 mtspr SPRN_MCSRR1,r0
(gdb) si
322 mfspr r1,SPRN_MCSR
(gdb) si
323 mtspr SPRN_MCSR,r1
(gdb) si
333 lis r1,0x0030 /* store gathering & broadcast disable */
(gdb) si
Cannot access memory at address 0x300000
(gdb) si
_start_440 () at start.S:334
334 ori r1,r1,0x6000 /* cache touch */
Cannot access memory at address 0xfffff03c
(gdb) bt
#0 _start_440 () at start.S:334
#1 0xfffff18c in rsttlb () at start.S:480
Backtrace stopped: frame did not save the PC
This is my first board bring up so any pointers you might have would be very helpful.
Thanks!
For some reason GDB only reads in the asm for the module being run. By stepping into other areas with the BDI I'm able to stepi from GDB without the "Cannot access memory" issues.
If you have questions feel free to send me a message.
Thx
This appears to be PowerPC code. My experience suggests that your memory address is not yet mapped. Start-up code by default executes from non-volatile memory (NVM, e.g. ROM, EEPROM, flash), and it is the start-up code's responsibility to set up or define where RAM is located. Generally, this information is pulled from NVM and written into a memory-management device or into the PowerPC chip itself to make the processor aware of RAM. Without seeing the entire code it is difficult to assess whether it is set up properly. The other possibility is that the BDI config file does not describe what is at address 0x300000.

Setting up Interrupt Vector Table, ARMv6

I'm trying to use user mode and SVC in my ARMv6 bare-metal application, but for this I need to set up the SVC entry of the ARMv6 interrupt vector table to branch to my interrupt handler. I can't find a good example of how to do this (i.e. exactly which memory address I need to set, and to what). I have done similar things in the past, but always with a more comprehensive bootloader (RedBoot) that set up some of this for me. Any help would be appreciated.
I am testing my application using:
qemu-system-arm -M versatilepb -cpu arm1176
Are you talking about the SWI interrupt, or one of the others (FIQ, IRQ)? In either case I think I know what the problem is. This QEMU setup is for running Linux; your binary is not loaded at address 0x00000000, so your entry points are not used by QEMU for handling exceptions.
I have an example that uses QEMU and implements a solution: go to the qemu directory of http://github.com/dwelch67/yagbat. The qemu example is not really related to the GBA material in the yagbat repo; the GBA is a 32-bit ARM target, so it was easy to borrow code from, and I stuck the example there.
The example was written specifically for your question as I tried to figure out how to use QEMU in this manner. It appears that the 0x00000000 address space is simulated as RAM, so you can rewrite the QEMU exception table and have the exceptions call code in the 0x10000 address space where your binary is loaded.
A quick and dirty solution is to make the entry point of the binary (which QEMU loads at 0x10000) resemble a vector table at address 0x00000. The ldr pc instruction is PC-relative: the disassembly might show it loading an address in the 0x10000 range, but it is really relative to the pc, and the disassembler simply assumed the linked address when printing it.
.globl _start
_start:
    ldr pc,start_vector_add
    ldr pc,undef_vector_add
    ldr pc,swi_vector_add
start_vector_add: .word start_vector
undef_vector_add: .word undef_vector
swi_vector_add:   .word swi_vector
Then, before you cause any interrupts (in the example I use the swi instruction to cause an SWI), you copy enough of the code from 0x10000 to 0x00000 to include the exception table and the list of addresses that it loads into the pc. Because your program is linked at 0x10000, those addresses are in the 0x10000 range. When the interrupt occurs, the exception handler you have now installed will load the 0x10000-based address into the pc, and your handler in the 0x10000 range will get called.
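A minimal sketch of that copy in C, under the same assumptions as the example (binary loaded at 0x10000, vector table plus its literal pool at the very start of the image):

/* Copy the exception table (the ldr pc instructions) and its literal pool of
 * handler addresses from the load address (0x10000) down to the vector area
 * at 0x00000000.  Six words cover the three-entry table shown above; a full
 * table needs more.  Some toolchains also need -fno-delete-null-pointer-checks
 * (or equivalent) to tolerate writes through address zero. */
void copy_vectors(void)
{
    volatile unsigned int *dst = (volatile unsigned int *)0x00000000;
    volatile unsigned int *src = (volatile unsigned int *)0x00010000;

    for (unsigned int i = 0; i < 6; i++) {
        dst[i] = src[i];
    }
}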
Using this command line to run the binary in my example
qemu-system-arm -M versatilepb -m 128M -kernel hello_world.bin
Then Ctrl-Alt-3 (not F3, the digit 3) switches to the serial console where you can see the output; close that window to quit QEMU and stop the simulation.

core dump at _dl_sysinfo_int80 ()

I have created a TCP client that connects to a listening server.
We implemented TCP keepalive as well.
Sometimes the client crashes and dumps core.
Below are the core dump traces.
The problem is on Linux, Update 4, kernel 2.6.9-42.0.10.
We had two core dumps.
(gdb) where
#0 0x005e77a2 in _dl_sysinfo_int80 () from /ddisk/d303/dumps/mhx239131/ld-
linux.so.2
#1 0x006c8bd1 in connect () from /ddisk/d303/dumps/mhx239131/libc.so.6
#2 0x08057863 in connect_to_host ()
#3 0x08052f38 in open_ldap_connection ()
#4 0x0805690a in new_connection ()
#5 0x08052cc9 in ldap_open ()
#6 0x080522cf in checkHosts ()
#7 0x08049b36 in pollLDEs ()
#8 0x0804d1cd in doOnChange ()
#9 0x0804a642 in main ()
(gdb) where
#0 0x005e77a2 in _dl_sysinfo_int80 () from /ddisk/d303/dumps/mhx239131/ld-
linux.so.2
#1 0x0068ab60 in __nanosleep_nocancel (
from /ddisk/d303/dumps/mhx239131/libc.so.6
#2 0x080520a2 in Sleep ()
#3 0x08049ac1 in pollLDEs ()
#4 0x0804d1cd in doOnChange ()
#5 0x0804a642 in main ()
We have tried to reproduce the problem in our environment, but we could not.
What could cause the core dump?
Please help me avoid this situation.
Thanks,
Naga
_dl_sysinfo_int80 is just a function which does a system call into the kernel. So the core dump is happening on a system call (probably the one used by connect in the first example and nanosleep in the second example), probably because you are passing invalid pointers.
The invalid pointers could be because the code which calls these functions being broken or because somewhere else in the program is broken and corrupting the program's memory.
Take a look at two frames above (frame #2) in the core dump for both examples and check the parameters being passed. Unfortunately, it seems you did not compile with debug information, making it harder to see them.
Additionally, I would suggest trying valgrind and seeing if it finds something.
Your program almost certainly did not dump core in either of the above places.
Most likely, you either have multiple threads in your process (and some other thread caused the core dump), or something external caused your process to die (such as 'kill -SIGABRT <pid>').
If you do have multiple threads, GDB 'info threads' and 'thread apply all where' are likely to provide further clues.
