I've got a DE10-Nano Cyclone V development board from Terasic with 1 GB of external DDR3 RAM, and I want to implement a driver that manages communication between Linux, running on the ARM Cortex-A9 processor, and the FPGA fabric of the Cyclone V.
With dma_alloc_coherent I allocate a buffer and write its hardware (bus) address to the FPGA module I programmed. I then write an arbitrary value through the SDRAM AXI interface to that address, but apparently neither AWREADY nor WREADY is ever asserted by the SDRAM AXI slave.
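For reference, the driver side of the allocation step looks roughly like this; a minimal sketch, where fpga_regs and the register offset 0x0 are hypothetical names for the memory-mapped CSR through which the FPGA master is told where the buffer lives:

#include <linux/dma-mapping.h>
#include <linux/io.h>
#include <linux/kernel.h>
#include <linux/sizes.h>

/* Sketch only: fpga_regs and the 0x0 offset are placeholders. */
static int setup_dma_buffer(struct device *dev, void __iomem *fpga_regs)
{
    dma_addr_t dma_handle;
    void *cpu_addr;

    cpu_addr = dma_alloc_coherent(dev, SZ_4K, &dma_handle, GFP_KERNEL);
    if (!cpu_addr)
        return -ENOMEM;

    /* hand the bus (hardware) address to the FPGA master */
    iowrite32(lower_32_bits(dma_handle), fpga_regs + 0x0);
    return 0;
}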
I've configured the SDRAM AXI interface to run at 325 MHz, with a 256-bit data width and a 32-bit address width, as an AXI3 slave. The SDRAM interface is configured as a TrustZone-aware device (ARM TrustZone setting).
I've also hardwired some other configuration lines to the AXI slave, listed below:
assign axm_m0_arburst = 'd0;        // FIXED burst
assign axm_m0_arcache = 'd0;        // device, non-bufferable
assign axm_m0_arid    = 'd0;
assign axm_m0_arlen   = 'd0;        // single-beat burst
assign axm_m0_arlock  = 'd0;        // normal access
assign axm_m0_arprot  = 'd0;        // unprivileged, secure, data access
assign axm_m0_arsize  = 'b101;      // 32 bytes (256 bits) per beat
assign axm_m0_awburst = 'd0;        // FIXED burst
assign axm_m0_awcache = 'd0;        // device, non-bufferable
assign axm_m0_awid    = 'd0;
assign axm_m0_awlen   = 'd0;        // single-beat burst
assign axm_m0_awlock  = 'd0;        // normal access
assign axm_m0_awprot  = 'd0;        // unprivileged, secure, data access
assign axm_m0_awsize  = 'b101;      // 32 bytes (256 bits) per beat
assign axm_m0_wid     = 'd0;
assign axm_m0_wstrb   = 'hFFFFFFFF; // all 32 byte lanes enabled
Looking at the FPGA bridge driver in Linux (/sys/class/fpga-bridge/br4), the state is shown as 'enabled'.
What could be a reason for the bridge to still block communication?
Thanks for any help.
Problem solved:
Apparently, even though Linux said the bridges were enabled, they weren't. One has to write certain configuration bits into the configuration fabric of the HPS, otherwise the module won't work, as the boot script below shows.
1. Generate a handoff folder with the Quartus Assembler step. This generates the configuration description for all bridges, among other things.
2. With bsp-editor, generate and compile the first-stage bootloader, which contains global variables that store the configuration values and make them available to boot scripts.
3. Generate a boot script with this content:
echo -- Programming FPGA --
# load the bitstream from the FAT partition and configure the FPGA fabric
fatload mmc 0:1 $fpgadata soc_system.rbf;
fpga load 0 $fpgadata $filesize;
# enable the bridges according to the handoff data
run bridge_enable_handoff;
# write the handoff values into the HPS configuration registers
mw $fpgaintf $fpgaintf_handoff;
mw $fpga2sdram $fpga2sdram_handoff;
# run the routine that applies the fpga2sdram setting
go $fpga2sdram_apply;
mw $axibridge $axibridge_handoff;
mw $l3remap $l3remap_handoff;
#md $fpgaintf;
#md $fpga2sdram;
#md $axibridge;
setenv fdtimage soc_system.dtb;
setenv mmcroot /dev/mmcblk0p2;
setenv mmcload 'mmc rescan;${mmcloadcmd} mmc 0:${mmcloadpart} ${loadaddr} ${bootimage};${mmcloadcmd} mmc 0:${mmcloadpart} ${fdtaddr} ${fdtimage};';
setenv mmcboot 'setenv bootargs console=ttyS0,115200 root=${mmcroot} rw rootwait; bootz ${loadaddr} - ${fdtaddr}';
run mmcload;
run mmcboot;
The commented-out lines result in a crash, apparently because the accesses are unaligned and such accesses aren't allowed by the processor. I'll investigate this issue further.
For further reading on these topics I recommend these pages:
Cyclone V HPS Memory Map (Altera)
Tutorial on FPGA soft programming (Rocketboards)
How to enable HPS bridges (Altera)
I was writing some drivers and found myself stuck on IRQ pins. My kernel uses the IOAPIC, and I don't know how this interrupt mechanism (IRQ pins) works, how to get the pins, or how to use them.
Can anyone give a detailed answer on how to use them to make interrupts work?
A PCI device can potentially use up to four interrupt pins: INTA#, INTB#, INTC#, and INTD#. These signals are wired to the interrupt controller; they are level-sensitive and may be shared with other PCI devices. A device with a single PCI function will typically use INTA#, if it uses one at all. If it has multiple PCI functions, the different functions (up to 8) may use different interrupt pins or share the same one.
The read-only "Interrupt Pin" register at offset 3Dh in the PCI function's Type 00h configuration header says which interrupt pin the PCI function is using: 0 = none, 1 = INTA#, 2 = INTB#, 3 = INTC#, 4 = INTD#.
The read-write "Interrupt Line" register at offset 3Ch in the Type 00h configuration defines which IRQ number has been assigned to the PCI function by the system firmware (BIOS) or operating system. This IRQ number may be shared by other devices in the system.
Drivers don't usually care much about the "Interrupt Pin" register. They are more interested in the "Interrupt Line" register value set up by the firmware or operating system. Operating systems usually provide this information in a friendlier way than having the driver retrieve it directly from PCI configuration space.
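For example, a Linux driver can read both registers and then request the (potentially shared, level-triggered) interrupt. A minimal sketch, where my_isr and the "mydev" name are placeholders; in practice pdev->irq is the value the kernel has already resolved for you:

#include <linux/interrupt.h>
#include <linux/pci.h>

static irqreturn_t my_isr(int irq, void *data)
{
    /* the line may be shared: check the device's status register and
     * return IRQ_NONE if this interrupt wasn't ours */
    return IRQ_HANDLED;
}

static int my_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
    u8 pin, line;

    pci_read_config_byte(pdev, PCI_INTERRUPT_PIN, &pin);   /* 0x3D, read-only */
    pci_read_config_byte(pdev, PCI_INTERRUPT_LINE, &line); /* 0x3C, read-write */
    dev_info(&pdev->dev, "pin %u line %u irq %u\n", pin, line, pdev->irq);

    /* level-sensitive INTx lines can be shared, hence IRQF_SHARED */
    return request_irq(pdev->irq, my_isr, IRQF_SHARED, "mydev", pdev);
}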
I'm working on an Altera Cyclone V SoC and attempting to write directly to FPGA peripherals from the HPS. However, the hwlib library only provides the function alt_write_word, which, as I understand it, writes to the cache first before the data reaches main memory. On NIOS II, the built-in IOWR function is set up so that writes go directly to the FPGA peripherals. So my question is: when working with the SoC, if hwlib doesn't provide such a function, how can I write directly to the FPGA peripherals? Do I need to configure the memory type, or what?
If your application is bare-metal and thus not using the MMU (i.e. virtual addresses), you can try to reach the peripherals directly through a volatile-qualified pointer. Here is a little example:
volatile uint32_t *registerAddress = (volatile uint32_t *)PERIPHERAL_BASE_ADDR;
const uint32_t registerValue = *registerAddress; /* read the register */
It is also possible to write to them, if the hardware permits it:
*registerAddress = 0xDEADBEEF;
The addresses of peripherals should be excluded from the cacheable range. If you are using an SoC with a dedicated bus architecture, you don't need to worry about it; otherwise, you might need to adjust your cacheable range when designing the system.
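On the Cyclone V specifically, peripherals attached to the lightweight HPS-to-FPGA bridge appear in the window starting at 0xFF200000. A bare-metal sketch, where the peripheral's offset inside that window and its register layout are assumptions on my part:

#include <stdint.h>

#define LWH2F_BASE    0xFF200000u  /* Cyclone V lightweight HPS-to-FPGA bridge */
#define MY_PERIPH_OFS 0x00000000u  /* hypothetical Qsys offset of the peripheral */

static volatile uint32_t *const my_periph =
    (volatile uint32_t *)(LWH2F_BASE + MY_PERIPH_OFS);

void example(void)
{
    const uint32_t status = my_periph[0]; /* read register 0 */
    my_periph[1] = 0xDEADBEEF;            /* write register 1, if writable */
    (void)status;
}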
I'm working on implementing a simple PCI device in QEMU and a kernel driver for it, and I have some trouble with handling pci_read/write_config_* function calls from the device side.
Unlike simple read/write operations on a memory-mapped BAR, where the MemoryRegionOps callbacks receive the exact offset used by the driver, the config_read/write callbacks implemented as members of the PCIDevice struct receive an address that has gone through some manipulation/mapping that I have a hard time understanding.
Following the code path up to pci_config_host_read/write in the QEMU sources, and the same on the kernel side for the pci_read/write_config_* functions, didn't provide any clear answers.
Can anyone help me understand how to extract the config offset used by the driver when calling the pci config rw functions?
If you set your PCI device model up to implement the QEMU PCIDevice config_read and config_write methods, the addresses passed to them should be the offsets into PCI config space (i.e. starting with the standard 0 == PCI_VENDOR_ID, 2 == PCI_DEVICE_ID, 4 == PCI_COMMAND, and so on, with any device-specific registers after the 64 bytes of standardized config space).
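A minimal sketch of such an override, assuming a made-up device-specific register at offset 0x40 (config_read can be hooked the same way, and pci_default_write_config keeps the standard behaviour for everything else):

#include "qemu/osdep.h"
#include "hw/pci/pci.h"

/* "addr" arrives as the plain byte offset into the 256-byte config space,
 * exactly as issued by the guest driver. */
static void mydev_config_write(PCIDevice *d, uint32_t addr,
                               uint32_t val, int len)
{
    if (addr == 0x40) {            /* hypothetical device-specific register */
        /* handle the device-specific write here */
    }
    pci_default_write_config(d, addr, val, len); /* standard registers */
}

static void mydev_realize(PCIDevice *d, Error **errp)
{
    d->config_write = mydev_config_write; /* install the override */
}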
I'm trying to write an Ethernet driver for Linux kernel 4.13.x for the Banana Pi M2 Ultra.
Some time ago the so-called device tree (DT) was introduced into the Linux kernel infrastructure.
I don't have much experience using DT when writing device drivers, so I have a few questions.
As far as I know, in the case of the Banana Pi a clock source must be provided for a given peripheral device, and providing such clocks is the job of the CCU (clock control unit). The CCU is a memory-mapped resource available at some address. I'd like to write an Ethernet driver that needs a clock from the CCU.
I know that the physical address of the CCU must be mapped to a virtual address via ioremap() or a similar function.
My question is: how can I fetch the virtual address of the CCU in my Ethernet driver? Is it possible via the device tree? If yes, how? Or can this virtual address be obtained another way? I'm just not sure whether fetching the virtual address is done via DT, by some procedure, or via a global pointer.
Any ideas or suggestions?
There are examples of platform drivers in the Linux kernel. I have worked on I2C and I2S on the Raspberry Pi, so I can quote those examples.
In http://elixir.free-electrons.com/linux/v4.3.2/source/drivers/i2c/busses/i2c-bcm2835.c
look at the probe function; it calls the subsystem API
platform_get_resource(pdev, IORESOURCE_MEM, 0);
This gives the physical address, which is then mapped with ioremap().
For this, a device node must be created in the device tree, as in
https://github.com/raspberrypi/linux/blob/rpi-4.9.y/arch/arm/boot/dts/bcm283x.dtsi
Check the i2c0 device node in bcm283x.dtsi. The reg property is where the physical address and size are stored:
reg = <0x7e205000 0x1000>; /* physical address, size */
Hope this helps you. The device tree may be considered analogous to the platform data used previously. For the clock question specifically, see the sketch below.
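The usual pattern is that the CCU has its own clock driver that registers its clocks with the common clock framework; your Ethernet driver then just references one of them through the clocks property of its DT node and never maps the CCU registers itself. A minimal probe sketch, assuming hypothetical names (my_eth_probe) and a node that carries reg and clocks properties:

#include <linux/clk.h>
#include <linux/err.h>
#include <linux/io.h>
#include <linux/platform_device.h>

static int my_eth_probe(struct platform_device *pdev)
{
    struct resource *res;
    void __iomem *regs;
    struct clk *bus_clk;
    int ret;

    /* map the MMIO region described by the node's reg property */
    res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
    regs = devm_ioremap_resource(&pdev->dev, res);
    if (IS_ERR(regs))
        return PTR_ERR(regs);

    /* look up the clock referenced by the node's clocks property; the
     * clock framework resolves it to the CCU driver's clock object */
    bus_clk = devm_clk_get(&pdev->dev, NULL);
    if (IS_ERR(bus_clk))
        return PTR_ERR(bus_clk);

    ret = clk_prepare_enable(bus_clk);
    if (ret)
        return ret;

    /* ... allocate and register the net_device using "regs" ... */
    return 0;
}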
I've got a PCI device driver that currently uses dma_map_page to map a userspace address to a DMA address. This works fine, but I'm trying to port it to the IOMMU API to get some of the benefits that groups and domains provide.
Current Code: This works fine
ret = get_user_pages_fast(user_addr, one_page, flags, page);
dma_addr = dma_map_page(dev, *page, off, size, DMA_BIDIRECTIONAL);
IOMMU Code: This doesn't work
ret = get_user_pages_fast(...);
pfn = page_to_pfn(*page);
group = iommu_group_get(dev);
domain = iommu_domain_alloc(dev->bus);
iommu_attach_device(domain, dev);
iommu_attach_group(domain, group);
iommu_map(domain, iova, pfn << PAGE_SHIFT, size, IOMMU_READ|IOMMU_WRITE);
All functions return successfully, but when I pass the iova to the device, the device can't use it. Has anyone worked with the IOMMU before and knows where my problem might be, or where I can look? I haven't been able to find much on Linux's IOMMU implementation anywhere.
Edit:
There were some entries in dmesg that I missed the first time around:
DEBUG: phys addr 0x7738de000
DEBUG: iova 0xdeadb000
DMAR: DRHD: handling fault status reg 2
DMAR: DMAR:[DMA Read] Request device [50:00.0] fault addr 1fdaee4000
DMAR:[fault reason 06] PTE Read access is not set
Such operations are privileged because they access page tables, or possibly data structures maintained inside the task structure.
Please check how hypervisors or virtual machines handle such calls; there may be a driver interface that sets up the IOMMU paging unit from the guest OS via the hypervisor.
The hypervisor also executes in privileged mode.