OpenACC/OpenMP support on Arm Mali GPUs

I would like to ask whether OpenACC or OpenMP supports ARM Mali GPUs. I use OpenMP 4.0, which supports GPU parallelisation, but I am not sure whether my code actually runs on the GPU. Do you have any idea how I can test it?

Neither is supported on Mali. Compute acceleration is supported via OpenCL, or via compute shaders in OpenGL ES / Vulkan.

Either or both specifications would work fine on Mali GPUs, but I'm not aware of any compiler that supports offloading to Mali. GCC or Clang would be your best bet, but I don't think either has a Mali target.

The recently updated Arm C/C++ Compiler 21.1 for Linux, which supports OpenMP 5.0, may support offloading to Arm Mali GPU targets.
OpenMP 5.0 features are supported by Arm C/C++ Compiler

Related

Will all ARM compilers produce the same Assembly code and run on various CPUs?

I have been developing code for an older device which has an NXP i.MX28 single core CPU which is ARM-based. The device runs Embedded Linux.
I am now upgrading to a better device which has an NXP i.MX6UL quad core processor, also ARM-based of course, and also running Embedded Linux.
Is it normal that the same toolchain which I was using for building the code for the i.MX28 will also work for the i.MX6UL, even though the i.MX6UL is more advanced, with more cores etc.?
As a test, I have now built my code with the same compiler and even run it on a Raspberry Pi, where it seems to run OK. The Raspberry Pi uses a Broadcom BCM2711 SoC with an ARM Cortex-A72 processor, which again is a different CPU.
I therefore must ask, will any ARM toolchain build code and be able to run on any type of ARM device regardless?
CPUs differ by the core architecture (incl. instruction set) and set of peripherals. Difference in the peripherals is solved by drivers and HALs. Difference in core arch is solved by the toolchain.
If the toolchain "knows" the new arch, it will emit the corresponding assembly code, which will run on the new CPU. So compilers will not produce the same assembly, but the same source code will run after a rebuild; that's the idea of high-level languages.
Problems emerge when old code contains inline assembly, or uses some specific DSP instructions or libraries.
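As a rough illustration of the point above, the sketch below reads the architecture name that Linux reports (the same string `uname -m` prints) and compares it with the architecture version a build targets. The helper names `arm_arch_version` and `baseline_ok` are made up for this example, and the "newer core runs older 32-bit code" rule is a simplification: it assumes an A-profile core that still supports the 32-bit instruction set, plus a matching ABI and C library.

```python
import platform
import re


def arm_arch_version(machine):
    """Return the ARM architecture version encoded in a machine string.

    Examples of machine strings: "armv5tejl", "armv7l", "aarch64".
    Returns None when the string does not look like an ARM name.
    """
    if machine in ("aarch64", "arm64"):
        return 8
    match = re.match(r"armv(\d+)", machine)
    return int(match.group(1)) if match else None


def baseline_ok(build_version, machine):
    """Rough check: can a 32-bit build for `build_version` run here?

    Assumption (simplified): a newer core still runs code built for an
    older architecture version, but not the other way round.
    """
    version = arm_arch_version(machine)
    return version is not None and version >= build_version


if __name__ == "__main__":
    machine = platform.machine()
    print(machine, "-> ARM architecture version:", arm_arch_version(machine))
```

With these assumptions, `baseline_ok(5, "armv7l")` is true while `baseline_ok(7, "armv5tejl")` is false, which is why a toolchain configured for the oldest device you need to support is usually the safe choice.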

OpenCL: is my GPU not capable?

I have an old computer, so I don't know whether I can run OpenCL code on it. I've checked my GPU and I get this output:
When I execute OpenCL code, I get this error:
Finally, if I run clinfo, I get this:
I really don't know. Is it a problem with the libraries, or can my GPU simply not run OpenCL code?
Your GPU predates OpenCL. Beignet supports Ivybridge and later (https://www.freedesktop.org/wiki/Software/Beignet/#supportedtargets).
Your CPU also predates OpenCL. Intel's first release of their CPU-only OpenCL driver requires SSE4.1, but your CPU only has SSE3. If you really really need to get OpenCL to work on this machine, you may be able to install an old version (2.8) of the AMD OpenCL CPU driver if you can find it. Quote from http://boinc.berkeley.edu/wiki/OpenclCpu:
Intel's OpenCL support requires the SSE4.1 CPU feature (BOINC's event log shows you the list of your CPU's features).
If your host does not have SSE4.1 support, then you can install the AMD APP SDK 2.8 and it will install the AMD OpenCL CPU driver. Note that the AMD APP SDK v2.9 will NOT install it. You have to use 2.8 or earlier as they now bundle the OpenCL driver with the video driver instead of with the APP SDK. As AMD only keeps the last several versions on their archive page, you may want to grab both the 32 and 64 bit version of the v2.8 APP SDK now and keep them in a safe place.
Or maybe POCL or FreeOCL might cover you for the CPU.
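The SSE4.1 requirement mentioned above can be checked on Linux by looking at the `flags` line of `/proc/cpuinfo`. The following is a minimal sketch, assuming an x86 Linux host; the function names `cpu_flags` and `supports_sse41` are illustrative and not part of any OpenCL tooling.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # x86 kernels list the features on a line whose key is "flags".
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def supports_sse41(cpuinfo_text):
    """True if the kernel reports SSE4.1 (spelled "sse4_1" in cpuinfo)."""
    return "sse4_1" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            text = handle.read()
    except OSError:
        text = ""  # not Linux, or /proc is not available
    print("SSE4.1 reported:", supports_sse41(text))
```

If this reports no SSE4.1, the Intel CPU driver route described in the quote is ruled out, and the older AMD CPU driver, POCL or FreeOCL remain the options to try.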

MPI on ARM - baremetal platform?

Is an MPI library supported on a bare-metal ARM system? Does it work with ARM compilers? If yes, could anyone provide links/references, as I could not find any?
Thanks
EDIT: I forgot to ask my main question. Is there any standard benchmark that uses an MPI library and can be used on an ARM Cortex-M4? For instance, LINPACK with MPI, which benchmarks the floating-point unit.
That should be no problem.
The easiest way to try this out is to use OpenHPC.
MPICH, MVAPICH2 and Open MPI are provided.
Another option is to download the sources of your preferred MPI library and build it yourself (FWIW, I am pretty sure this is your only option if you want to use modern Fortran with a non-GNU Fortran compiler).
Check out Open MPI 2.1.1:
https://www.open-mpi.org/software/ompi/v2.1/
And here is how to build it:
https://developer.arm.com/products/software-development-tools/hpc/resources/porting-and-tuning/building-openmpi-with-arm-compiler
EDIT: I doubt that it is beneficial to use MPI on an RTOS/bare-metal solution.
You can use uClinux on a Cortex-M4 platform:
https://github.com/EmcraftSystems/linux-emcraft
Or you can try porting an MPI library to the Zephyr RTOS (a lot of work):
https://www.zephyrproject.org/

ARMv8 backward compatibility with ARMv7 (Snapdragon 820 vs Cortex-A15)

I see that ARMv8 is merely an extension of the ARMv7 architecture and all code compiled for ARMv7 should run on ARMv8. I am interested in the opposite direction: will code that was compiled for ARMv8 run on ARMv7?
I have a specific case in mind: I would like to run comma.ai's Openpilot visiond binary, which was compiled for the OnePlus 3 smartphone (Qualcomm MSM8996 Snapdragon 820 CPU), on the Nvidia Jetson TK1 (NVIDIA Cortex-A15 CPU). Will visiond run on the Jetson?
EDIT: There may be more in question than CPU compatibility, since visiond probably makes heavy use of the GPU on that phone. It will probably depend on whether they use some standard parallelization approach (OpenCL, NEON etc.) or have custom code for the Snapdragon's GPU. Even with OpenCL, the chance of compatibility across different hardware is probably quite low.
I believe that AArch32 userland is fully or very highly backwards compatible with ARMv7, i.e. userland programs compiled for ARMv7 should just work in AArch32, but I couldn't find a precise quote in the ARM manual.
AArch32 does, however, add new instructions over ARMv7; most of them seem to be functionality that ARMv8 added and that the designers decided to expose in AArch32 as well. Therefore, AArch32 is not forward compatible with ARMv7, i.e. programs compiled for AArch32 might not run on ARMv7.
I'm not sure about system land. See also: Does ARMv8 AArch32 mode has backward compatible with armv4 , armv5 or armv6?
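Before reasoning about instruction-level compatibility, it helps to check which instruction set a given binary targets at all. The sketch below reads the `e_machine` field of the ELF header, where 40 (`EM_ARM`) means 32-bit ARM and 183 (`EM_AARCH64`) means AArch64. The function name `elf_target` is hypothetical, and the check says nothing about which ARMv7 or ARMv8 instructions the code actually uses.

```python
import struct

EM_ARM = 40        # 32-bit ARM (ARMv7 and earlier, or AArch32)
EM_AARCH64 = 183   # 64-bit ARM (AArch64)


def elf_target(header):
    """Classify an ELF binary from the first 20 bytes of the file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # Byte 5 (EI_DATA) gives the byte order: 1 = little, 2 = big endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18 in both ELF32 and ELF64.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    if machine == EM_AARCH64:
        return "aarch64"
    if machine == EM_ARM:
        return "arm32"
    return "other (e_machine=%d)" % machine
```

It could be used as `elf_target(open("visiond", "rb").read(20))`. A result of "aarch64" would rule out a 32-bit-only ARMv7 core such as the Cortex-A15 outright; a result of "arm32" only means the question of newer AArch32 instructions, libraries and GPU code still applies.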

Software simulation for ARM Cortex-M0

Is there a software simulator for ARM Cortex-M0?
I have a Thumb-only (not Thumb-2) instruction set simulator; go to GitHub and search for thumbulator. Depending on what you are trying to do, you could compile for Thumb for a while and then switch to Thumb-2 later.
For arm I found a behavioral verilog model out on a university site.
For Thumb-2 you might check whether QEMU supports it; I know there is support for the Stellaris Cortex-M3, so that may put you close enough.
There is no FOSS simulator. The ARM documentation license prohibits using the documentation to build a simulator. You have to pay ARM to use the documentation for simulation purposes, so all ARM simulators for the latest architectures are non-free.
You can download and use the free version of Keil uVision (limited to 32 KB).
IAR Embedded Workbench (www.iar.se) includes a simulator for Cortex cores. It is free (Kickstart edition) up to 32 KB of code size.
