How to check the existence of NEON on arm? - arm

How to determine whether NEON engine exists on given ARM processor? Any status/flag register can be queried for such purpose?

I believe unixsmurf's answer is about as good as you'll get if using an OS with privileged kernel. For general purpose feature detection, it seems ARM has made it a requirement to get this from the OS, and so you must use an OS API to get it.
On Android NDK use #include <cpu-features.h> with (android_getCpuFamily() == ANDROID_CPU_FAMILY_ARM) && (android_getCpuFeatures() & ANDROID_CPU_ARM_FEATURE_NEON). Note this is for 32 bit ARM. ARM 64 bit has different flags but the idea is the same. See the sources/docs.
On Linux, if available use #include <sys/auxv.h> and #include <asm/hwcap.h> with getauxval(AT_HWCAP) & HWCAP_NEON.
On iOS, I'm not sure there is a dynamic call, the methodology seems to be that you build your app targeting NEON, then make sure your app is flagged to require NEON so it will only install on devices which support it. Of course you should use the pre-defined preprocessor flag __ARM_NEON__ to make sure everything is in order at compile time.
On whatever Microsoft does or if you are using some other RTOS... I don't know...
Actually you'll see a lot of Android implementations which just parse /proc/cpuinfo in order to implement android_getCpuFeatures().... Heh. But still it seems to be getting improved and newest versions use the getauxval method.

One reliable way is to check the architectural feature trap register. For example, on ARM Cortex A35, you can check the value of HCPTR register to see whether NEON is implemented (0x000033FF), or not (0x0000BFFF). The register name and indication value are platform dependent, making sure to check the technical reference manual.

Is there anyway to check if Neon and sve is supported?
I have seen someone saying something about the HCPTR register, but it does not seem to have any relationship to neon and besides looks to be a Aarch32 instruction according to the docs
https://developer.arm.com/docs/ddi0595/g/aarch32-system-registers/hcptr

Related

Is it possible to build C source code written for ARM to run on x86 platform?

I got some source code in plain C. It is built to run on ARM with a cross-compiler on Windows.
Now I want to do some white-box unit testing of the code. And I don't want to run the test on an ARM board because it may not be very efficient.
Since the C source code is instruction set independent, and I just want to verify the software logic at the C-level, I am wondering if it is possible to build the C source code to run on x86. It makes debugging and inspection much easier.
Or is there some proper way to do white-box testing of C code written for ARM?
Thanks!
BTW, I have read the thread: How does native android code written for ARM run on x86?
It seems not to be what I need.
ADD 1 - 10:42 PM 7/18/2021
The physical ARM hardware that the code targets may not be ready yet. So I want to verify the software logic at a very early phase. Based on John Bollinger's answer, I am thinking about another option: Just build the binary as usual for ARM. Then use QEMU to find a compatible ARM cpu to run the code. The code is assured not to touch any special hardware IO. So a compatible cpu should be enough to run all the code I think. If this is possible, I think I need to find a way to let QEMU load my binary on a piece of emulated bare-metal. And to get some output, I need to at least write a serial port driver to bridge my binary to the serial port.
ADD 2 - 8:55 AM 7/19/2021
Some more background, the C code is targeting ARMv8 ISA. And the code manipulates some hardware IPs which are not ready yet. I am planning to create a software HAL for those IPs and verify the C code over the HAL. If the HAL is good enough, everything can be purely software and I guess the only missing part is a ARMv8 compatible CPU, which I believe QEMU can provide.
ADD 3 - 11:30 PM 7/19/2021
Just found this link. It seems QEMU user mode emulation can be leveraged to run ARM binaries directly on a x86 Linux. Will try it and get back later.
ADD 4 - 11:42 AM 7/29/2021
An some useful links:
Override a function call in C
__attribute__((weak)) and static libraries
What are weak functions and what are their uses? I am using a stm32f429 micro controller
Why the weak symbol defined in the same .a file but different .o file is not used as fall back?
Now I want to do some white-box unit testing of the code. And I don't want to run the test on an ARM board because it may not be very efficient.
What does efficiency have to do with it if you cannot be sure that your test results are representative of the real target platform?
Since the C source code is instruction set independent,
C programs vary widely in how portable they are. This tends to be less related to CPU instruction set than to target machine and implementation details such as data type sizes, word endianness, memory size, and floating-point implementation, and implementation-defined and undefined program behaviors.
It is not at all safe to assume that just because the program is written in C, that it can be successfully built for a different target machine than it was developed for, or that if it is built for a different target, that its behavior there is the same.
I am wondering if it is possible to build the C source code to run on x86. It makes debugging and inspection much easier.
It is probably possible to build the program. There are several good C compilers for various x86 and x86_64 platforms, and if your C code conforms to one of the language specifications then those compilers should accept it. Whether the behavior of the result is representative of the behavior on ARM is a different question, however (see above).
It may nevertheless be a worthwhile exercise to port the program to another platform, such as x86 or x86_64 Windows. Such an exercise would be likely to unmask some bugs. But this would be a project in its own right, and I doubt that it would be worth the effort if there is no intention to run the program on the new platform other than for testing purposes.
Or is there some proper way to do white-box testing of C code written for ARM?
I don't know what proper means to you, but there is no substitute for testing on the target hardware that you ultimately want to support. You might find it useful to perform initial testing on emulated hardware, however, instead of on a physical ARM device.
If you were writing ARM code for a windows desktop application there would be no difference for the most part and the code would just compile and run. My guess is you are developing for some device that does some specific task.
I do this for a lot of my embedded ARM code. Typically the core algorithms work just fine when built on x86 but the whole application does not. The problems come in with the hardware other than the CPU. For example I might be using a LCD display, some sensors, and Free RTOS on the ARM project but the code that runs on Windows does not have any of these. What I do is extract important pieces of C/C++ code and write a test framework around it. In the real ARM code the device is reading values from a sensor and doing something with it. In the test code that runs on a desktop the code reads from a data file with fake sensor values and writes its output to a datafile that can be analyzed. This way I can have white box tests for the most complicated code.
May I ask, roughly what does this code do? An ARM processor with no peripherals would be kind of useless. Typically we use the processor to interact with some other hardware like a screen, some buttons, or Bluetooth. It's those interactions that are going to be the most problematic.

check CPU model to execute a specific C code [duplicate]

This question already has an answer here:
How to tell if program is running on x86/x64 or ARM Linux platforms
(1 answer)
Closed 4 years ago.
I want to create a C code that somehow contains two separated blocks. I want to use a function or a tool that extracts the CPU model, and based on that, the program decides which block of code it executes. I only have the idea and I don't know how to implement it !
The first block of code will be executed on an Intel i7 and the second should be executed on ARM Cortex A53.
PS : I am a beginner and I have nothing to do with hardware and similar stuff. Thank you for your help :)
As clearly pointed out, first off you cant have a C program that runs to a point to determine ARM from x86 as that code has to already be ARM or x86. These are different instruction sets. You can use say python or JAVA or some other scripty/virtual machine language. But then you have a COMPILE time decision to build for one target or the other, at that point you already know which target as you are actually running code on it, so if this is strictly ARM vs X86 there is no reason to check runtime. Thats not to say that each architecture and/or system will have a way to check the architecture and flavor you are on ARMv6 vs ARMv7, for example, but not necessarily ARMv7 32 bit vs ARMv8 64 bit although you technically can run aarch32 and aarch64 instruction sets on most ARMv8s just not intermixed, have to have the os or execution level changes yourself to switch between them.
You do understand there are different incompatible instruction sets, specifically the ones you described and C code is compiled to one or the other. So you cannot have a program in C compiled for a target that can detect the other target. You have already selected the target before you get to this point. Now there are emulators, but they tend to target one architecture as well. There are/were products from specific vendors that would emulate one instruction set and convert it runtime to the other, over time as you re-run that code it continues to convert it. You could try that, but you still have to be running code for the right target on the right logic/emulator, and then have a now special detection that is not the norm to find the true underlaying architecture, not the faked emulator.
I suspect you are thinking you can have one architecture specific module that detects the architecture to run architecture specific code. This does not work with C in general, does not make sense to try, thus there probably isnt a good tool for this. In particular since the solution for such a thing is either you build this into the binary file format and the operating system picks because it knows, or you wrap your binary with a target independent language like Python or JAVA or scripty language like perl, bash, etc. that can independent of target determine the architecture (in that case solutions vary widely specific to operating system and language for starters) and then choose which binary to run.
There are many ways to achieve what you want. To check which model is present you first have to read which model you have. how to do that varies between Windows and Linux. i found this SO-topic helpful and it might also be a good start for your research: How to check CPU name, model, speed on Windows/Linux C?

How to check msr.le at runtime using built-ins?

This question came up in a Power8 in-core crypto patch. The patch provides AES using Power8 built-ins. When loading a VSX register we need to perform a 128-bit endian reversal when running on a little-endian machine to ensure the VSX register loads the proper value.
At compile time we can check macros like __BYTE_ORDER__. However, I believe we are supposed to check the machine status register at runtime. If msr.le=1, then we perform the endian swap. Also see the AltiVec Programming Environment Manual, Section 3.1.4, p. 3-5.
How do we check the machine status register at runtime using built-ins?
You don't need to - it's known at compile time. Your instructions will be encoded completely incorrectly if you're running in the opposite endianness of your compiled code. So, your OS will ensure that your program is running in the correct MSR[LE] setting for the endianness of the executable.
In essence: the MSR[LE] bit controls instructions as well as data loads/stores.
There are some tricks we can use to detect endianness if we really have no idea, but unless you're writing super early boot code, you won't need that.

Is there a cross-architecture way to detect CPU features on linux?

Is there any way to determine CPU features in a cross-architecture (i.e. works similarly on ARM, x86, etc.) way on Linux? /proc/cpuinfo would fit as a solution, but it seems that it isn't meant to be parsed as there is a number of inconsistencies. For example, the field I'm interested in is named flags on x86, but Features on ARM and so on. Is there another standard way to get the equivalent information?
Have you tried lscpu --parse ?
You may use open source Yeppp! library (disclosure: I am the author) which provides information on CPU features on x86, ARM, MIPS, and PowerPC.
This example hints how to retrieve this information with Yeppp!

Profiling on baremetal embedded systems (ARM)

I am wondering how you profile software on bare metal systems (ARM Cortex a8)? Previously I was using a simulator which had built-in benchmark statistics, and now I want to compare results from real hardware (running on a BeagleBoard-Xm).
I understand that you can use gprof, however I'm kind of lost as that assumes you have to run Linux on the target system?
I build the executable file with Codesourcery's arm-none-eabi cross-compiler and the target system is running FreeRTOS.
Closely evaluate what you mean by "profiling". You are indeed operating very close to bare metal, and it's likely that you will be required to take on some of the work performed by a tool like gprof.
Do you want to time a function call? or an ISR? How about toggling a GPIO line upon entering and exiting the code under inspection. A data logger or oscilloscope can be set to trigger on these events. (In my experience, a data logger is more convenient since mine could be configured to capture a sequence of these events - allowing me to compute average timings.)
Do you want to count the number of executions? The Cortex A8 comes equipped with a number of features (like configurable event counters) that can assist: link. Your ARM chip may be equipped with other peripherals that could be used, as well (depending on the vendor). Regardless, take a look at the above link - the new ARMs have lots of cool features that I don't get to play with as much as I would like! ;-)
I have managed to get profiling working for ARM Cortex M. As the GNU ARM Embedded (launchpad) tools do not come with profiling libraries included, I have added the necessary glue and profiling functionality.
References:
See http://mcuoneclipse.com/2015/08/23/tutorial-using-gnu-profiling-gprof-with-arm-cortex-m/
I hope this helps.

Resources