I'm using IAR Embedded Workbench for ARM (ARM7TDMI-S), and the majority of my work is done in little-endian format. However, I saw in the manual that I can do something like:
__big_endian int i, j;
to declare those two variables as big-endian (while the rest of the app stays little-endian). This seems like a fantastic feature, but when I try to compile, I always get the error:
Error[Pa002]: the type attribute "__big_endian" is not allowed on this declaration.
The big-endian line above is copied directly from the manual, but it does not work. This is a great feature of the compiler and would make life a bit easier. Any ideas how to get it working?
I have my language conformance set to 'Allow IAR extensions' on the C/C++ Compiler options tab on the IDE options.
From IAR's docs:
The __big_endian keyword is available when you compile for ARMv6 or higher.
ARMv6 added the SETEND instruction, which manipulates a state bit to configure which endianness the processor will use when performing a load/store operation. It looks like IAR's __big_endian keyword just causes the processor to manipulate that bit when accessing the variables tagged with that attribute.
The ARM7TDMI implements the ARMv4T architecture, so this keyword is not available for that target.
This is an extension feature of the IAR compiler, so it has to be enabled either with the -e command-line option or by enabling IAR extensions on the compiler options page of the IDE. The keyword is not compatible with the --strict_ansi compiler option.
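Since the keyword cannot be used on an ARMv4T core at all, a manual workaround (not from IAR's docs; the helper and variable names here are made up for illustration) is to keep the value in memory in big-endian byte order and byte-swap it on every access:

#include <stdint.h>

/* Swap the byte order of a 32-bit value. */
static inline uint32_t bswap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

/* Raw storage holds the value in big-endian order, e.g. for sharing
 * with big-endian hardware or protocol data. */
static uint32_t i_be;

static inline uint32_t read_i(void)        { return bswap32(i_be); }
static inline void     write_i(uint32_t v) { i_be = bswap32(v); }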
Related
I am trying to learn the function call grammar on the ARM architecture, and I compiled the same code as a user-mode app and as a loadable kernel module. In the attached picture you can see the disassembly of the same function in the two different modes. I am curious about the reason for this difference.
You have compiled the code with wildly different options. The first is ARM (32-bit only) and the second is Thumb2 (mixed 16/32-bit); see the hex opcodes at the side. Thumb2 uses the first eight registers in a compact way (16-bit encodings), so the call interface is different; i.e., fp is r7 versus r11. This is why you are seeing different call sequences for the same code.
Also, the first has profiling enabled (which is why __gnu_mcount_nc is inserted).
It really has nothing to do with 'kernel' versus 'user' code. It is possible to compile user code with options similar to those the kernel uses. There are many gcc command-line options that affect the 'call interface' (see the AAPCS and the gcc ARM options documentation for more information).
Related: ARM Link and frame pointer
In one of my applications, I need to efficiently de-interleave bits in a long stream of data. Ideally, I would like to use the BMI2 pext_u32() and/or pext_u64() x86_64 intrinsic instructions when available. I scoured the internet for documentation on x86intrin.h (GCC) but couldn't find much on the subject, so I am asking the gurus on StackOverflow to help me out.
Where can I find documentation about how to work with functions in x86intrin.h?
Does gcc's implementation of pext_*() already have code behind it to fall back on, or do I need to write the fallback code myself (for conditional compile)?
Is it possible to write a binary that automatically falls back to an alternate implementation if a target does not support the intrinsic? If so, how does one do so?
Is there a known programming pattern that will be recognized by GCC and automatically converted to pext_*() when compiling with optimization enabled and with -mbmi2?
Intel publishes the Intrinsics Guide, which also applies to GCC. You will have to write your own fallback code if you use these intrinsics.
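A hedged sketch of such a fallback, keyed off the __BMI2__ macro that GCC defines when BMI2 code generation is enabled (the wrapper name pext64 is made up for illustration):

#include <stdint.h>

#if defined(__BMI2__)
#include <x86intrin.h>
static inline uint64_t pext64(uint64_t x, uint64_t mask)
{
    return _pext_u64(x, mask);          /* hardware PEXT */
}
#else
/* Portable bit-by-bit fallback: for each set bit in mask (from LSB up),
 * copy the corresponding bit of x into the next result position. */
static inline uint64_t pext64(uint64_t x, uint64_t mask)
{
    uint64_t result = 0;
    uint64_t out_bit = 1;
    while (mask) {
        uint64_t lowest = mask & -mask;   /* isolate lowest set mask bit */
        if (x & lowest)
            result |= out_bit;
        out_bit <<= 1;
        mask &= mask - 1;                 /* clear lowest set mask bit */
    }
    return result;
}
#endif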
You can achieve automatic switching of implementations by using IFUNC resolvers, but for non-library code, using conditionals or function pointers is probably simpler.
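A minimal function-pointer sketch of that kind of dispatch, assuming the two implementations are defined elsewhere (the names and the use of GCC's __builtin_cpu_supports are illustrative):

#include <stddef.h>
#include <stdint.h>

/* Two versions assumed to exist: one built with BMI2 enabled, one generic. */
void deinterleave_bmi2(const uint64_t *src, uint64_t *dst, size_t n);
void deinterleave_generic(const uint64_t *src, uint64_t *dst, size_t n);

static void (*deinterleave)(const uint64_t *, uint64_t *, size_t);

static void pick_deinterleave(void)
{
    /* __builtin_cpu_supports consults CPU feature flags detected at startup. */
    if (__builtin_cpu_supports("bmi2"))
        deinterleave = deinterleave_bmi2;
    else
        deinterleave = deinterleave_generic;
}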
Looking at the gcc/config/i386/i386.md and gcc/config/i386/i386.c files, I don't see anything in GCC 8 which would automatically select the pext instruction without intrinsics in the source code.
The design philosophy of Intel's intrinsics is that you only use them in functions that will run only on CPUs with the required extensions. Checking for support before every instruction would add far too much overhead, and then there'd have to be a fallback (there isn't one).
Intel intrinsics are not like GNU C __builtin_popcountll (which does use a fallback if compiled without -mpopcnt). Note, however, that you can enable target options on a per-function basis with attributes.
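For example, with reasonably recent GCC or Clang (this is a sketch of the attribute mechanism, not code from the question), a single function can be compiled with BMI2 enabled even if the rest of the file is not built with -mbmi2; it must still only be called on CPUs that actually have BMI2:

#include <stdint.h>
#include <immintrin.h>

__attribute__((target("bmi2")))
uint64_t extract_bits_bmi2(uint64_t x, uint64_t mask)
{
    /* BMI2 code generation is enabled just for this function. */
    return _pext_u64(x, mask);
}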
I was searching for this in the MPLAB compiler user's guide but haven't found anything. I am asking here to confirm that I am not blind or anything:
The GCC compiler provides some very interesting and useful built-in functions like __builtin_constant_p(x). I have never found anything like that in the Microchip compilers, and I don't think there are any.
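For reference, this is roughly how that GCC builtin is typically used (the scale_runtime helper is made up for illustration): when the argument is a compile-time constant the whole expression can be folded, otherwise a normal function call is made.

#include <stdio.h>

static int scale_runtime(int x) { return x * 100 / 7; }

#define SCALE(x) (__builtin_constant_p(x) ? ((x) * 100 / 7) : scale_runtime(x))

int main(void)
{
    printf("%d\n", SCALE(14));   /* literal constant: folded at compile time */
    int v = 21;
    printf("%d\n", SCALE(v));    /* runtime value: calls scale_runtime() */
    return 0;
}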
So the question: Do Microchip XCxx Compilers provide any non-standard built-in functions apart from the device specific ones (like declaring variables at a given register address or declaring an interrupt function)?
EDIT: To clarify some more: I am mostly interested in retrieving information from the compiler. A good example would be something like __builtin_constant_p, as it makes information available to the program that is normally not accessible. But I am not limiting this question to finding constant expressions only.
Searching for "XC16 manual" in Google turns up http://ww1.microchip.com/downloads/en/DeviceDoc/50002071E.pdf; see appendix G.
The same document mentioned by @Marco van de Voort has a list of pre-defined macros in section 19.4 that give you information about the compiler environment and the device.
There is also the somewhat undocumented __DEBUG macro which is defined when running under MPLABX in debug mode (MPLABX defines this in the call to the compiler).
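A small sketch of how that macro is commonly used (debug_uart_puts is a made-up user routine, not an XC library function): extra diagnostics are compiled in only when MPLAB X builds the project in debug mode.

#ifdef __DEBUG
    #define DBG_PRINT(msg)  debug_uart_puts(msg)   /* only present in debug builds */
#else
    #define DBG_PRINT(msg)  ((void)0)              /* compiles to nothing in release */
#endif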
The XC16 compiler supports a set of builtins, e.g. __builtin_add.
For a complete description of the builtins see the MPLAB XC16 compiler user's manual (under "docs" folder of compiler installation) or here: http://www.microchip.com/mymicrochip/filehandler.aspx?ddocname=en559023
I'm writing a program using Intel intrinsics. I want to use the _mm_permute_pd intrinsic, which is only available on CPUs with AVX. For CPUs without AVX I can use _mm_shuffle_pd, but according to the specs it is much slower than _mm_permute_pd. Do the header files for Intel intrinsics define constants that allow me to distinguish whether AVX is supported, so that I can write something like this?
#ifdef __IS_AVX_SUPPORTED__ // is there something like this defined?
// use _mm_permute_pd
#else
// use _mm_shuffle_pd
#endif
I have found this tutorial, which shows how to perform a runtime check, but I need a static, compile-time check for the current machine.
GCC, ICC, MSVC, and Clang all define a macro __AVX__ which you can check. In fact, it's the only SIMD macro defined by all of those compilers (MSVC is the one that breaks the mold). This only tells you whether your code was compiled with AVX support (e.g. -mavx with GCC or /arch:AVX with MSVC); it does not tell you whether your CPU supports AVX. If you want to know whether the CPU supports AVX, you need to check CPUID. Here is an example of reading CPUID from all of those compilers: asm-in-c-error.
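As a hedged sketch of the runtime side with GCC or Clang (MSVC has __cpuid in <intrin.h> instead), using the <cpuid.h> helper:

#include <cpuid.h>

/* CPUID leaf 1, ECX bit 28 reports AVX. A fully correct check also tests
 * OSXSAVE (ECX bit 27) and uses XGETBV to confirm the OS saves the YMM
 * registers; this is the minimal version. */
static int cpu_has_avx(void)
{
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return (ecx & (1u << 28)) != 0;
}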
To do this properly I suggest you make a CPU dispatcher.
Edit: In case anyone wants to know how to use the values from CPUID to find out if AVX is available see https://github.com/Mysticial/FeatureDetector
I assume you are using Intel C++ Compiler. In this case - yes, there are such macros: Intel C++ Compiler Reference Guide: __AVX__, __AVX2__.
P.S. Be aware that if you compile your application with the AVX instruction set enabled, it will fail on CPUs that do not support AVX. If you are going to distribute your software as a source code package and compile it on the target machine, this may be a viable solution. Otherwise you should check for AVX dynamically.
P.P.S. There are several options for ICC. Take a look at the relevant compiler options and the references from them to others.
It seems to me that the only way is to compile and run a program that identifies whether AVX is available, then manually or automatically compile separate code with or without AVX functions. For VS 2013, I used my code in the commomAVX folder in the following to identify hasAVX (or not) and used this to execute one of two different BAT files to compile and link the appropriate program.
http://www.roylongbottom.org.uk/gigaflops-benchmarks.zip
My question was to help to identify a solution regarding the use of suitable compile options such as /arch:AVX.
I am compiling on a 64-bit architecture with the Intel C compiler. The same code built fine on a different 64-bit Intel architecture.
Now when I try to build the binaries, I get a message like "Skipping incompatible ../../libtime.a", indicating that the libtime.a I archived (from some object files I compiled) is not compatible. I googled and it seems this is usually the result of a 32-to-64-bit changeover or something similar, but the Intel C compiler doesn't seem to support a -64 or other memory-model option at compile time. How do I troubleshoot and fix this error?
You cannot mix 64-bit and 32-bit compiled code. Config instructions for Linux are here.
You need to determine the target processor of both the library and the new code you are building. This can be done in a few ways but the easiest is:
$ objdump -f ../../libtime.a otherfile.o
For libtime this will probably print out a bunch of entries, but they should all have the same target processor. Make sure that otherfile.o (substitute one of your own object files) also has the same architecture.
gcc has the -m32 and -m64 flags for switching from the default target to a similar processor with a different register and memory width (commonly x86 and x86_64); the Intel C compiler may also have these.
If this has not been helpful then you should include the commands (with all flags) used to compile everything and also information about the systems that each command was being run on.