Trying to port GCC specific asm goto to Clang - c

I've been trying to turn a bit of code that uses GNU extensions into actual standard C so it'll build on Clang. Knowing standard C but not the GNU extensions, I'm at a bit of a loss.
__asm__ goto("1:"
STATIC_KEY_INITIAL_NOP
".pushsection __jump_table, \"aw\" \n\t"
_ASM_ALIGN "\n\t"
_ASM_PTR "1b, %l[l_yes], %c0 \n\t"
".popsection \n\t"
: : "i" (key) : : l_yes);
I've tried to turn this into plain asm, but have yet to be successful.
If you're curious, this is part of a kernel that I've just about got building with Clang, apart from this one section.

You seem to be having a problem compiling arch/x86/include/asm/jump_label.h. The entire code snippet exists to support "jump label patching", a feature that lets debugging code (log printing, etc.) have near-zero overhead when debugging is disabled.
The implementation you encountered depends on GCC 4.5 or later, which added the asm goto statement that allows inline assembly to branch to a C label.
It appears that Clang/LLVM < 9.0.0 does NOT support asm goto.
As a quick fix to get your Linux kernel compiling properly, you can disable CONFIG_JUMP_LABEL in your kernel configuration. This config option is used to disable the optimisation when the compiler does NOT support asm goto properly.
Update: Initial support for asm goto was added to Clang in v9.0.0.
Initial support for asm goto statements (a GNU C extension) has been
added for control flow from inline assembly to labels. The main consumers of
this construct are the Linux kernel (CONFIG_JUMP_LABEL=y) and glib. There are
still a few unsupported corner cases in Clang's integrated assembler and
IfConverter. Please file bugs for any issues you run into.
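For illustration, here is a minimal sketch of the shape that asm goto code takes, with a plain C fallback for compilers that lack it. The HAVE_ASM_GOTO macro and branch_taken() are hypothetical names, not kernel APIs, and the version thresholds are simply the ones quoted above (GCC 4.5, Clang 9.0.0); vendor builds such as Apple Clang number their releases differently, so treat the check as an assumption rather than a reliable test.

```c
#include <stdbool.h>

/* Assumption: thresholds taken from the answer above (GCC 4.5, Clang 9). */
#if defined(__clang__)
#  define HAVE_ASM_GOTO (__clang_major__ >= 9)
#elif defined(__GNUC__)
#  define HAVE_ASM_GOTO (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 5))
#else
#  define HAVE_ASM_GOTO 0
#endif

/* Hypothetical stand-in for a static-branch helper. */
static bool branch_taken(void)
{
#if HAVE_ASM_GOTO
    /* The "nop" plays the role of the patch site. As written it never
     * jumps, so control always falls through to "return false". */
    __asm__ goto("nop" : : : : l_yes);
    return false;
l_yes:
    return true;
#else
    /* Fallback: behave as if the branch were permanently disabled. */
    return false;
#endif
}
```

In the kernel, the nop is later patched into a jump to l_yes at run time; nothing in this sketch does that patching.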

Related

Extended asm with goto, including an example from the gcc docs, fails to compile

Some extended assembly statements using the goto qualifier fail to compile with GCC 10.1.0. Specifically,
int foo(int count)
{
asm goto ("dec %0; jb %l[stop]"
: "+r" (count)
:
:
: stop);
return count;
stop:
return 0;
}
(which is an example in the GCC extended asm docs) fails to compile with the message expected ‘:’ before string constant. Removing the "+r" (count) and the dec %0 allows it to compile successfully, but regardless of what I try whenever an output operand is supplied in the same asm statement as a goto label, it errors in this same way.
It appears that the current development GCC documentation you are referencing applies to the latest GCC trunk and not to any official release. Official GCC releases do not, as of this writing, support asm goto with any output or input/output constraints. You can see this on godbolt: the latest trunk works, but 10.2 and 10.1 don't. Your options are to wait for the next major release of GCC (version 11.x), to download and build the latest trunk, or to modify your inline assembly so that it doesn't rely on any output or input/output constraints.
Up until recently the documentation for GCC up to version 10.x had this to say:
An asm goto statement cannot have outputs. This is due to an internal
restriction of the compiler: control transfer instructions cannot have
outputs. If the assembler code does modify anything, use the "memory"
clobber to force the optimizers to flush all register values to memory
and reload them if necessary after the asm statement.
A list of all the documentation for the official releases and the current development documentation can be found at this URL. The current development documentation is at the bottom of the page. Rule of thumb is that you should consult the documentation for your specific version of GCC. I believe that all 10.x release documentation is the same as the latest 10.x version of the documentation on the GCC webpage.
Recent versions of Clang/LLVM (11.0+) do support this feature, but that is a relatively recent addition as well.
asm goto doesn't allow output operands.
This is a GNU decision. In the function c_parser_asm_statement in c-parser.c you can find:
/* For asm goto, we don't allow output operands, but reserve
the slot for a future extension that does allow them. */
https://github.com/gcc-mirror/gcc/blob/releases/gcc-10/gcc/c/c-parser.c
However, this situation may change, since the comment is no longer present on the master branch.
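One way to avoid output operands on compilers that reject them is to pass the address of the variable as an input and add a "memory" clobber, as the older documentation suggests. The following is only a sketch under that assumption: dec_or_stop() is a hypothetical helper, the assembly is x86 AT&T syntax, and other targets take the plain C path.

```c
/* Hypothetical helper: decrement count; return -1 if it reached zero,
 * otherwise return the decremented value. */
static int dec_or_stop(int count)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* No output operands: the asm updates count through its address,
     * and the "memory" clobber makes the compiler reload it afterwards. */
    __asm__ goto("decl (%0)\n\t"
                 "jz %l[stop]"
                 : /* no outputs */
                 : "r" (&count)
                 : "memory", "cc"
                 : stop);
    return count;
stop:
    return -1;
#else
    /* Portable fallback with the same behaviour. */
    return (--count == 0) ? -1 : count;
#endif
}
```

Going through memory costs a store and a reload compared with a "+r" operand, which is the price of staying compatible with releases that lack asm goto outputs.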

Is `__asm nop` the Windows equivalent of `asm volatile("nop");` from GCC compiler

In Windows, can __asm nop be swapped for asm volatile("nop"); (used in GCC compiler) and yield the same result?
I have read that volatile() (in GCC) guarantees the call will not be optimized away. However, it doesn't port directly to Windows, and I was curious if it can simply be removed or if it needs to be replaced with a similar construct.
The __asm keyword implementation is quite simplistic in MSVC. It always emits the machine code unaltered, and the optimizer doesn't touch it. Nor does the compiler make any assumptions about machine state after the __asm, which has a knack for defeating other optimizations.
So, no, nothing similar to volatile() is required; the instruction can't disappear. Plain __asm { nop } will always survive unscathed and is equivalent to the GCC assembly.
Do keep in mind that inline assembly is not a good long-term strategy: support for it was removed completely in the x64 compiler and is pretty unlikely to ever come back. You'll have to fall back to intrinsics or link code written in assembly and assembled with, say, ml64.exe. That does defeat NOP injection, but code alignment is already well taken care of by the optimizer and doesn't need help, which is also the reason you probably should not do this at all.
For the Microsoft compiler, use the __nop() intrinsic function to emit a nop instruction without handicapping the compiler's optimizer. This would also be cross-platform across all Windows targets (32 bit ARM V7, 64 bit ARM V8, IA32, X64).
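Putting the two answers together, a small wrapper macro can hide the compiler difference. This is a sketch only: EMIT_NOP and nop_delay() are hypothetical names, and the last branch assumes that doing nothing is acceptable on a compiler that is neither MSVC nor GCC-compatible.

```c
#if defined(_MSC_VER)
#  include <intrin.h>
#  define EMIT_NOP() __nop()                        /* MSVC intrinsic */
#elif defined(__GNUC__)
#  define EMIT_NOP() __asm__ __volatile__("nop")    /* GCC and Clang */
#else
#  define EMIT_NOP() ((void)0)                      /* assumption: no-op fallback */
#endif

/* Hypothetical helper: emit n NOPs and return how many were emitted. */
static int nop_delay(int n)
{
    int i;
    for (i = 0; i < n; i++)
        EMIT_NOP();
    return i;
}
```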

How to check with Intel intrinsics if AVX extensions is supported by the CPU?

I'm writing a program using Intel intrinsics. I want to use the _mm_permute_pd intrinsic, which is only available on CPUs with AVX. For CPUs without AVX I can use _mm_shuffle_pd, but according to the specs it is much slower than _mm_permute_pd. Do the header files for Intel intrinsics define constants that allow me to distinguish whether AVX is supported, so that I can write something like this?
#ifdef __IS_AVX_SUPPORTED__ // is there something like this defined?
// use _mm_permute_pd
#else
// use _mm_shuffle_pd
#endif
I have found this tutorial, which shows how to perform a runtime check, but I need to do a static, compile-time check for the current machine.
GCC, ICC, MSVC, and Clang all define a macro __AVX__ which you can check. In fact it's the only SIMD constant defined by all those compilers (MSVC is the one that breaks the mold). It only tells you whether your code was compiled with AVX support (e.g. -mavx with GCC or /arch:AVX with MSVC); it does not tell you whether your CPU supports AVX. If you want to know whether the CPU supports AVX, you need to check CPUID. The answer linked here, asm-in-c-error, has an example that reads CPUID with all those compilers.
To do this properly I suggest you make a CPU dispatcher.
Edit: In case anyone wants to know how to use the values from CPUID to find out if AVX is available see https://github.com/Mysticial/FeatureDetector
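As a sketch of the compile-time check, the snippet from the question could be written with __AVX__ like this. swap_pair() and HAVE_X86_SIMD are hypothetical names, and the scalar branch is an added assumption so that the code still builds on targets without SSE2.

```c
#if defined(__AVX__) || defined(__SSE2__) || defined(_M_X64)
#  include <immintrin.h>
#  define HAVE_X86_SIMD 1
#else
#  define HAVE_X86_SIMD 0
#endif

/* Hypothetical helper: writes {in[1], in[0]} to out. */
static void swap_pair(const double in[2], double out[2])
{
#if defined(__AVX__)
    /* Built with AVX enabled (e.g. -mavx or /arch:AVX). */
    __m128d v = _mm_loadu_pd(in);
    _mm_storeu_pd(out, _mm_permute_pd(v, 0x1));
#elif HAVE_X86_SIMD
    /* SSE2 only: the same swap expressed with _mm_shuffle_pd. */
    __m128d v = _mm_loadu_pd(in);
    _mm_storeu_pd(out, _mm_shuffle_pd(v, v, 0x1));
#else
    /* Assumption: plain C on targets without SSE2. */
    out[0] = in[1];
    out[1] = in[0];
#endif
}
```

Which branch is compiled depends only on the compiler flags, not on the machine that eventually runs the binary.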
I assume you are using Intel C++ Compiler. In this case - yes, there are such macros: Intel C++ Compiler Reference Guide: __AVX__, __AVX2__.
P.S. Be aware that if you compile your application with the AVX instruction set enabled, it will fail on CPUs that do not support AVX. If you are going to distribute your software as a source code package and compile it on the target machine, this may be a viable solution. Otherwise you should check for AVX dynamically.
P.P.S. There are several options for ICC. Take a look at the following compiler options and at the other options they reference.
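For the dynamic check, GCC and Clang on x86 provide the __builtin_cpu_supports builtin, which avoids hand-written CPUID code. The sketch below assumes one of those compilers; cpu_has_avx(), select_impl() and the two impl_* functions are hypothetical placeholders for real AVX and non-AVX code paths, and other compilers or targets simply report no AVX.

```c
/* Returns 1 if the running CPU reports AVX, 0 otherwise. */
static int cpu_has_avx(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx") ? 1 : 0;
#else
    return 0;   /* assumption: treat unknown targets as "no AVX" */
#endif
}

/* Placeholders standing in for the two real code paths. */
static const char *impl_avx(void)  { return "avx"; }
static const char *impl_sse2(void) { return "sse2"; }

typedef const char *(*impl_fn)(void);

/* Tiny dispatcher: pick an implementation once, at run time. */
static impl_fn select_impl(void)
{
    return cpu_has_avx() ? impl_avx : impl_sse2;
}
```

In a real dispatcher the AVX path would live in a separate translation unit compiled with AVX enabled, so the rest of the program stays runnable on older CPUs.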
It seems to me that the only way is to compile and run a program that identifies whether AVX is available, and then manually or automatically compile separate code with or without AVX functions. For VS 2013, I would use my code in the commomAVX folder in the following archive to identify hasAVX (or not) and use this to execute one of two different BAT files to compile and link the appropriate program.
http://www.roylongbottom.org.uk/gigaflops-benchmarks.zip
My question was to help to identify a solution regarding the use of suitable compile options such as /arch:AVX.

Getting Intel-syntax asm output from icc, instead of the default AT&T syntax?

I am stuck at a problem. I've been using gcc to compile/assemble my C code for a while and got used to reading Intel assembly syntax. I used the -masm=intel flag when generating the assembly files.
Recently, due to company migrations, they obtained Intel's icc, claiming it is better. So now I need to use icc, but strangely its default assembly syntax is AT&T. I tried to change it but it didn't work, so I contacted Intel support; they didn't know either, and each person gave me a contradictory answer.
Is there a way to integrate gcc and icc so that I use icc's compiling "superiority" while at the same time compiling to intel's syntax with gcc?
I am using Ubuntu with icc version 12.x.
This flag?
-use_msasm Support Microsoft style assembly language insertion
using MASM style syntax and, if requested, output assembly
in MASM format
https://web.archive.org/web/20120728043315/http://amath.colorado.edu/computing/software/man/icc.html
It seems that -masm=intel works in ICC just like in Clang and GCC, at least in the latest version currently on Compiler Explorer (13.0.1). I tried loading the sum over array example and it generates the assembly below:
testFunction(int*, int):
xor eax, eax #2.11
test esi, esi #3.23
jle ..B1.18 # Prob 50% #3.23
movsxd rdx, esi #3.3
...
whereas specifying -use_msasm, as in Steve-o's answer, doesn't work at all.
The official man page from Intel says that the option is -use-msasm and not -use_msasm, but that doesn't work either:
-use-msasm (i32, i32em only)
Support Microsoft* style assembly language insertion using MASM style syntax and, if requested, output assembly in MASM format.
Note: GNU inline assembler (asm) code and Microsoft inline assembler (msasm) code cannot be used together in the same translation unit.
However, that's for ICC 9.x from 2006, which is a long time ago, and the option might have changed somewhere between 9.x and 13.x.
I dug a little further and realized that, at least since ICC 16.0, the option only applies to assembly blocks in source code and not to outputting Intel syntax:
use-msasm
Enables the use of blocks and entire functions of assembly code within a C or C++ file.
Description
This option enables the use of blocks and entire functions of assembly code within a C or C++ file.
It allows a Microsoft* MASM-style inline assembly block not a GNU*-style inline assembly block.
Alternate Options
-fasm-blocks
As you can see, it's just an alias for -fasm-blocks. Moreover, the -use-asm option was deprecated, although I don't know the fate of -use-msasm.
References
Intel® C++ Compiler for Linux* - 9.x manuals
Intel® C++ Compiler 16.0 User and Reference Guide
Intel® C++ Compiler 17.0 Developer Guide and Reference

Cygwin gcc - asm error:

I have a project written in C that was originally developed on Linux but must now be built on Windows. Part of the code includes this line in several places:
asm("movl temp, %esp");
But that causes an "undefined reference to `temp'" error.
This has no problem compiling on Linux using the gcc 4.3.2 compiler (tested on another machine), which is the version I have on Cygwin. Is there another way to accomplish what this line is doing?
You need to change the cygwin version from
asm("movl temp, %esp");
to
asm("movl _temp, %esp");
Yes, it's the same compiler and assembler but they are set up differently for compatibility with the host system.
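If you would rather not hard-code the underscore, GCC and Clang predefine __USER_LABEL_PREFIX__, which expands to whatever prefix the target puts in front of C symbols (nothing on Linux, an underscore on 32-bit Cygwin). A sketch, with ASM_STR, ASM_SYM and temp_asm_name() as hypothetical helper names:

```c
/* Assumption: compilers without the macro use no symbol prefix. */
#ifndef __USER_LABEL_PREFIX__
#  define __USER_LABEL_PREFIX__
#endif

#define ASM_STR_(x) #x
#define ASM_STR(x)  ASM_STR_(x)

/* Assembler-level name of the C symbol `name`, as a string literal. */
#define ASM_SYM(name) ASM_STR(__USER_LABEL_PREFIX__) #name

/* Intended use in the question's code (not executed here):
 *     asm("movl " ASM_SYM(temp) ", %esp");
 */
static const char *temp_asm_name(void)
{
    return ASM_SYM(temp);   /* "temp" or "_temp", depending on the target */
}
```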
You can isolate the system-dependent symbol prefixing by simply telling gcc a specific name to use:
int *temp asm("localname");
...
__asm__("movl localname,%esp");
This avoids an #if of some sort and removes a host OS dependency but adds a compiler dependency. Speaking of compiler extensions, some people would write (see info as) something like:
#ifdef __GNUC__
__asm__("movl %[newbase],%%esp"
:
: [newbase] "r,m" (temp)
: "%esp");
#else
#error haven't written this yet
#endif
The idea here is that this syntax allows the compiler to help you out by finding temp, even if it's lying about in a register or takes a few instructions to load, and also to avoid conflicting with you in case it was using a register you clobbered. It needs this because a plain __asm__() is not parsed in any way by the compiler.
In your case, you seem to be implementing your own threading package, and so none of this really matters. Gcc wasn't about to use %esp for a calculation. (But why not just use pthreads...?)
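To show the operand-based form in something that is safe to run, here is a sketch that reads the global into an ordinary register instead of %esp. The compiler substitutes the correct reference to temp, so no symbol prefix appears in the source. read_temp() is a hypothetical helper, the initial value is made up for the example, and non-x86 targets take the plain C path.

```c
int temp = 1234;   /* stand-in for the question's global; value is arbitrary */

/* Hypothetical helper: copy temp into a register via extended asm. */
static int read_temp(void)
{
    int value;
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* %[src] is filled in by the compiler, so the same source works
     * whether the assembler-level symbol is temp or _temp. */
    __asm__("movl %[src], %[dst]"
            : [dst] "=r" (value)
            : [src] "rm" (temp));
#else
    value = temp;   /* portable fallback */
#endif
    return value;
}
```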
