Translation of CUDA inline asm from GAS to Intel - c

I have some C-CUDA code that contains inline PTX assembly, which compiles OK on Linux with g++ backend.
I need to build it under Windows, and the MSVC backend clearly does not recognize the inline asm properly: it gives errors like "not an asm string". I assume it has to do with the syntax this PTX assembly is written in, for example:
asm volatile ("subc.cc.u32 %0, %0, "q2_s";": "+r"(c[2]));
asm volatile ("subc.cc.u32 %0, %0, "q3_s";": "+r"(c[3]));
I don't know much about assembly, and am wondering: is there a translator from GAS (AT&T) style to Intel syntax?
Or is there a workaround to build the CUDA kernels to PTX on Linux, and then build the PTX and link it with the remaining code on Windows?
I've tried that, but the PTX compiler on Linux gives the kernel functions unrecognizable names starting with _Z, and the linker does not know how to link them.

It turns out the problem was not with the inline asm but with preprocessing. For example, the asm string
asm volatile ("subc.cc.u32 %0, %0, "q2_s";": "+r"(c[2]));
relied on this define
#define q2_s "0xAF48A03B"
On Linux it compiled without errors, but on Windows gave "expected an asm string" error.
So the workaround for Windows was just to hardcode the hex values in the asm strings. It has nothing to do with assembly syntax; sorry for the confusion.
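As a rough illustration of what the Linux build relied on, the host-side C sketch below shows the text that adjacent string literals and the q2_s define are expected to collapse into. The names Q2_HEX, STR and the two helper functions are hypothetical and not part of the original code, and the sketch only builds the string; it does not run any PTX or claim to fix the MSVC error.

```c
/* The define from the question: the constant is already a string literal. */
#define q2_s "0xAF48A03B"

/* Hypothetical alternative: keep the constant numeric and stringify it. */
#define Q2_HEX 0xAF48A03B
#define STR_(x) #x
#define STR(x) STR_(x)

/* Adjacent string literals are joined into a single literal by the compiler. */
static const char *ptx_from_string_macro(void)
{
    return "subc.cc.u32 %0, %0, " q2_s ";";
}

/* Same text, built by stringifying the numeric macro. */
static const char *ptx_from_numeric_macro(void)
{
    return "subc.cc.u32 %0, %0, " STR(Q2_HEX) ";";
}
```

Both functions return the hardcoded form used in the workaround, which is why writing the hex value directly into the asm string is equivalent.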

Related

Equivalent for NOP in C for Embedded?

I use KEIL to compile a program.
The program uses the code
asm("NOP");
Unfortunately, the Keil compiler does not accept the statement.
The idea is to introduce a delay by using NOP (no operation) assembly code.
What is the actual equivalent of this in C ? Does this vary with the embedded controller that I use?
There's an intrinsic nop in most compilers, Keil should have this as well - try __nop()
See - https://www.keil.com/support/man/docs/armcc/armcc_chr1359124998347.htm
Intrinsic functions are usually safer than directly adding assembly code for compatibility reasons.
Does this vary with the embedded controller that I use?
Yes. Inline assembly is not part of the C standard (yet), it varies from compiler to compiler and sometimes even between different target architectures of the same compiler. See Is inline asm part of the ANSI C standard? for more information.
For example, for the C51 Keil compiler, the syntax for inline assembly is
...
#pragma asm
NOP
#pragma endasm
...
while for ARM, the syntax is something like
...
__asm {
NOP
}
...
You will need to check the manual for the actual compiler you are using.
For some of the more common opcodes, some compilers provide so-called intrinsics. These can be called like a C function but essentially insert assembly code, like _nop_().
If you are using Keil for ARM Cortex target (e.g. stm32), you are most probably also using CMSIS library. It has portable macros and inline functions for all assembly instructions written like this: __NOP().
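The compiler-dependent variants mentioned above could be hidden behind one macro, roughly as sketched here. DELAY_NOP and delay_nops are hypothetical names, and the sketch assumes the usual predefined macros (__C51__ for Keil C51, __CC_ARM for armcc, __GNUC__ for GCC and Clang); check your compiler manual before relying on it.

```c
/* Pick a NOP primitive per toolchain; the macro name DELAY_NOP is hypothetical. */
#if defined(__C51__)                 /* Keil C51 */
#include <intrins.h>
#define DELAY_NOP() _nop_()
#elif defined(__CC_ARM)              /* Keil/ARM armcc */
#define DELAY_NOP() __nop()
#elif defined(__GNUC__)              /* GCC, Clang */
#define DELAY_NOP() __asm__ __volatile__("nop")
#else
#define DELAY_NOP() ((void)0)        /* unknown compiler: no delay at all */
#endif

/* Execute `count` NOPs and report how many were issued. */
static unsigned delay_nops(unsigned count)
{
    unsigned issued = 0;
    while (issued < count) {
        DELAY_NOP();
        issued++;
    }
    return issued;
}
```

The real delay depends on the clock rate and on the loop overhead around each NOP, so a hardware timer is usually the better choice when the duration matters.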

Trying to port GCC specific asm goto to Clang

I've been trying to turn a bit of code that uses GNU extensions into actual standard C so it will build with clang. Knowing standard C but not GNU extensions, I'm at a bit of a loss.
__asm__ (goto("1:"
STATIC_KEY_INITIAL_NOP
".pushsection __jump_table, \"aw\" \n\t"
_ASM_ALIGN "\n\t"
_ASM_PTR "1b, %l[l_yes], %c0 \n\t"
".popsection \n\t"
: : "i" (key) : : l_yes););
I've tried to turn this in to actual asm, but have yet to be successful.
If you're curious, this is part of a kernel I've just about got to build on clang, besides that one section.
You seem to be having a problem compiling arch/x86/include/asm/jump_label.h. The entire code snippet is there to enable support for "jump label patching", a feature that allows debugging (printing logs etc.) with near-zero overhead when debugging is disabled.
The implementation you encountered depends on gcc (v4.5), which added a new asm goto statement that allows branching to a label.
It appears that Clang/LLVM < 9.0.0 does NOT support asm goto.
As a quick fix to get your Linux kernel compiling properly, you can disable CONFIG_JUMP_LABEL in your kernel configuration. This config option is used to disable the optimisation when the compiler does NOT support asm goto properly.
Update: Initial support for asm goto was added to Clang in v9.0.0.
Initial support for asm goto statements (a GNU C extension) has been
added for control flow from inline assembly to labels. The main consumers of
this construct are the Linux kernel (CONFIG_JUMP_LABEL=y) and glib. There are
still a few unsupported corner cases in Clang's integrated assembler and
IfConverter. Please file bugs for any issues you run into.
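For readers unfamiliar with the construct, here is a minimal sketch of the asm goto syntax, unrelated to the kernel's jump table. The function name reaches_label is hypothetical, the asm branch assumes an x86 target with GCC 4.5+ or Clang 9+, and other targets fall back to a plain C goto.

```c
/* Returns 1 when control reaches l_yes. */
static int reaches_label(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* No outputs, inputs or clobbers; the last section lists the jump targets. */
    __asm__ goto("jmp %l[l_yes]" : : : : l_yes);
#else
    goto l_yes;   /* fallback for other targets: plain C goto */
#endif
    return 0;
l_yes:
    return 1;
}
```

The compiler treats every label in the final section as a possible successor of the asm statement, which is what lets the kernel patch the branch at run time.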

Cygwin gcc - asm error:

I have a project written in C that was originally developed on Linux, but now must be built on Windows. Part of the code includes this line in several places
asm("movl temp, %esp");
But that causes an "undefined reference to `temp'" error.
This has no problem compiling on Linux using the gcc 4.3.2 compiler (tested on another machine), which is the version I have on Cygwin. Is there another way to accomplish what this line is doing?
You need to change the cygwin version from
asm("movl temp, %esp");
to
asm("movl _temp, %esp");
Yes, it's the same compiler and assembler but they are set up differently for compatibility with the host system.
You can isolate the system-dependent symbol prefixing by simply telling gcc a specific name to use:
int *temp asm("localname");
...
__asm__("movl localname,%esp");
This avoids an #if of some sort and removes a host OS dependency but adds a compiler dependency. Speaking of compiler extensions, some people would write (see info as) something like:
#ifdef __GNUC__
__asm__("movl %[newbase],%%esp"
:
: [newbase] "r,m" (temp)
: "%esp");
#else
#error haven't written this yet
#endif
The idea here is that this syntax allows the compiler to help you out, by finding temp even if it's lying about in a register or takes a few instructions to load, and also to avoid conflicting with you, in case it was using a register you clobbered. It needs this because a plain __asm__() is not parsed in any way by the compiler.
In your case, you seem to be implementing your own threading package, and so none of this really matters. Gcc wasn't about to use %esp for a calculation. (But why not just use pthreads...?)
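Another way to isolate the prefix, sketched under the assumption that the compiler is GCC or Clang, is the predefined __USER_LABEL_PREFIX__ token, which expands to the prefix the target puts on C symbols (an underscore on some targets, nothing on others). ASM_SYM and asm_name_of_temp are hypothetical helpers, and the sketch only builds the symbol name as a string.

```c
/* GCC and Clang predefine __USER_LABEL_PREFIX__ as a bare token ("_" or nothing). */
#ifndef __USER_LABEL_PREFIX__
#define __USER_LABEL_PREFIX__        /* assumption: no prefix if the compiler is silent */
#endif

#define STR_(x) #x
#define STR(x) STR_(x)

/* ASM_SYM is a hypothetical helper: the assembler-level name of a C symbol. */
#define ASM_SYM(name) STR(__USER_LABEL_PREFIX__) #name

/* Intended use (32-bit x86 only): __asm__("movl " ASM_SYM(temp) ", %esp"); */
static const char *asm_name_of_temp(void)
{
    return ASM_SYM(temp);
}
```

On a target with no prefix this yields "temp", and on one that prepends an underscore it yields "_temp", so the same source line could serve both builds.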

gcc inline assembly for context-switching

I am trying to implement a context switch using gcc for m68k processors. I need to use inline assembly to save all the registers d0, d1...d7 and a0,...a7. I was wondering if I can use a loop in my inline assembly that would let me save these registers instead of writing a separate line of code for each register.
For example:
move.l %d0, temp
pcb.cpuregs.d0 = temp
I want the 0 in d0 to act like a loop counter.
Here you go:
MOVEM.L D0-D7/A0-A7,-(A7) ;Save registers onto stack.
You don't have to use the stack, you can use some other address.
I have a feeling that the pre-decrement mode is compulsory,
but I can't test that right now as I don't have a 68k machine.
p.s. that's probably not gcc dialect, seeing as gcc didn't exist when
I wrote that code, but I'm sure you can figure it out.
p.p.s why not use setjmp instead of inline assembly?
then your context switcher would be semi-portable.
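A minimal sketch of the setjmp idea, in standard C, is given below. The names saved_context, worker and run_worker_once are hypothetical. Note that setjmp/longjmp on their own only jump back to a context that is still live on the same stack, so a real context switcher would still need a separate stack per task.

```c
#include <setjmp.h>

static jmp_buf saved_context;      /* execution context captured by setjmp */
static volatile int work_done;     /* volatile: modified between setjmp and longjmp */

static void worker(void)
{
    work_done++;
    longjmp(saved_context, 1);     /* restore the saved context; setjmp returns 1 */
}

/* Save a context, run the worker, and come back through longjmp. */
static int run_worker_once(void)
{
    work_done = 0;
    if (setjmp(saved_context) == 0) {
        worker();                  /* does not return normally */
        return -1;
    }
    return work_done;              /* reached only via longjmp */
}
```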
You may want to consider macros:
#define SAVE_REG_DXX(no) __asm__ __volatile__("move.l %%d" #no ", %0" : "=g" (pcb.cpuregs.d ## no))
SAVE_REG_DXX(0);
SAVE_REG_DXX(1);
SAVE_REG_DXX(2);
#undef SAVE_REG_DXX
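You can't use a C-style for loop inside the asm block, and the asm template has to be a string literal, so it cannot be assembled at run time. You can, however, let the preprocessor build that string for you, as the macro above does.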
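To avoid writing one line per register, an X-macro can play the role of the loop at preprocessing time. The sketch below only generates the template strings so that it builds on any host; DATA_REGS, SAVE_TEMPLATE and save_templates are hypothetical names.

```c
/* X-macro listing the data registers; DATA_REGS is a hypothetical name. */
#define DATA_REGS(X) X(0) X(1) X(2) X(3) X(4) X(5) X(6) X(7)

/* Same stringizing trick as SAVE_REG_DXX, but only producing the template text. */
#define SAVE_TEMPLATE(no) "move.l %%d" #no ", %0",

/* One asm template per data register, generated by the preprocessor. */
static const char *const save_templates[] = { DATA_REGS(SAVE_TEMPLATE) };

enum { SAVE_TEMPLATE_COUNT = sizeof save_templates / sizeof save_templates[0] };
```

On the m68k target, passing a macro that expands to SAVE_REG_DXX(no); instead of SAVE_TEMPLATE would emit the eight save statements from a single line.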

x86 inline assembler flag

Silly question, but I just can not find the necessary flag in gcc. Basically, I have in my C program the following inline assembler code
__asm__ __volatile__ ("lea ebx, [timings] \n\t");
When compiling, I get an error message which says: Error: invalid char '[' beginning operand 2 `[timings]'
Now I remember that a long time ago I used some kind of flag that told the compiler that it is x86 inline assembly. But I can't find it online; could someone please tell me which flag I have to use?
Thank you so much!
You can't specify variables that way with GCC. See this document for a detailed description of how to use inline assembler. Also, keep in mind that GCC uses AT&T syntax, not Intel syntax, so you have to put your destinations on the right.
Try using __asm__ instead. Look here for more.
Also, try removing the \n\t from inside the assembly code.
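A sketch of the AT&T-style version with operand constraints might look like the following. The array size and the function name timings_address are assumptions, and the asm branch only applies to x86 with GCC or Clang; other targets use plain C so the sketch still builds. GCC's x86 back end also has a -masm=intel option that switches the dialect, which may be the flag the question has in mind, but variables would still need to be passed through constraints.

```c
static int timings[16];              /* size chosen arbitrarily for the sketch */

/* Return the address of timings, using lea on x86 and plain C elsewhere. */
static int *timings_address(void)
{
    int *p;
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* AT&T order: source first, destination last; operands come from constraints. */
    __asm__ __volatile__("lea %1, %0" : "=r"(p) : "m"(timings[0]));
#else
    p = &timings[0];                 /* non-x86 fallback */
#endif
    return p;
}
```

Letting the compiler choose the register through "=r" also avoids hardcoding ebx, which the compiler may already be using for something else.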

Resources