::"r" vs :"=r" assembly clarification

I'm trying to understand inline assembly syntax so that I can write code that is first correct and second efficient.
This example uses "=r":
asm volatile ("MRS %0, PMUSERENR_EL0\n": "=r"(value));
This reads the register and stores its value in the variable value. The other example uses :: "r":
asm volatile ("MSR PMUSERENR_EL0, %0\n":: "r"(value));
This writes the value variable to the PMUSERENR_EL0 register. Here is another example of it: How to measure program execution time in ARM Cortex-A8 processor?
When I attempt to compile simple test code with the two statements above, I get the error: :9:2: error: output operand constraint lacks '='. If I add the "=" and remove one ":", it compiles, but when I run it, it simply says Illegal instruction.
If someone could explain the difference, that would be helpful; numerous assembly tutorials show the same format but give no explanation. It's on a 64-bit ARM platform, if that offers any insight. Thanks.

Found the answer in the book: Professional Assembly Language: Extended ASM
If no output values are associated with the assembly code, the section
must be blank, but two colons must still separate the assembly code
from the input operands.
The reason is simply that this is how the syntax is defined. One colon for outputs and two for inputs.

You have rightly pointed to the explanation, but wrongly formulated the statement "One colon for outputs and two for inputs".
The extended asm format is shown below:
asm("assembly code" : outputs : inputs : clobbers)
If you do not have outputs, you must still include the colon that introduces them, which gives:
asm("assembly code"::inputs:clobbers)
If you do not have clobbers (changed registers), you can omit the last colon, which gives:
asm("assembly code":outputs:inputs)
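A minimal sketch of the three layouts, for illustration only. The helper names (load_constant, consume, add) are hypothetical and do not come from the thread, and the instructions are harmless stand-ins for MRS/MSR, which normally need a privileged AArch64 context. The plain C branches are an assumption for targets other than x86 and AArch64.

```c
#include <stdio.h>

/* Output operand only: one colon, then the output list. */
static int load_constant(void)
{
    int v;
#if defined(__x86_64__) || defined(__i386__)
    __asm__ ("movl $42, %0" : "=r"(v));
#elif defined(__aarch64__)
    __asm__ ("mov %w0, #42" : "=r"(v));
#else
    v = 42;                      /* plain C fallback for other targets */
#endif
    return v;
}

/* Input operand only: the output section is empty, so two colons
 * appear back to back. The template is empty, so nothing executes;
 * the statement only forces v into a register. */
static void consume(int v)
{
    __asm__ volatile ("" :: "r"(v));
}

/* template : outputs : inputs : clobbers */
static int add(int a, int b)
{
    int sum;
#if defined(__x86_64__) || defined(__i386__)
    /* "0" ties input a to operand 0, so sum starts out holding a.
     * addl changes the flags, which the "cc" clobber declares. */
    __asm__ ("addl %2, %0" : "=r"(sum) : "0"(a), "r"(b) : "cc");
#elif defined(__aarch64__)
    /* Three-operand add; no flags are touched, so no clobber list. */
    __asm__ ("add %w0, %w1, %w2" : "=r"(sum) : "r"(a), "r"(b));
#else
    sum = a + b;
#endif
    return sum;
}
```

Calling add(3, 4) should return 7 on any of the three paths. Writing "=r" in the input section, or "r" in the output section, is what produces the "output operand constraint lacks '='" style of error.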

Related

What's the meaning of __asm ("LOS_##_ns")

I'm trying to gain some insight into how Apple's OS signpost implementation works. I'm working with the C API (there is also a Swift API). My ultimate goal is to build a RAII-style C++ wrapper class for them, which is harder than it might seem.
Expanding the os_signpost_emit_with_type macro reveals that it creates static strings from the string literals passed to that macro that look like this:
__attribute__((section("__TEXT,__oslogstring,cstring_literals"), internal_linkage)) static const char string_name[] __asm (OS_STRINGIFY(OS_CONCAT(LOS_##_ns, __COUNTER__))) = "string literal";
These strings will later appear as names for the signposts in the Instruments profiler. What I get from reading that code is that the string is placed in a specific section of the binary so that the profiler can find it. What's confusing me is the __asm statement before the assignment. Obviously, via the __COUNTER__ macro, it expands to something like __asm ("LOS_##_ns0"), __asm ("LOS_##_ns1"), with the number being unique for every string. I have very little in-depth knowledge when it comes to assembly; I tried to research the meaning of that statement a bit but got no useful results.
My trial-and-error testing revealed that the uniqueness of the numerical suffix generated by the __COUNTER__ macro matters: if two duplicated values occur, the string with that duplicated value will shadow the other one in the profiler output.
Can anyone with assembly know-how explain what's going on here to a C++ developer like me?
Bonus question: Would there be any way to generate that instruction from within C++ code where the unique numerical value generated here by __COUNTER__ would be taken from some variable?

A general note: for information on clang extensions, you generally have to refer to the gcc documentation instead. clang aims to be compatible with gcc and so they didn't bother to write independent docs.
So in your example, a few different extensions are being used. Note that none of them are part of standard C or C++.
__attribute__((section("foo"))) places the variable in the section named foo, by having the compiler emit a .section directive into the assembly before placing the label for the variable. See https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/Common-Variable-Attributes.html#Common-Variable-Attributes. It sounds like you already know about this.
asm in a declaration isn't really inline assembly per se; it simply tells the compiler what symbol name to use for this variable when it emits the assembly code. The __asm is just a variant spelling of asm. See https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/Asm-Labels.html#Asm-Labels. So int foo asm("bar") = 7; defines a variable which will be referred to as foo in C source, but whose label in assembly will be named bar.
__COUNTER__ is a special macro defined by the gcc/clang preprocessor that simply increments every time it is expanded. See https://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html#Common-Predefined-Macros
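A small sketch of how the asm label and __COUNTER__ pieces can combine. The STRINGIFY/CONCAT helpers and the demo_label_ prefix are made up for illustration and are not Apple's actual macros; the section attribute is left out to keep the example portable across GCC and clang targets.

```c
#include <string.h>

/* Two-level helpers so that __COUNTER__ is expanded before pasting
 * and stringifying (hypothetical stand-ins for OS_CONCAT/OS_STRINGIFY). */
#define STRINGIFY2(x) #x
#define STRINGIFY(x) STRINGIFY2(x)
#define CONCAT2(a, b) a##b
#define CONCAT(a, b) CONCAT2(a, b)

/* C source refers to first_name and second_name; the assembly output
 * uses the labels demo_label_<N>, with a different N for each. */
static const char first_name[]
    __asm__(STRINGIFY(CONCAT(demo_label_, __COUNTER__))) = "first";
static const char second_name[]
    __asm__(STRINGIFY(CONCAT(demo_label_, __COUNTER__))) = "second";

/* __COUNTER__ yields the next integer on every expansion. */
static const int counter_a = __COUNTER__;
static const int counter_b = __COUNTER__;
```

Because the asm label has to be a string literal and __COUNTER__ is expanded by the preprocessor, the number is fixed at compile time; it cannot be taken from a run-time variable.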

Why can't local variable be used in GNU C basic inline asm statements?

Why can't I use local variables from main in basic inline asm? It is only allowed in extended asm, but why?
(I know local variables are on the stack after the return address (and therefore cannot be used once the function returns), but that should not be the reason not to use them.)
An example of basic asm:
int a = 10; //global a
int b = 20; //global b
int result;
int main() {
asm ( "pusha\n\t"
"movl a, %eax\n\t"
"movl b, %ebx\n\t"
"imull %ebx, %eax\n\t"
"movl %eax, result\n\t"
"popa");
printf("the answer is %d\n", result);
return 0;
}
Example of extended asm:
int main (void) {
int data1 = 10; //local var - could be used in extended
int data2 = 20;
int result;
asm ( "imull %%edx, %%ecx\n\t"
"movl %%ecx, %%eax"
: "=a"(result)
: "d"(data1), "c"(data2));
printf("The result is %d\n",result);
return 0;
}
Compiled with:
gcc -m32 somefile.c
platform:
uname -a:
Linux 5.0.0-32-generic #34-Ubuntu SMP Wed Oct 2 02:06:48 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

You can use local variables in extended assembly, but you need to tell the extended assembly construct about them. Consider:
#include <stdio.h>
int main (void)
{
int data1 = 10;
int data2 = 20;
int result;
__asm__(
" movl %[mydata1], %[myresult]\n"
" imull %[mydata2], %[myresult]\n"
: [myresult] "=&r" (result)
: [mydata1] "r" (data1), [mydata2] "r" (data2));
printf("The result is %d\n",result);
return 0;
}
In this [myresult] "=&r" (result) says to select a register (r) that will be used as an output (=) value for the lvalue result, and that register will be referred to in the assembly as %[myresult] and must be different from the input registers (&). (You can use the same text in both places, result instead of myresult; I just made it different for illustration.)
Similarly [mydata1] "r" (data1) says to put the value of expression data1 into a register, and it will be referred to in the assembly as %[mydata1].
I modified the code in the assembly so that it only modifies the output register. Your original code modifies %ecx but does not tell the compiler it is doing that. You could have told the compiler that by putting "ecx" after a third :, which is where the list of “clobbered” registers goes. However, since my code lets the compiler assign a register, I would not have a specific register to list in the clobbered register. There may be a way to tell the compiler that one of the input registers will be modified but is not needed for output, but I do not know. (Documentation is here.) For this task, a better solution is to tell the compiler to use the same register for one of the inputs as the output:
__asm__(
" imull %[mydata1], %[myresult]\n"
: [myresult] "=r" (result)
: [mydata1] "r" (data1), [mydata2] "0" (data2));
In this, the 0 with data2 says to make it the same as operand 0. The operands are numbered in the order they appear, starting with 0 for the first output operand and continuing into the input operands. So, when the assembly code starts, %[myresult] will refer to some register that the value of data2 has been placed in, and the compiler will expect the new value of result to be in that register when the assembly is done.
When doing this, you have to match the constraint with how a thing will be used in assembly. For the r constraint, the compiler supplies some text that can be used in assembly language where a general processor register is accepted. Others include m for a memory reference, and i for an immediate operand.
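As an illustration of the m and i constraints mentioned above, here is a hedged sketch for x86 only. The function name load_plus_five is hypothetical, and the plain C branch is an assumption for other targets.

```c
#include <stdio.h>

/* Loads *p through a memory operand and adds an immediate constant. */
static int load_plus_five(const int *p)
{
    int out;
#if defined(__x86_64__) || defined(__i386__)
    __asm__ ("movl %[src], %[dst]\n\t"   /* %[src] becomes a memory reference */
             "addl %[imm], %[dst]"       /* %[imm] becomes $5                 */
             : [dst] "=&r"(out)          /* early-clobber: conservative here  */
             : [src] "m"(*p), [imm] "i"(5)
             : "cc");                    /* addl changes the flags            */
#else
    out = *p + 5;                        /* plain C fallback */
#endif
    return out;
}
```

With a local int x = 37, load_plus_five(&x) should return 42. The compiler decides how to spell the memory reference (for example an offset from the stack pointer), which is exactly the part basic asm cannot do for local variables.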

There is little distinction between "Basic asm" and "Extended asm"; "basic asm" is just a special case where the __asm__ statement has no lists of outputs, inputs, or clobbers. The compiler does not do % substitution in the assembly string for Basic asm. If you want inputs or outputs you have to specify them, and then it's what people call "extended asm".
In practice, it may be possible to access external (or even file-scope static) objects from "basic asm". This is because these objects will (respectively may) have symbol names at the assembly level. However, to perform such access you need to be careful of whether it is position-independent (if your code will be linked into libraries or PIE executables) and meets other ABI constraints that might be imposed at linking time, and there are various considerations for compatibility with link-time optimization and other transformations the compiler may perform. In short, it's a bad idea because you can't tell the compiler that a basic asm statement modified memory. There's no way to make it safe.
A "memory" clobber (Extended asm) can make it safe to access static-storage variables by name from the asm template.
The use-case for basic asm is things that modify the machine state only, like asm("cli") in a kernel to disable interrupts, without reading or writing any C variables. (Even then, you'd often use a "memory" clobber to make sure the compiler had finished earlier memory operations before changing machine state.)
Local (automatic storage, not static ones) variables fundamentally never have symbol names, because they don't exist in a single instance; there's one object per live instance of the block they're declared in, at runtime. As such, the only possible way to access them is via input/output constraints.
Users coming from MSVC-land may find this surprising, since MSVC's inline assembly scheme papers over the issue by transforming local variable references in its version of inline asm into stack-pointer-relative accesses, among other things. The version of inline asm it offers, however, is not compatible with an optimizing compiler, and little to no optimization can happen in functions using that type of inline asm. GCC and the larger compiler world that grew alongside C out of Unix do not do anything similar.

You can't safely use globals in Basic Asm statements either; it happens to work with optimization disabled but it's not safe and you're abusing the syntax.
There's very little reason to ever use Basic Asm. Even for machine-state control like asm("cli") to disable interrupts, you'd often want a "memory" clobber to order it wrt. loads / stores to globals. In fact, GCC's https://gcc.gnu.org/wiki/ConvertBasicAsmToExtended page recommends never using Basic Asm because it differs between compilers, and GCC might change to treating it as clobbering everything instead of nothing (because of existing buggy code that makes wrong assumptions). This would make a Basic Asm statement that uses push/pop even more inefficient if the compiler is also generating stores and reloads around it.
Basically the only use-case for Basic Asm is writing the body of an __attribute__((naked)) function, where data inputs/outputs / interaction with other code follows the ABI's calling convention, instead of whatever custom convention the constraints / clobbers describe for a truly inline block of code.
The design of GNU C inline asm is that it's text that you inject into the compiler's normal asm output (which is then fed to the assembler, as). Extended asm makes the string a template that it can substitute operands into. And the constraints describe how the asm fits into the data-flow of the program logic, as well as registers it clobbers.
Instead of parsing the string, there is syntax that you need to use to describe exactly what it does. Parsing the template for var names would only solve part of the language-design problem that operands need to solve, and would make the compiler's code more complicated. (It would have to know more about every instruction to know whether memory, register, or immediate was allowed, and stuff like that. Normally its machine-description files only need to know how to go from logical operation to asm, not the other direction.)
Your Basic asm block is broken because you modify C variables without telling the compiler about it. This could break with optimization enabled (maybe only with more complex surrounding code, but happening to work is not the same thing as actually safe. This is why merely testing GNU C inline asm code is not even close to sufficient for it to be future proof against new compilers and changes in surrounding code). There is no implicit "memory" clobber. (Basic asm is the same as Extended asm except for not doing % substitution on the string literal. So you don't need %% to get a literal % in the asm output. It's implicitly volatile like Extended asm with no outputs.)
Also note that if you were targeting i386 MacOS, you'd need _result in your asm. result only happens to work because the asm symbol name exactly matches the C variable name. Using Extended asm constraints would make it portable between GNU/Linux (no leading underscore) vs. other platforms that do use a leading _.
Your Extended asm is broken because you modify an input ("c") (without telling the compiler that register is also an output, e.g. an output operand using the same register).
It's also inefficient: if a mov is the first or last instruction of your template, you're almost always doing it wrong and should have used better constraints.
Instead, you can do:
asm ("imull %%edx, %%ecx\n\t"
: "=c"(result)
: "d"(data1), "c"(data2));
Or better, use "+r"(data2) and "r"(data1) operands to give the compiler free choice when doing register allocation instead of potentially forcing the compiler to emit unnecessary mov instructions. (See @Eric's answer using named operands and "=r" and a matching "0" constraint; that's equivalent to "+r" but lets you use different C names for the input and output.)
Look at the asm output of the compiler to see how code-gen happened around your asm statement, if you want to make sure it was efficient.
Since local vars don't have a symbol / label in the asm text (instead they live in registers or at some offset from the stack or frame pointer, i.e. automatic storage), it can't work to use symbol names for them in asm.
Even for global vars, you want the compiler to be able to optimize around your inline asm as much as possible, so you want to give the compiler the option of using a copy of a global var that's already in a register, instead of getting the value in memory in sync with a store just so your asm can reload that.
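A sketch of both suggestions, for x86 only: a "+r" operand for the local-variable case, and globals passed as operands instead of being named in the template. The names g_a, g_b, g_result, multiply_locals and multiply_globals are hypothetical stand-ins for the question's variables, and the plain C branches are an assumption for other targets.

```c
#include <stdio.h>

/* Stand-ins for the globals a, b and result from the question. */
int g_a = 10;
int g_b = 20;
int g_result;

/* Locals: "+r" marks data2 as both read and written, so the compiler
 * knows the register is modified and may pick any register it likes. */
static int multiply_locals(int data1, int data2)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ ("imull %[rhs], %[lhs]"
             : [lhs] "+r"(data2)
             : [rhs] "r"(data1)
             : "cc");
#else
    data2 *= data1;              /* plain C fallback */
#endif
    return data2;
}

/* Globals: handed over as operands, so no symbol names appear in the
 * template and the compiler may reuse values it already has in registers. */
static void multiply_globals(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ ("imull %[rhs], %[lhs]"
             : [lhs] "=r"(g_result)
             : "0"(g_a), [rhs] "rm"(g_b)   /* rhs may be register or memory */
             : "cc");
#else
    g_result = g_a * g_b;
#endif
}
```

Neither template begins or ends with a mov; any copying that is needed is left to the compiler, and the leading-underscore question for symbol names never comes up.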
Having the compiler try to parse your asm and figure out which C local var names are inputs and outputs would have been possible. (But would be a complication.)
But if you want it to be efficient, you need to figure out when x in the asm can be a register like EAX, instead of doing something braindead like always storing x into memory before the asm statement, and then replacing x with 8(%rsp) or whatever. If you want to give the asm statement control over where inputs can be, you need constraints in some form. Doing it on a per-operand basis makes total sense, and means the inline-asm handling doesn't have to know that bts can take an immediate or register source but not memory, and other machine-specific details like that. (Remember: GCC is a portable compiler; baking a huge amount of per-machine info into the inline-asm parser would be bad.)
(MSVC forces all C vars in _asm{} blocks to be memory. It's impossible to use to efficiently wrap a single instruction because the input has to bounce through memory, even if you wrap it in a function so you can use the officially-supported hack of leaving a value in EAX and falling off the end of a non-void function. See: What is the difference between 'asm', '__asm' and '__asm__'? And in practice MSVC's implementation was apparently pretty brittle and hard to maintain, so much so that they removed it for x86-64, and it was documented as not supported in functions with register args even in 32-bit mode! That's not the fault of the syntax design, though, just the actual implementation.)
Clang does support -fasm-blocks for _asm { ... } MSVC-style syntax where it parses the asm and you use C var names. It probably forces inputs and outputs into memory but I haven't checked.
Also note that GCC's inline asm syntax with constraints is designed around the same system of constraints that GCC-internals machine-description files use to describe the ISA to the compiler. (The .md files in the GCC source that tell the compiler about an instruction to add numbers that takes inputs in "r" registers, and has the text string for the mnemonic. Notice the "r" and "m" in some examples in https://gcc.gnu.org/onlinedocs/gccint/RTL-Template.html).
The design model of asm in GNU C is that it's a black box for the optimizer; you must fully describe the effects of the code (to the optimizer) using constraints. If you clobber a register, you have to tell the compiler. If you have an input operand that you want to destroy, you need to use a dummy output operand with a matching constraint, or a "+r" operand to update the corresponding C variable's value.
If you read or write memory pointed-to by a register input, you have to tell the compiler. See: How can I indicate that the memory *pointed* to by an inline ASM argument may be used?
If you use the stack, you have to tell the compiler (but you can't, so instead you have to avoid stepping on the red-zone :/ see: Using base pointer register in C++ inline asm). See also the inline-assembly tag wiki.
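For the pointed-to-memory case, one common approach is a dummy "m" input that names the object being read. The sketch below is x86 only; the function name load_via_pointer is hypothetical, and the plain C branch is an assumption for other targets.

```c
#include <stdio.h>

/* Reads *p inside the asm. The pointer itself goes in a register, and
 * the extra "m"(*p) operand, never referenced in the template, tells
 * the compiler that the pointed-to int is read by this statement. */
static int load_via_pointer(const int *p)
{
    int out;
#if defined(__x86_64__) || defined(__i386__)
    __asm__ ("movl (%[ptr]), %[out]"
             : [out] "=r"(out)
             : [ptr] "r"(p), "m"(*p));
#else
    out = *p;                    /* plain C fallback */
#endif
    return out;
}
```

Without the dummy operand (or a "memory" clobber), the compiler would be entitled to assume the asm depends only on the pointer value, and could, for example, delay a pending store to *p until after the statement.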
GCC's design makes it possible for the compiler to give you an input in a register, and use the same register for a different output. (Use an early-clobber constraint if that's not ok; GCC's syntax is designed to efficiently wrap a single instruction that reads all its inputs before writing any of its outputs.)
If GCC could only infer all of these things from C var names appearing in asm source, I don't think that level of control would be possible. (At least not plausible.) And there'd probably be surprising effects all over the place, not to mention missed optimizations. You only ever use inline asm when you want maximum control over things, so the last thing you want is the compiler using a lot of complex opaque logic to figure out what to do.
(Inline asm is complex enough in its current design, and not used much compared to plain C, so a design that requires very complex compiler support would probably end up with a lot of compiler bugs.)
GNU C inline asm isn't designed for low-performance low-effort. If you want easy, just write in pure C or use intrinsics and let the compiler do its job. (And file missed-optimization bug reports if it makes sub-optimal code.)

This is because asm is a defined language which is common for all compilers on the same processor family. After using the __asm__ keyword, you can reliably use any good manual for the processor to then start writing useful code.
But it does not have a defined interface for C, and let's be honest, if you don't interface your assembler with your C code then why is it there?
Examples of useful very simple asm: generate a debug interrupt; set the floating point register mode (exceptions/accuracy);
Each compiler writer has invented their own mechanism to interface to C. For example in one old compiler you had to declare the variables you want to share as named registers in the C code. In GCC and clang they allow you to use their quite messy 2-step system to reference an input or output index, then associate that index with a local variable.
This mechanism is the "extension" to the asm standard.
Of course, the asm is not really a standard. Change processor and your asm code is trash. When we talk in general about sticking to the c/c++ standards and not using extensions, we don't talk about asm, because you are already breaking every portability rule there is.
Then, on top of that, if you are going to call C functions, or your asm declares functions that are callable by C then you will have to match to the calling conventions of your compiler. These rules are implicit. They constrain the way you write your asm, but it will still be legal asm, by some criteria.
But if you were just writing your own asm functions, and calling them from asm, you may not be constrained so much by the c/c++ conventions: make up your own register argument rules; return values in any register you want; make stack frames, or don't; preserve the stack frame through exceptions - who cares?
Note that you might still be constrained by the platform's relocatable code conventions (these are not "C" conventions, but are often described using C syntax), but this is still one way that you can write a chunk of "portable" asm functions, then call them using "extended" embedded asm.

How to specify clobbered bottom of the x87 FPU stack with extended gcc assembly?

In a codebase of ours I found this snippet for fast, towards-negative-infinity [1] rounding on x87:
inline int my_int(double x)
{
int r;
#ifdef _GCC_
asm ("fldl %1\n"
"fistpl %0\n"
:"=m"(r)
:"m"(x));
#else
// ...
#endif
return r;
}
I'm not extremely familiar with GCC extended assembly syntax, but from what I gather from the documentation:
r must be a memory location, where I'm writing back stuff;
x must be a memory location too, whence the data comes from.
there's no clobber specification, so the compiler can rest assured that at the end of the snippet the registers are as it left them.
Now, to come to my question: it's true that in the end the FPU stack is balanced, but what if all the 8 locations were already in use and I'm overflowing it? How can the compiler know that it cannot trust ST(7) to be where it left it? Should some clobber be added?
Edit I tried to specify st(7) in the clobber list and it seems to affect the codegen, now I'll wait for some confirmation of this fact.
As a side note: looking at the implementation of the barebones lrint both in glibc and in MinGW I see something like
__asm__ __volatile__ ("fistpl %0"
: "=m" (retval)
: "t" (x)
: "st");
where we are asking for the input to be placed directly in ST(0) (which avoids that potentially useless fldl); what is that "st" clobber? The docs seems to mention only t (i.e. the top of the stack).
[1] Yes, it depends on the current rounding mode, which in our application should always be "towards negative infinity".

looking at the implementation of the barebones lrint both in glibc and in MinGW I see something like
__asm__ __volatile__ ("fistpl %0"
: "=m" (retval)
: "t" (x)
: "st");
where we are asking for the input to be placed directly in ST(0) (which avoids that potentially useless fldl)
This is actually the correct way to represent the code you want as inline assembly.
To get the most optimal possible code generated, you want to make use of the inputs and outputs. Rather than hard-coding the necessary load/store instructions, let the compiler generate them. Not only does this introduce the possibility of eliding potentially unnecessary instructions, it also means that the compiler can better schedule these instructions when they are required (that is, it can interleave the instruction within a prior sequence of code, often minimizing its cost).
what is that "st" clobber? The docs seems to mention only t (i.e. the top of the stack).
The "st" clobber refers to the st(0) register, i.e., the top of the x87 FPU stack. What Intel/MASM notation calls st(0), AT&T/GAS notation generally refers to as simply st. And, as per GCC's documentation for clobbers, the items in the clobber list are "either register names or the special clobbers" ("cc" (condition codes/flags) and "memory"). So this just means that the inline assembly clobbers (overwrites) the st(0) register. The reason why this clobber is necessary is that the fistpl instruction pops the top of the stack, thus clobbering the original contents of st(0).
The only thing that concerns me regarding this code is the following paragraph from the documentation:
Clobber descriptions may not in any way overlap with an input or output operand. For example, you may not have an operand describing a register class with one member when listing that register in the clobber list. Variables declared to live in specific registers (see Explicit Register Variables) and used as asm input or output operands must have no part mentioned in the clobber description. In particular, there is no way to specify that input operands get modified without also specifying them as output operands.
When the compiler selects which registers to use to represent input and output operands, it does not use any of the clobbered registers. As a result, clobbered registers are available for any use in the assembler code.
As you already know, the t constraint means the top of the x87 FPU stack. The problem is, this is the same as the st register, and the documentation very clearly said that we could not have a clobber that specifies the same register as one of the input/output operands. Furthermore, since the documentation states that the compiler is forbidden to use any of the clobbered registers to represent input/output operands, this inline assembly makes an impossible request—load this value at the top of the x87 FPU stack without putting it in st!
Now, I would assume that the authors of glibc know what they are doing and are more familiar with the compiler's implementation of inline assembly than you or I, so this code is probably legal and legitimate.
Actually, it seems that the unusual case of the x87's stack-like registers forces an exception to the normal interactions between clobbers and operands. The official documentation says:
On x86 targets, there are several rules on the usage of stack-like registers in the operands of an asm. These rules apply only to the operands that are stack-like registers:
Given a set of input registers that die in an asm, it is necessary to know which are implicitly popped by the asm, and which must be explicitly popped by GCC.
An input register that is implicitly popped by the asm must be explicitly clobbered, unless it is constrained to match an output operand.
That fits our case exactly.
Further confirmation is provided by an example appearing in the official documentation (bottom of the linked section):
This asm takes two inputs, which are popped by the fyl2xp1 opcode, and replaces them with one output. The st(1) clobber is necessary for the compiler to know that fyl2xp1 pops both inputs.
asm ("fyl2xp1" : "=t" (result) : "0" (x), "u" (y) : "st(1)");
Here, the clobber st(1) is the same as the input constraint u, which seems to violate the above-quoted documentation regarding clobbers, but is used and justified for precisely the same reason that "st" is used as the clobber in your original code, because fistpl pops the input.
All of that said, and now that you know how to correctly write the code in inline assembly, I have to echo previous commenters who suggested that the best solution would be not to use inline assembly at all. Just call lrint, which not only has the exact semantics that you want, but can also be better optimized by the compiler under certain circumstances (e.g., transforming it into a single cvtsd2si instruction when the target architecture supports SSE).

What is this code trying to do?

I'm trying to understand how the following code is working:
#define M32toX128(x128,m32) __asm__ \
("movddup %1, %0\n\t" \
"movsldup %0, %0" \
: "=&x"(x128) : "m"(m32) )
I have only basic assembly knowledge. Searching around and from the context of the program that is using it, I have understood that it is duplicating a 32-bit variable and storing the result in a 128-bit variable.
My questions are:
What do %0 and %1 refer to?
What are the colons (:) doing?
What is the actual assembly code that is executed? I mean after replacing %ns, "=&x"(x128)...

gcc inline assembly is a complicated beast, too complicated to describe here in detail.
As a quick overview, the general form of the asm block is: "template" : outputs : inputs : clobbers. You can refer to the outputs and inputs (collectively known as operands) in the template by using % followed by a zero-based index. Thus %0 refers to x128 and %1 refers to m32. For each operand, you can specify a constraint that tells the compiler how to allocate said operand. The =&x means, allocate the x128 as an early-clobber output in any available xmm register, and the m means use a memory address for the operand. See the manual for the mind-boggling details.
The actual assembly generated will depend on the operand choices the compiler uses. You can see that if you ask for an assembly listing using the -S option. Assuming m32 is a local variable, the code may look like:
movddup x(%esp), %xmmN
movsldup %xmmN, %xmmN
Note that gcc inline assembler is of course gcc and architecture specific, and that means it's simpler to use the equivalent compiler intrinsics.
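A sketch of the intrinsic route, assuming the macro's purpose is to broadcast one 32-bit float into all four lanes of a 128-bit register. The function name broadcast4 is hypothetical; _mm_set1_ps and _mm_storeu_ps are the standard SSE intrinsics from <xmmintrin.h>, and the loop is an assumed fallback for targets without SSE.

```c
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Writes four copies of value into out[0..3]. */
static void broadcast4(float value, float out[4])
{
#if defined(__SSE__)
    __m128 v = _mm_set1_ps(value);   /* all four lanes = value */
    _mm_storeu_ps(out, v);           /* unaligned store of 128 bits */
#else
    for (int i = 0; i < 4; ++i)      /* plain C fallback */
        out[i] = value;
#endif
}
```

The compiler is free to choose the instruction sequence here, and it only ever reads the 32 bits of value; the macro's movddup, by contrast, loads 64 bits from the memory operand it is given.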

This code will be passed from GCC to the assembler stage. Some part of the macro will be replaced in the process. Here is the documentation: http://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html
%0 and %1 will be replaced with the values that you passed to the C macro.
The : is used to separate parts of the macro. The first mandatory part is the template. The second one is for output operands. See http://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html#s5 for the full details.
In this case, you have an output = which is 128 bits (x) and which gets trashed (&) by the macro. m means this is a memory operand.

What does %c mean in GCC inline assembly code?

I am trying to understand this inline assembly code which comes from _hypercall0 here.
asm volatile ("call hypercall_page+%c[offset]" \
: "=r" (__res) \
: [offset] "i" (__HYPERVISOR_##name * sizeof(hypercall_page[0])) \
: "memory", "edi", "esi", "edx", "ecx", "ebx", "eax")
I am having trouble finding information on what %c in the first line means. I did not find any information in the most obvious section of the GCC manual, which explains %[name], but not %c[name]. Is there any other place I should look at?

From the GCC internals documentation:
`%cdigit' can be used to substitute an
operand that is a constant value
without the syntax that normally
indicates an immediate operand.

Check the assembly output (with gcc -S, or maybe disassemble the object file) and it may be clearer.
My guess is that it stands for constant. hypercall_page looks like a table of instructions that each do a syscall. Maybe this will generate a call hypercall_page + {constant based on the expression given}, essentially having computed the address of this offset at compile time.
As an aside, this __HYPERVISOR_##name stuff really reminds me of the __NR_name_of_syscall type convention you see for syscalls in Linux's <asm/unistd.h> and similar places.
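To see the effect of %c in isolation, here is a hedged x86-only sketch that uses a constant as an address displacement, a position where the $ prefix of a normal immediate would be a syntax error. The function name skip_three is hypothetical and unrelated to the Xen code, and the plain C branch is an assumption for other targets.

```c
#include <stddef.h>

/* Returns p + 3, computed with lea and a constant displacement. */
static const int *skip_three(const int *p)
{
    const int *q;
#if defined(__x86_64__) || defined(__i386__)
    /* %[off] alone would print $12 (with 4-byte int); %c[off] prints
     * the bare 12, giving something like: lea 12(%rdi), %rax */
    __asm__ ("lea %c[off](%[base]), %[out]"
             : [out] "=r"(q)
             : [base] "r"(p), [off] "i"(3 * sizeof(int)));
#else
    q = p + 3;                   /* plain C fallback */
#endif
    return q;
}
```

In the hypercall snippet the same modifier lets the constant be added to the hypercall_page symbol, so the call target is resolved at assembly and link time rather than computed at run time.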
