What's the meaning of the following code? - c

There is a CAS code below which can handle just int type,I know the function of CAS but I don't know the details shown below.
inline int CAS(unsigned long *mem,unsigned long newval,unsigned long oldval)
{
__typeof (*mem) ret;
__asm __volatile ("lock; cmpxchgl %2, %1"
: "=a" (ret), "=m" (*mem)
: "r" (newval), "m" (*mem), "0" (oldval));
return (int) ret;
}
I know there should be five parameters mapped to %0,%1,%2,%3,%4 because there are five parameters in input/output field
I also know that "=a" means using eax register,"=m" means using memory address,"r" means using any register
But I don't understand what the "0" means.
I don't understand why "cmpxchgl" only use two parameters %2, %1 instead of three?
It should use three params as the CAS function.
Where can I get all the infimation about the inline c asm?I need a complete tutorial.

%2 is newval, %1 is *mem
with "0" (oldval), and the first register occur is "=a", means that oldval is stored in eax.
So cmpxchgl %2, %1" means cmpxchgl newval, *mem"(while oldval in eax), which checks eax(value of oldval) whether equals *mem, if equal, change value of *mem to newval.

Related

aarch64 Inline assembly error : operand 2 must be an integer register -- `ldnp x0,[x0]'

I'm trying to write a simple function using in-line assembly and use it in a C program
The mem_io_read is a function that reads a memory address bypassing cache (event though the address is located in a cacheable memory region). It's for aarch64 machine.
static inline int mem_io_read(unsigned long paddr)
{
unsigned long val;
register pa;
__asm__ __volatile__("mov %0, %1\n\t" : "=r" (pa) : "r"(paddr)); <-- move paddr to a register pa
__asm__ __volatile__("ldnp %0, [%1]\n\t" : "=r" (val) : "r" (pa)); <-- load data from addr in pa
return val;
}
main()
{
...
uint32_t SCP_WR_ADDR = &scp_wait; // where test1val was located. //x06000000;
uint32_t chk_scp_rd_data = 0;
// Send flag for proceeding SCP test
(*(volatile uint32_t *)(SCP_WR_ADDR)) = 0x87654321; <-- send signal to the other processor (scp)
// Receives flag from SCP
while(chk_scp_rd_data != 0x12345678) <--- read back until the value is changed (reverse order)
{
chk_scp_rd_data = mem_io_read(SCP_WR_ADDR);
}
}
When I compile this using gcc, I get this error
/tmp/ccCpQGc5.s: Assembler messages:
/tmp/ccCpQGc5.s:26: Error: operand 2 must be an integer register -- `ldnp x0,[x0]'
I can't figure out what is wrong here. Please help.
ADD : from Peter Cordes's comment, I changed it to this one. It is compiled ok.
static int inline mem_io_read(unsigned long paddr)
{
int val, val1;
__asm__ __volatile__("ldnp %0, %1, [%2]\n\t" : "=r" (val), "=r" (val1) : "r" (paddr) : "memory");
return val;
}

Is there a race condition in the linux ARM spinlock?

Here is the Linux implementation of a spinlock from arch/arm/include/asm/spinlock.h:
static inline void arch_spin_lock(arch_spinlock_t *lock)
{
unsigned long tmp;
u32 newval;
arch_spinlock_t lockval;
prefetchw(&lock->slock);
__asm__ __volatile__(
"1: ldrex %0, [%3]\n"
" add %1, %0, %4\n"
" strex %2, %1, [%3]\n"
" teq %2, #0\n"
" bne 1b"
: "=&r" (lockval), "=&r" (newval), "=&r" (tmp)
: "r" (&lock->slock), "I" (1 << TICKET_SHIFT)
: "cc");
while (lockval.tickets.next != lockval.tickets.owner) {
wfe();
lockval.tickets.owner = READ_ONCE(lock->tickets.owner);
}
smp_mb();
}
...
static inline void arch_spin_unlock(arch_spinlock_t *lock)
{
smp_mb();
lock->tickets.owner++;
dsb_sev();
}
My concern is that the following two lines in arch_spin_lock:
while (lockval.tickets.next != lockval.tickets.owner) {
wfe();
are not atomic. So what if arch_spin_unlock was called in between these two lines? This means in the function arch_spin_lock the WFE instruction would be run but the SEV has already been run and won't be run again. So at the very worst arch_spin_lock would wait forever, or until some unrelated event occurs.
Is this correct, or am I misunderstanding something? If it is a problem even only in theory, is there a way to avoid the problem?
I think you are missing this bit of WFE documentation:
If the Event Register is set, WFE clears it and returns immediately.
In the "race" you describe WFE will get executed, but will return immediately, then while loop will exit.

Inline assembly: clarification of constraint modifiers

Two questions:
(1) If I understand ARM inline assembly correctly, a constraint of "r" says that the instruction operand can only be a core register and that by default is a read-only operand. However, I've noticed that if the same instruction has an output operand with the constraint "=r", the compiler may re-use the same register. This seems to violate the "read-only" attribute. So my question is: Does "read-only" refer to the register, or to the C variable that it is connected to?
(2) Is it correct to say that presence of "&" in the constraint of "=&r" simply requires that the register chosen for the output operand must not be the same as one of the input operand registers? My question relates to the code below used to compute the integer power function: i.e., are the "&" constraint modifiers necessary/appropriate?
asm (
" MOV %[power],1 \n\t"
"loop%=: \n\t"
" CBZ %[exp],done%= \n\t"
" LSRS %[exp],%[exp],1 \n\t"
" IT CS \n\t"
" MULCS %[power],%[power],%[base] \n\t"
" MUL %[base],%[base],%[base] \n\t"
" B loop%= \n\t"
"done%=: "
: [power] "+&r" (power)
[base] "+&r" (base)
[exp] "+&r" (exp)
:
: "cc"
) ;
Thanks!
Dan
Read-only refers to the use of the operand in assembly code. The assembly code can only read from the operand, and it must do so before any normal output operand (not an early clobber or a read/write operand) is written. This is because, as you've seen, the same register can be allocated to both an input and output operand. The assumption is that inputs are fully consumed before any output is written, which is normally the case for an assembly instruction.
I don't think using an early-clobber modifier & with an read/write modifier + has any effect since a register allocated to a read/write operand can't be used for anything else.
Here's how I'd write your code:
unsigned power = 1;
asm (
" CBZ %[exp],done%= \n\t"
"loop%=: \n\t"
" LSRS %[exp],%[exp],1 \n\t"
" IT CS \n\t"
" MULCS %[power],%[power],%[base] \n\t"
" MUL %[base],%[base],%[base] \n\t"
" BNE loop%= \n\t"
"done%=: "
: [power] "+r" (power),
[base] "+r" (base),
[exp] "+r" (exp)
:
: "cc"
) ;
Note the transformation of putting the loop test at the end of the loop, saving one instruction. Without it the code doesn't have any obvious improvement over what the compiler can generate. I also let the compiler do the initialization of the register used for the power operand. There's a small chance it will be able to allocate a register that already has the value 1 in it.
Thanks to all of you for the clarification. Just to be sure that I have it right, would it be correct to say that the choice between "=r" and "+r" for an output operand comes down to how the corresponding register is first used in the assembly template? I.e.,
"=r": The first use of the register is as a write-only output of an instruction.
The register may be re-used later by another instruction as an input or output. Adding an early clobber constraint (e.g., "=&r") prevents the compiler from assigning a register that was previously used as an input operand.
"+r": The first use of the register is as an input to an instruction, but the register is used again later as an output.
Best,
Dan

Set register using a variable inline assembly

My requirement is to set EDI register using a variable with inline assembly. I wrote the following snippet but it fails to compile.
uint32_t value = 0;
__asm__ __volatile__("mov %1,%%edi \n\t"
: "=D"
: "ir" (value)
:
);
Errors I get are
cyg_functions.cpp(544): error: expected a "("
: "ir" (value)
^
cyg_functions.cpp(544): internal error: null pointer
: "ir" (value)
Edit
I guess I wasn't clear on the problem specification. Let's say my requirement is as follows.
There are two int variables val and result.
I need to
Set the value of variable val to %%edi clobbering whatever in there already
Multiply %%edi value by 2
Set %%edi value back to result variable
How can this be stated with inline assembly? Though this is not exactly my requirement answer to this (specifically the 1st step) would solve my problem. I need the intermediate to be specifically in EDI register.
I have read your comments, and the requirements here still makes no sense to me. However, making sense is not a requirement. Such being the case:
int main(int argc, char *argv[])
{
int res;
int value = argc;
asm ("shl $1, %[res]" /* Take the value in res (aka EDI) and shift
it left by 1. */
: [res] "=D" (res) /* On exit from the asm, the register EDI
will contain the value for "res". The
existing value of res will be overwritten. */
: "0" (value)); /* Take the contents of "value" and put it
in the same place as parameter #0. */
return res;
}
This may be easier to understand if you read it from the bottom up.

Why CompareAndSwap is more of a powerful instruction than TestAndSet?

Please consider the following piece of code for CompareAndSwap and let me know why this atomic instruction is more powerful than atomic TestAndSet for being a mutual exclusion primitive?
char CompareAndSwap(int *ptr, int old, int new) {
unsigned char ret;
// Note that sete sets a ’byte’ not the word
__asm__ __volatile__ (
" lock\n"
" cmpxchgl %2,%1\n"
" sete %0\n"
: "=q" (ret), "=m" (*ptr)
: "r" (new), "m" (*ptr), "a" (old)
: "memory");
return ret;
}
test-and-set modifies the contents of a memory location and returns its old value as a single atomic operation.
compare-and-swap atomically compares the contents of a memory location to a given value and, only if they are the same, modifies the contents of that memory location to a given new value.

Resources