Difference between MOV and CPY instruction in ARM ISA - arm

What is the difference between MOV and CPY instruction in ARM ISA?
I cannot seem to find any real difference.

Which ARM core?
According to the ARMv7 reference: https://developer.arm.com/documentation/ddi0406/cb/Application-Level-Architecture/Instruction-Details/Alphabetical-list-of-instructions/CPY
CPY is a synonym for MOV, meaning they translate to the same instruction.

Related

GCC Loop Increment Optimization -O2 and above

I've been doing some research involving optimization and loop unrolling and I've been looking at the generated assembly code for different optimization levels. I've come across a weird optimization strategy that gcc uses at -O2 and above. I was wondering if there was a name for this. Here's the generated assembly code:
mov %rsi,(%rcx)
mov %rsi,0x8(%rcx)
mov %rsi,0x10(%rcx)
mov %rsi,0x18(%rcx)
sub $0xffffffffffffff80,%rcx // What is this called?
mov %rsi,-0x60(%rcx)
mov %rsi,-0x58(%rcx)
mov %rsi,-0x50(%rcx)
mov %rsi,-0x48(%rcx)
mov %rsi,-0x38(%rcx)
mov %rsi,-0x30(%rcx)
mov %rsi,-0x28(%rcx)
mov %rsi,-0x20(%rcx)
mov %rsi,-0x18(%rcx)
mov %rsi,-0x10(%rcx)
mov %rsi,-0x8(%rcx)
cmp %rdx,%r8
0xffffffffffffff80 is -128 as a 64-bit signed integer. RCX is a scratch register, here probably used as some sort of pointer. Subtracting -128 is the same as adding 128, so the instruction advances the pointer by 128 bytes. The likely reason the compiler emits sub $-128 rather than add $128 is encoding size: -128 fits in a sign-extended 8-bit immediate, while +128 does not and would need a 32-bit immediate. The stores around it step through the offsets 8 bytes (64 bits) at a time, which indicates that the compiler has unrolled the loop into 128-byte blocks. It would be interesting to know your CPU type and the source code that produced this assembly.
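The two's-complement reading of that constant can be checked with a short Python sketch. The helper names are made up for illustration. The 8-bit range check at the end is an added illustration: x86 can encode an immediate in the range -128..127 as a single sign-extended byte, which -128 fits and +128 does not.

```python
def as_signed(value, bits):
    """Interpret an unsigned bit pattern as a two's-complement integer."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def fits_in_imm8(value):
    """True if the value fits in a sign-extended 8-bit immediate."""
    return -128 <= value <= 127


imm = as_signed(0xFFFFFFFFFFFFFF80, 64)
print(imm)                                    # -128
print(fits_in_imm8(imm), fits_in_imm8(-imm))  # True False
```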

what is this assembly code in C?

I have some assembly code below that I don't really understand. My thoughts are that it is meaningless. Unfortunately I can't provide any more instruction information. What would the output in C be?
0x1000: iretd
0x1001: cli
0x1002: in eax, dx
0x1003: inc byte ptr [rdi]
0x1005: add byte ptr [rax], al
0x1007: add dword ptr [rbx], eax
0x1009: add byte ptr [rax], al
0x100b: add byte ptr [rdx], 0
0x100e: add byte ptr [rax], al
Thanks
The first four bytes (if I reconstructed them correctly) form the 32-bit value 0xFEEDFACF.
Putting that into Google led me to:
https://gist.github.com/softboysxp/1084476#file-gistfile1-asm-L15
%define MH_MAGIC_64 0xfeedfacf
Are you perhaps disassembling a Mach-O x86-64 executable from Mac OS X as raw machine code, instead of reading the file's metadata correctly and disassembling only the code section?
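As a rough check of that reconstruction, here is a small Python sketch. The byte values are an assumption: they are the standard single-byte encodings of iretd (CF), cli (FA) and in eax, dx (ED), followed by the two bytes of inc byte ptr [rdi] (FE 07). They are not taken from the question itself.

```python
import struct

# Assumed raw bytes behind the first four disassembled instructions.
raw = bytes([0xCF, 0xFA, 0xED, 0xFE, 0x07])

MH_MAGIC_64 = 0xFEEDFACF  # value from the %define quoted above

# Read the first four bytes as one little-endian 32-bit word.
(magic,) = struct.unpack_from("<I", raw, 0)
print(hex(magic))            # 0xfeedfacf
print(magic == MH_MAGIC_64)  # True
```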
P.S. In questions like this one, please also include the raw machine code bytes, so that experienced people can check the disassembly against a different target, such as 32-bit x86, 16-bit real mode, or a completely different CPU. That helps when the machine code has been disassembled for the wrong target platform. I had to assemble your disassembly first to see the raw bytes.
iretd is "return from interrupt", cli is "clear interrupt flag" which means disable all maskable interrupts. The C language does not understand the concept of an interrupt, so it is unlikely that this was compiled from C. In fact, this isn't a single complete fragment of code.
Also, add byte ptr [rdx], 0 adds 0 to a value, which doesn't make sense to me unless it is the result of an unoptimised compilation or of disassembling something that isn't code.

What does an lea instruction after a ret instruction mean?

I found x86 lea instructions in an executable file made using clang and gcc.
The lea instructions are after the ret instruction as shown below.
0x???????? <func>
...
pop %ebx
pop %ebp
ret
lea 0x0(%esi,%eiz,1),%esi
lea 0x0(%edi,%eiz,1),%edi
0x???????? <next_func>
...
What are these lea instructions used for? There is no jmp instruction to the lea instructions.
My environment is Ubuntu 12.04 32-bit and gcc 4.6.3.
It's probably nothing: it's just padding so that the next function starts at an address that's a multiple of at least 8 (and quite possibly 16). Each of these lea instructions loads a register with its own value, so it acts as a multi-byte no-op.
Depending on the rest of the code, it's possible that it's actually a table. Some implementations of a switch statement, for example, use a constant table that's often stored in the code segment (even though, strictly speaking, it's more like data than code).
The first is a lot more likely though. As an aside, such space is often filled with 0xCC instead. This is the single-byte debug-break instruction (int3), so if some undefined behavior results in attempting to execute that code, it immediately stops execution and breaks to the debugger (if available).
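To make the alignment arithmetic concrete, here is a minimal Python sketch of how many filler bytes are needed between the end of one function and the next aligned start. The function name and the sample address are hypothetical, and the 16-byte alignment is only the guess made above.

```python
def padding_to_align(address, alignment=16):
    """Return the number of filler bytes needed to round `address`
    up to the next multiple of `alignment`."""
    return (-address) % alignment


# Hypothetical address of the first byte after a ret instruction.
end_of_func = 0x080483F2
pad = padding_to_align(end_of_func, 16)
print(pad)                     # 14
print(hex(end_of_func + pad))  # 0x8048400
```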

Intel x86 to ARM assembly conversion

I am currently learning ARM assembly language.
To do so, I am trying to convert some x86 code (AT&T syntax) to ARM assembly (Intel syntax).
__asm__("movl $0x0804c000, %eax;");
__asm__("mov R0,#0x0804c000");
From this document, I learned that on x86 chunk 1 of the heap structure starts at 0x0804c000. But when I try to do the same on ARM,
I get the following error:
/tmp/ccfNZp9F.s:174: Error: invalid constant (804c000) after fixup
I am assuming the problem is that ARM can only load 32-bit instructions.
Question 1: Any idea what would be the first chunk in case of ARM processors?
Question 2:
From my previous question, I know how memory indirect addressing works.
Are the snippets written below doing the same job?
movl (%eax), %ebx
LDR R0,[R1]
I am using ARMv7 Processor rev 4 (v7l)
Trying to learn ARM by looking at x86 is not a good idea: one is CISC and quite ugly, the other is RISC and much cleaner. Just learn ARM by looking at the instruction set reference in the architectural reference manual. Look up the mov instruction, the add instruction, etc.
ARM doesn't use Intel syntax; it uses ARM syntax.
Don't learn by using inline assembly; write real assembly. Use an instruction set simulator first, not hardware.
ARM, MIPS and others aim for a fixed instruction length. So how would you, for example, fit an instruction that says "move some immediate to a register", specifies the register, and holds the 32-bit immediate, all in 32 bits? It's not possible. So for fixed-length instruction sets you cannot simply load any immediate you want into any register. You must read up on the rules for that instruction set. MIPS allows 16-bit immediates, ARM 8 bits, plus or minus depending on the flavor of ARM instruction set and the instruction. MIPS limits where you can put those 16 bits to either the high or the low half; ARM lets you put those 8 bits anywhere in the 32-bit register, depending on the flavor of ARM instruction set (ARM, Thumb, Thumb-2 extensions).
As with most assembly languages, you can solve this problem by doing something like this:
ldr r0,my_value
...
my_value: .word 0x12345678
With CISC that immediate is simply tacked onto the instruction, so whether it is 0 bytes away or 20 bytes away, it is still there with either approach.
ARM assemblers also generally allow you this shortcut:
ldr r0,=something
...
something:
which says load r0 with the ADDRESS of something: not the contents at that location but the address (like an lea).
That lends itself to this immediate shortcut:
ldr r0,=0x12345678
which, if supported by the assembler, will allocate a memory location to hold the value and generate an ldr r0,[pc,offset] instruction to read it. If the immediate is within the rules for a mov, the assembler might optimize it into a mov rd,#immediate.
Answer to Question 1
The MOV instruction on ARM only has 12 bits available for an immediate value, and those bits are used this way: 8 bits for value, and 4 bits to specify the number of rotations to the right (the number of rotations is multiplied by 2, to increase the range).
This means that only a limited number of values can be used with that instruction. They are:
0-255
256, 260, 264,..., 1020
1024, 1040, 1056, ..., 4080
and so on.
You are getting that error because your constant can't be created using the 8 bits plus rotations. You can load that value into the register with the following instruction:
LDR r0, =0x0804c000
Notice that this is a pseudo-instruction though. The assembler will basically put that constant somewhere in your code and load it as a memory location with some offset to the PC (program counter).
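The "8 bits plus an even rotation" rule described above can be sketched in Python. This is only an illustration of the classic ARM-mode immediate encoding; the function names are made up here, and Thumb-2 has different rules that this sketch ignores.

```python
def rotate_left_32(value, amount):
    """Rotate a 32-bit value left by `amount` bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & 0xFFFFFFFF


def is_arm_mov_immediate(value):
    """True if `value` is an 8-bit constant rotated right by an even amount."""
    value &= 0xFFFFFFFF
    for rotation in range(0, 32, 2):
        # Undo a rotate-right by `rotation`; an encodable value collapses to 8 bits.
        if rotate_left_32(value, rotation) <= 0xFF:
            return True
    return False


for constant in (255, 256, 257, 260, 4080, 0x0804C000):
    print(hex(constant), is_arm_mov_immediate(constant))
```

Under these assumptions, 0x0804c000 is rejected because its set bits span more than 8 bit positions, which matches the "invalid constant" error from the assembler.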
Answer to question 2
Yes, those instructions are equivalent.

Displaying PSW content

I'm a beginner with asm. I've been researching my question for a while, but the answers were unsatisfactory. I'm wondering how to display the PSW content on standard output. Another thing: how do I display the Instruction Pointer value? I would be very grateful if you could give me a hint (or better, a scrap of code). It may be MASM or 8086 as well (actually I don't know what the difference is :) )
The instruction pointer is not directly accessible on the x86 family; however, it is quite straightforward to retrieve its value. It will never be exactly current, though, because the pointer has moved on by the time you read it.
Since a subroutine call places the return address on the stack, you just need to copy it from there and voilà! You have the address of the opcode following the call instruction:
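proc getInstructionPointer
push bp                     ; save the caller's frame pointer
mov bp,sp                   ; bp now points at the saved bp
mov ax,[word ptr ss:bp + 2] ; return address of a near call sits just above it
mov sp,bp                   ; restore the stack pointer
pop bp                      ; restore the caller's frame pointer
ret                         ; ax holds the address following the call
endp getInstructionPointer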
The PSW on the x86 is called the Flags register. There are two operations that explicitly reference it: pushf and popf. As you might have guessed, you can simply push the Flags onto the stack and load it to any general purpose register you like:
pushf
pop ax
Displaying these values consists of converting them to ASCII and writing them to the screen. There are several ways of doing this; search for "string output assembly" and I bet you will find the answer.
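The conversion step itself is language-independent, so here is a minimal Python sketch of it rather than assembly: the 16-bit value is split into nibbles, and each nibble is mapped to an ASCII hex digit, which is what an assembly routine would typically do with shifts and a lookup table. The sample value and helper names are hypothetical; the bit positions are those of the 8086 Flags register.

```python
HEX_DIGITS = "0123456789ABCDEF"

# Bit positions of the 8086 status and control flags.
FLAG_BITS = {"CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7,
             "TF": 8, "IF": 9, "DF": 10, "OF": 11}


def word_to_hex_ascii(value):
    """Convert a 16-bit value to four ASCII hex digits, one nibble at a time."""
    value &= 0xFFFF
    return "".join(HEX_DIGITS[(value >> shift) & 0xF] for shift in (12, 8, 4, 0))


def set_flags(value):
    """List the names of the flags whose bits are set in `value`."""
    return [name for name, bit in FLAG_BITS.items() if (value >> bit) & 1]


sample = 0x0246  # hypothetical value as it might be popped into ax
print(word_to_hex_ascii(sample))  # 0246
print(set_flags(sample))          # ['PF', 'ZF', 'IF']
```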
To dispel a minor confusion: the 8086 is the CPU itself, whereas MASM is the assembler. The syntax is assembler-specific; MASM assembly is x86 assembly. TASM assembly is x86 assembly as well, just like NASM assembly.
When people say "x86 assembly", they are referring to any of these (or others), talking about the instruction set, not the dialect.
Note that the above examples are 16-bit, intended for the 8086, and won't work on the 80386+ in 32-bit mode.
