xorl %eax - Instruction set architecture in IA-32 - c

I am experiencing some difficulties interpreting this exercise;
What does exactly xorl does in this assembly snippet?
C Code:
int i = 0;
if (i>=55)
i++;
else
i--;
Assembly
xorl ____ , %ebx
cmpl ____ , %ebx
Jel .L2
____ %ebx
.L2:
____ %ebx
.L3:
What's happening on the assembly part?

It's probably:
xorl %ebx, %ebx
This is a common idiom for zeroing a register on x86. This would correspond with i = 0 in the C code.
If you are curious "but why ?" the short answer is that the xor instruction is fewer bytes than mov $0, %ebx. The long answer includes other subtle reasons.
I am leaving out the rest of the exercise since there's nothing idiosyncratic left.

This is the completed and commented assembly equivalent to your C code:
xorl %ebx , %ebx ; i = 0
cmpl $54, %ebx
jle .L2 ; if (i <= 54) jump to .L2, otherwise continue with the next instruction (so if i>54... which equals >=55 like in your C code)
addl $2, %ebx ; >54 (or: >=55)
.L2:
decl %ebx ; <=54 (or <55, the else-branch of your if) Note: This code also gets executed if i >= 55, hence why we need +2 above so we only get +1 total
.L3:
So, these are the (arithmetic) instructions that get executed for all numbers >=55:
addl $2, %ebx
decl %ebx
So for numbers >=55, this is equal to incrementing. The following (arithmetic) instructions get executed for numbers <55:
decl %ebx
We jump over the addl $2, %ebx instruction, so for numbers <55 this is equal to decrementing.
In case you're not allowed to type addl $2, (since it's not just the instruction but also an argument) into a single blank there's probably an error in the asm code you've been given (missing a jump between line 4 and 5 to .L3).
Also note that jel is clearly a typo for jle in the question.

XORL is used to initialize a register to Zero, mostly used for the counter. The code from ccKep is correct, only that he incremented by a wrong value ie. 2 instead of 1. The correct version is therefore:
xorl %ebx , %ebx # i = 0
cmpl $54, %ebx # compare the two
jle .L2 #if (i <= 54) jump to .L2, otherwise continue with the next instruction (so if i>54... which equals >=55 like in your C code)
incl %ebx #i++
jmp .DONE # jump to exit position
.L2:
decl %ebx # <=54 (or <55
.DONE:

Related

Count how many 1 there are in the odd positions of an array of 32-bit numbers

I'm implementing a simple assembly function called through a program in c, to count how many 1 there are in the odd position of an array in 32 bits. The first argument passed to the function is the pointer to the array while the second the number of elements. I do not understand what it does go to infinite loop.
// Assembly code
.globl ones_in_odds
.type ones_in_odds,#function
ones_in_odds:
pushl %ebp
movl %esp,%ebp
movl 8(%ebp),%ecx
movl 12(%ebp),%esi
#xorl %edi,%edi
xorl %eax,%eax
subl $-1,%esi
startloop:
cmpl $0,%esi
jl endloop
movl (%ecx,%esi,4),%edx
xorl %edi,%edi
innerloop:
cmpl $16,%edi
jg startloop
shrl %edx
shrl %edx
adcl $00,%eax
incl %edi
jmp innerloop
decl %esi
jmp startloop
endloop:
movl %ebp,%esp
popl %ebp
ret
// C Code
#include <stdio.h>
#define DIM 5
int ones_in_odds(int *,size_t);
int main(){
int array[DIM] = {1,3,4,5,6};
int one = ones_in_odds(array,DIM);
printf("Il numero di 1 nelle posizioni dispari e' %d\n",one);
return 0;
}
This adds 1 to ESI:
subl $-1,%esi
It's like the term ESI = ESI - (-1). You want to transform the count of items (DIM) to the index of the last item in array. Thus the count has to be subtracted by 1 or added by (-1). However, you don't need this transformation at this place. Remove the line.
Because of
jg startloop
you don't reach decl %esi and therefore the break condition jl endloop will never been met.
Move decl %esi to the beginning of startloop and you solve also the transformation problem above.
As #Jester mentions, you have to follow the C calling convention since the assembly procedure is a function of the C program. The function is allowed to change EAX, ECX, and EDX ("caller saved") and has to return all other registers unchanged ("callee saved"). The function changes ESI and EDI, but should returm them unchanged. So push them at the beginning of the function and pop them at the end.

Storing Local variables in Assembly

So I've been working on a problem (and before you ask, yes, it is homework, but I've been putting in faithful effort!) where I have some assembly code and want to be able to convert it (as faithfully as possible) to C.
Here is the assembly code:
A1:
pushl %ebp
movl %esp, %ebp
subl $16, %esp
movl $0, -4(%ebp)
jmp .L2
.L4:
movl -4(%ebp), %eax
sall $2, %eax
addl 8(%ebp), %eax
movl (%eax), %eax
cmpl 12(%ebp), %eax
jg .L6
.L2:
movl -4(%ebp), %eax
cmpl 16(%ebp), %eax
jl .L4
jmp .L3
.L6:
nop
.L3:
movl -4(%ebp), %eax
leave
ret
And here's some of the C code I wrote to mimic it:
int A1(int a, int b, int c) {
int local = 0;
while(local < c) {
if(b > (int*)((local << 2) + a)) {
return local;
}
}
return local;
}
I have a few questions about how assembly works.
First, I notice that in L4, the body of the while loop, nothing is ever assigned to local. It's initialized to be 0 at the start of the function, and then never modified again. Looking at the C code I made for it, though, that seems odd, considering that the loop will go on indefinitely if the if-condition fails. Am I missing something there? I was under the impression that you'd need a snippet of code like:
movl %eax, -4(%ebp)
in order to actually assign anything to the local variable, and I don't see anything like that in the body of the while loop.
Secondly, you'll see that in the assembly code, the only local variable that's declared is "local". Hence, I have to use a snippet of code like:
if(b > (int*)((local << 2) + a))
The output of this line doesn't look much like the assembly code, though, and I think I might have made a mistake. What did I do wrong here?
And finally (thanks for your patience!), on a related note, I understand that the purpose of this if-loop in the while loop is to break out if the condition is fulfilled, and then to return local. Hence L6 and "nop" (which is basically saying nothing). However, I don't know how to replicate this in my program. I've tried "break", and I've tried returning local as you see here. I understand the functionality - I just don't know how to replicate it in C (short of using goto, but that kind of defeats the purpose of the exercise...).
Thank you for your time!
This is my guess:
int A1 (int *a, int value, int size)
{
int i = 0;
while (i<size)
{
if (a[i] <= value)
break;
}
return i;
}
Which, compiled back to assembly, gives me this code:
A1:
.LFB0:
pushl %ebp
movl %esp, %ebp
subl $16, %esp
movl $0, -4(%ebp)
jmp .L2
.L4:
movl -4(%ebp), %eax
leal 0(,%eax,4), %edx
movl 8(%ebp), %eax
addl %edx, %eax
movl (%eax), %eax
cmpl 12(%ebp), %eax
jg .L2
jmp .L3
.L2:
movl -4(%ebp), %eax
cmpl 16(%ebp), %eax
jl .L4
.L3:
movl -4(%ebp), %eax
leave
ret
Now this seems to be identical to your original ASM code, just the code starting at L4 is not the same, but if we anotate both codes:
ORIGINAL
movl -4(%ebp), %eax ;EAX = local
sall $2, %eax ;EAX = EAX*4
addl 8(%ebp), %eax ;EAX = EAX+a, hence EAX=a+local*4
ASM-C-ASM
movl -4(%ebp), %eax ;EAX = i
leal 0(,%eax,4), %edx ;EDX = EAX*4
movl 8(%ebp), %eax ;EAX = a
addl %edx, %eax ;EAX = EAX+EDX, hence EAX=a+i*4
Both codes continue with
movl (%eax), %eax
Because of this, I guess a is actually a pointer to some variable type that uses 4 bytes. By the comparison between the second argument and the value read from memory, I guess that type must be either int or long. I choose int solely by convenience.
Of course this also means that this code (and the original one) does not make any sense. It lacks the i++ part somewhere. If this is so, then a is an array, and the third argument is the size of the array. I've named my local variable i to keep with the tradition of naming index variables like this.
This code would scan the array searching for a value inside it that is equal or less than value. If it finds it, the index to that value is returned. If not, the size of the array is returned.

Y86 Stopped in 1 Steps Exception HLT

I translated a simple c program into IA32 and then transliterated it into Y86 but I am getting an error I don't understand or know how to debug since I am just learning Y86. The error is:
Stopped in 1 steps at PC = 0x1. Exception 'HLT', CC Z=1 S=0 O=0
Changes to registers:
Changes to memory:
The program is supposed to initialize i to 0 and then proceed through a for loop until i is greater than or equal to 5 and increment i each time. Inside the for loop I set j equal to i*2 and k is equal to j+1. My Y86 code is as follows:
main:
irmovl $0, %ebx
jmp L2
halt
L3:
rrmovl %ebx, %eax
addl %eax, %eax
rrmovl %eax, %ecx
rrmovl %ecx, %eax
irmovl $1, %esi
addl %esi, %eax
rrmovl %eax, %edx
addl %esi, %ebx
L2:
irmovl $4, %edi
subl %edi, %ebx
jle L3
I can provide the C code and IA32 code I transliterated from if it would help you answer my problem I really need some help thanks.
You forgot to add the extra NL (CR) in the source file. YAS is flawed. When it assembles it wraps the assembly around if there is no NL (CR) at the end of the source (ys) file, overwriting the beginning of the created object code (yo) file.
The result... well it is bad. In your case, YAS probably inserted a HLT instruction in the first character of the object file.

How to interpret this IA32 assembly code [duplicate]

This question already has an answer here:
Pointers in Assembly
(1 answer)
Closed 9 years ago.
As an assignment, I must read the assembly code and write its high-level C program. The structure in this case is a switch statement, so for each case, the assembly code is translated to the case in C code. Below will only be one of the cases. If you can help me interpret this, it should give me a better understanding of the rest of the cases.
p1 is in %ebp+8
p2 is in %ebp+12
action is in %ebp+16
result is in %edx
...
.L13:
movl 8(%ebp), %eax # get p1
movl (%eax), %edx # result = *p1?
movl 12(%ebp), %ecx # get p2
movl (%ecx), %eax # p1 = *p2
movl 8(%ebp), %ecx # p2 = p1
movl %eax, (%ecx) # *p1 = *p2?
jmp .L19 #jump to default
...
.L19
movl %edx, %eax # set return value
Of course, the comments were added by me to try to make sense of it, except it leaves me more confused. Is this meant to be a swap? Probably not; the formatting would be different. What really happens in the 2nd and 6th lines? Why is %edx only changed once so early if it's the return value? Please answer with some guidelines to interpreting this code.
The above snippet is x86_32 assembly in the (IMO broken) AT&T syntax.
AT&T syntax attaches a size suffix to every operand.
movl means a 32 bit operand. (l for long)
movw means 16 bit operand (w for word)
movb means an 8 bit operand (b for byte)
The operands are reversed in order, so the destination is on the right and the source is on the left.
This is opposite to almost every other programming language.
Register names are prefixed by a % to distinguish them from variable names.
If a register is surrounded by brackets () that means that the memory address pointed to by the register is used, rather than the value inside the register itself.
This makes sense, because EBP is used as a pointer to a stackframe.
Stackframes are used to access parameters and local variables in functions.
Instead of writing: mov eax, dword ptr [ebp+8] (Intel syntax)
AT&T syntax lists it as: movl 8(%ebp), %eax (gas syntax)
Which means: put the contends of the memory address pointed to by (ebp + 8) into eax.
Here's the translation:
.L13: <<-- label used as a jump target.
movl 8(%ebp), %eax <<-- p1, stored at ebp+8 goes into EAX
movl (%eax), %edx <<-- p1 is a pointer, EDX = p1->next
movl 12(%ebp), %ecx <<-- p2, stored at ebp+12 goes in ECX
movl (%ecx), %eax <<-- p2 is (again) a pointer, EAX = p2->next
movl 8(%ebp), %ecx <<-- ECX = p1
movl %eax, (%ecx) <<-- p2->next = p1->next
jmp .L19 <<-- jump to exit
...
.L19
movl %edx, %eax <<-- EAX is always the return value
<<-- return p1->data.
In all of the many calling conventions on x86 the return value of a function is put into the EAX register. (or EAX:EDX if it's an INT64)
In prose: p1 and p2 are pointers to data, in this data pointers to pointers.
This code looks like it manipulates a linked list.
p2->next is set to p1->next.
Other than that the snippet looks incomplete because whatever was in p2->next to begin with is not worked on so there probably more code that you're not showing.
Apart from the confusing AT&T syntax it's really very simple code.
In C the code would look like:
(void *)p2->next = (void *)p1->next;
Note that the code is quite inefficient and no decent compiler (or human) would generate this code.
The following equivalent would make more sense:
mov eax,[ebp+8]
mov ecx,[ebp+12]
mov eax,[eax]
mov [ecx],eax
jmp done
More info on the difference between AT&T and Intel syntax can be found here: http://www.ibm.com/developerworks/linux/library/l-gas-nasm/index.html

Translate C code to assembly code?

I have to translate this C code to assembly code:
#include <stdio.h>
int main(){
int a, b,c;
scanf("%d",&a);
scanf("%d",&b);
if (a == b){
b++;
}
if (a > b){
c = a;
a = b;
b = c;
}
printf("%d\n",b-a);
return 0;
}
My code is below, and incomplete.
rdint %eax # reading a
rdint %ebx # reading b
irmovl $1, %edi
subl %eax,%ebx
addl %ebx, %edi
je Equal
irmov1 %eax, %efx #flagged as invalid line
irmov1 %ebx, %egx
irmov1 %ecx, %ehx
irmovl $0, %eax
irmovl $0, %ebx
irmovl $0, %ecx
addl %eax, %efx #flagged as invalid line
addl %ebx, %egx
addl %ecx, %ehx
halt
Basically I think it is mostly done, but I have commented next to two lines flagged as invalid when I try to run it, but I'm not sure why they are invalid. I'm also not sure how to do an if statment for a > b. I could use any suggestions from people who know about y86 assembly language.
From what I can find online (1, 2), the only supported registers are: eax, ecx, edx, ebx, esi, edi, esp, and ebp.
You are requesting non-existent registers (efx and further).
Also irmov is for moving an immediate operand (read: constant numerical value) into a register operand, whereas your irmov1 %eax, %efx has two register operands.
Finally, in computer software there's a huge difference between the character representing digit "one" and the character representing letter "L". Mind your 1's and l's. I mean irmov1 vs irmovl.
Jens,
First, Y86 does not have any efx, egx, and ehx registers, which is why you are getting the invalid lines when you pour the code through YAS.
Second, you make conditional branches by subtracting two registers using the subl instruction and jumping on the condition code set by the Y86 ALU by ways of the jxx instructions.
Check my blog at http://y86tutoring.wordpress.com for details.

Resources