I'm reviewing a practice midterm at the moment. The question gives a piece of assembly code (IA32) and asks for the C equivalent. I just want to make sure I'm doing it correctly. Thanks!
Assembly program given:
.global _someOperation
_someOperation:
pushl %ebp
movl %esp, %ebp
movl 8(%ebp), %ebx
movl 12(%ebp), %edx
decl %edx
xorl %esi, %esi
movl (%ebx, %esi, 4), %eax
continue:
incl %esi
cmpl (%ebx, %esi, 4), %eax
jl thelabel
movl (%ebx, %esi, 4), %eax
thelabel:
cmp %esi, %edx
jne continue
movl %ebp, %esp
popl %ebp
ret
This is the code I've written:
void someOperation(int *num, int count) //Given
{
    int k; //Given
    count--;
    int i = 0;
    k = num[i];
    i++;
    while(count != i)
    {
        if(k >= num[i])
            k = num[i];
        i++;
    }
    return (k);
}
Looks pretty close to me, although in the ASM the increment is only at the beginning of the loop, and the condition is not checked the first time through. Consider using DO...WHILE instead.
EDIT: also, check the direction of your assignment against the operand order. In AT&T syntax, MOV copies from the first operand to the second, so movl (%ebx, %esi, 4), %eax loads num[i] into %eax, which is what k = num[i] does in your C code.
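A minimal sketch of the do...while shape suggested above, assuming the routine is meant to return the value left in %eax (so the return type is int rather than the given void). The name some_operation_c is hypothetical, and the sketch inherits the assembly's assumption that count is at least 2.

```c
#include <assert.h>

/* Hypothetical C rendering of _someOperation; not an official solution.
 * AT&T operand order is "source, destination", so each movl into %eax
 * becomes an assignment to k. */
int some_operation_c(const int *num, int count)
{
    assert(count >= 2);      /* the assembly reads num[1] unconditionally */

    int last = count - 1;    /* decl %edx */
    int i = 0;               /* xorl %esi, %esi */
    int k = num[i];          /* movl (%ebx,%esi,4), %eax */

    do {
        i++;                 /* incl %esi */
        if (k >= num[i])     /* jl skips the move when k < num[i] */
            k = num[i];      /* movl (%ebx,%esi,4), %eax */
    } while (i != last);     /* cmp %esi, %edx ; jne continue */

    return k;                /* result is left in %eax */
}
```

Read this way, the routine returns the smallest element: for {5, 3, 8, 1, 9} with count 5 the result is 1.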
I've been trying to translate this function to assembly:
void foo (int a[], int n) {
int i;
int s = 0;
for (i=0; i<n; i++) {
s += a[i];
if (a[i] == 0) {
a[i] = s;
s = 0;
}
}
}
But something is going wrong.
Here is what I've done so far:
.section .text
.globl foo
foo:
.L1:
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
movl $0, -16(%rbp) /*s*/
movl $0, -8(%rbp) /*i*/
jmp .L2
.L2:
cmpl -8(%rbp), %esi
jle .L4
leave
ret
.L3:
addl $1, -8(%rbp)
jmp .L2
.L4:
movl -8(%rbp), %eax
imull $4, %eax
movslq %eax, %rax
addq %rdi, %rax
movl (%rax), %eax
addl %eax, -16(%rbp)
cmpl $0, %eax
jne .L3
/* if */
leaq (%rax), %rdx
movl -16(%rbp), %eax
movl %eax, (%rdx)
movl $0, -16(%rbp)
jmp .L3
I am compiling the .s module together with a .c module, for example with int nums[5] = {65, 23, 11, 0, 34}, and I'm getting back the same array instead of {65, 23, 11, 99, 34}.
Could someone help me?
Presumably you have a compiler that can generate AT&T syntax. It might be more instructive to look at what assembly output the compiler generates. Here's my re-formulation of your demo:
#include <stdio.h>
void foo (int a[], int n)
{
for (int s = 0, i = 0; i < n; i++)
{
if (a[i] != 0)
s += a[i];
else
a[i] = s, s = 0;
}
}
int main (void)
{
int nums[] = {65, 23, 11, 0, 34};
int size = sizeof(nums) / sizeof(int);
foo(nums, size);
for (int i = 0; i < size; i++)
fprintf(stdout, i < (size - 1) ? "%d, " : "%d\n", nums[i]);
return (0);
}
Code compiled without optimizations is typically harder to work through than optimized code, since it loads every value from memory and spills every result back to it. You won't learn much from it if you're investing time in learning how to write efficient assembly.
Compiling on the Godbolt Compiler Explorer with -O2 yields much more efficient code; the site is also useful for cutting out unnecessary directives, labels, etc., that would be visual noise in this case.
In my experience, -O2 is clever enough to make you rethink your use of registers, refactoring, etc. -O3 can sometimes optimize too aggressively (unrolling loops, vectorizing, etc.) to follow easily.
Finally, for the case you have presented, there's a perfect compromise: -Os, which enables many of the optimizations of -O2, but not at the expense of increased code size. I'll paste the assembly here just for comparative purposes:
foo:
xorl %eax, %eax
xorl %ecx, %ecx
.L2:
cmpl %eax, %esi
jle .L7
movl (%rdi,%rax,4), %edx
testl %edx, %edx
je .L3
addl %ecx, %edx
jmp .L4
.L3:
movl %ecx, (%rdi,%rax,4)
.L4:
incq %rax
movl %edx, %ecx
jmp .L2
.L7:
ret
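Remember that the System V AMD64 calling convention passes the pointer (a) in %rdi and the count (n) in %esi, the low 32 bits of %rsi. Notice how the compiler indexes a[i] through %rdi with a single scaled addressing mode, (%rdi,%rax,4), and that it uses jle to leave the loop. In your .L2, the same cmpl/jle pair jumps into the loop body instead, so with n = 5 and i = 0 the branch is not taken and the function returns before touching the array. It's definitely worth stepping through the code, even with pen and paper if it helps, to understand the branch conditions and how reading and writing are performed on element a[i].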
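One way to do that pen-and-paper walk is to transliterate the -Os listing above into C, one variable per register. This is only a sketch: the name foo_os_trace and the variable names are made up here, and each comment maps a statement back to the instruction it stands for.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register-by-register rendering of the -Os listing. */
void foo_os_trace(int a[], int n)
{
    assert(a != NULL || n <= 0);

    long i = 0;                 /* xorl %eax, %eax  (index in %rax) */
    int  s = 0;                 /* xorl %ecx, %ecx  (running sum)   */

    while (n > i) {             /* cmpl %eax, %esi ; jle .L7        */
        int t = a[i];           /* movl (%rdi,%rax,4), %edx         */
        if (t != 0)             /* testl %edx, %edx ; je .L3        */
            t += s;             /* addl %ecx, %edx                  */
        else
            a[i] = s;           /* movl %ecx, (%rdi,%rax,4); t is 0 */
        i++;                    /* incq %rax                        */
        s = t;                  /* movl %edx, %ecx                  */
    }
}
```

Note how the sum is rebuilt in the temporary and copied back on every iteration; when the element is zero, the temporary is already zero, which is what resets the sum.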
Curiously, using the inner loop of your code:
s += a[i];
if (a[i] == 0)
a[i] = s, s = 0;
appears to generate more efficient code with -Os than the inner loop I used:
foo:
xorl %eax, %eax
xorl %edx, %edx
.L2:
cmpl %eax, %esi
jle .L6
movl (%rdi,%rax,4), %ecx
addl %ecx, %edx
testl %ecx, %ecx
jne .L3
movl %edx, (%rdi,%rax,4)
xorl %edx, %edx
.L3:
incq %rax
jmp .L2
.L6:
ret
A reminder for me to keep things simple!
I am trying to write a function that converts decimal numbers into binary in assembler. Since printing is so troublesome in there, I have decided to make a separate function in C that just prints the numbers. But when I run the code, it always prints '0110101110110100'
Here's the C function (both print and conversion):
#include <stdio.h>
void printBin(int x) {
    printf("%d", x);
}
void DecToBin(int n)
{
    // Size of an integer is assumed to be 16 bits
    for (int i = 15; i >= 0; i--) {
        int k = n >> i;
        printBin(k & 1);
    }
}
Here's the code in asm:
.globl _DecToBin
.extern _printBin
_DecToBin:
pushl %ebp
movl %esp, %ebp
movl 8(%ebp),%eax
movl $15, %ebx
cmpl $0, %ebx
jl end
start:
movl %ebx, %ecx
movl %eax, %edx
shrl %cl, %eax
andl $1, %eax
pushl %eax
call _printBin
movl %edx, %eax
dec %ebx
cmpl $0, %ebx
jge start
end:
movl %ebp, %esp
popl %ebp
ret
I can't figure out where the mistake is. Any help would be appreciated.
disassembled code using online program
Your principal problem is that %edx is very unlikely to be preserved across the function call to printBin: it is a caller-saved register, so the callee is free to overwrite it.
Also:
%ebx is not a volatile register in most (any?) C calling conventions. You need to check your compiler's documentation and conform to it.
If you are going to use %ebx, you need to save and restore it.
The stack pointer needs to be kept aligned to 16 bytes. On my machine (macOS), it SEGVs under printBin if you don't.
I have a method method(int a, int b) in x86 assembly code:
method:
pushl %ebx
subl $24 , %esp
movl 32(%esp ) , %ebx
movl 36(%esp ) , %edx
movl $1 , %eax
testl %edx , %edx
je .L2
subl $1 , %edx
movl %edx , 4(%esp )
movl %ebx , (%esp )
call method
imull %ebx , %eax
.L2:
addl $24 , %esp
popl %ebx
ret
But I just can't wrap my head around what it does.
a is loaded into %ebx, b is loaded into %edx.
%eax is initialized with 1.
If %edx is not 0, I subtract 1 from %edx, store %edx and %ebx on the stack, and call method again. I just don't understand what it does. And isn't it impossible to reach the line imull %ebx, %eax?
I would be really happy if someone could explain the basic function of this method to me.
It is basically equivalent to the following C function:
int method(int a, int b)
{
if (b == 0)
return 1;
return method(a, b-1) * a;
}
or closer to the assembly code:
int method(int a, int b)
{
if (b == 0)
return 1;
int temp = method(a, b-1);
return temp * a; // we do get here when the recursion is over,
// the same way as we get to imull %ebx, %eax
// in your assembly code when the recursion is over
}
imull %ebx, %eax is reached when the recursive call returns.
The function appears to compute a raised to the power b (a^b) via recursion; the value is returned in %eax.
The way this works is that the base case is when b is 0, and 1 is returned. When b > 0, method(a, b-1) * a is returned.
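To check that reading, the recursive form can be compared against a plain loop. This is a sketch: pow_loop is a hypothetical helper, not part of the original code, and both functions assume b >= 0 and a result that fits in an int (a negative b would keep recursing until the stack runs out).

```c
#include <assert.h>

/* Recursive form, as read off the assembly. */
int method(int a, int b)
{
    assert(b >= 0);              /* negative b never reaches b == 0 in practice */
    if (b == 0)
        return 1;                /* movl $1, %eax ; je .L2 */
    return method(a, b - 1) * a; /* call method ; imull %ebx, %eax */
}

/* Hypothetical iterative equivalent: start at 1 and multiply by a, b times. */
int pow_loop(int a, int b)
{
    int result = 1;
    for (int i = 0; i < b; i++)
        result *= a;
    return result;
}
```

For example, method(2, 10) and pow_loop(2, 10) both give 1024. The multiplication happens on the way back out of the recursion, once per pending call.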
So I've been working on a problem (and before you ask, yes, it is homework, but I've been putting in faithful effort!) where I have some assembly code and want to be able to convert it (as faithfully as possible) to C.
Here is the assembly code:
A1:
pushl %ebp
movl %esp, %ebp
subl $16, %esp
movl $0, -4(%ebp)
jmp .L2
.L4:
movl -4(%ebp), %eax
sall $2, %eax
addl 8(%ebp), %eax
movl (%eax), %eax
cmpl 12(%ebp), %eax
jg .L6
.L2:
movl -4(%ebp), %eax
cmpl 16(%ebp), %eax
jl .L4
jmp .L3
.L6:
nop
.L3:
movl -4(%ebp), %eax
leave
ret
And here's some of the C code I wrote to mimic it:
int A1(int a, int b, int c) {
int local = 0;
while(local < c) {
if(b > (int*)((local << 2) + a)) {
return local;
}
}
return local;
}
I have a few questions about how assembly works.
First, I notice that in L4, the body of the while loop, nothing is ever assigned to local. It's initialized to be 0 at the start of the function, and then never modified again. Looking at the C code I made for it, though, that seems odd, considering that the loop will go on indefinitely if the if-condition fails. Am I missing something there? I was under the impression that you'd need a snippet of code like:
movl %eax, -4(%ebp)
in order to actually assign anything to the local variable, and I don't see anything like that in the body of the while loop.
Secondly, you'll see that in the assembly code, the only local variable that's declared is "local". Hence, I have to use a snippet of code like:
if(b > (int*)((local << 2) + a))
The output of this line doesn't look much like the assembly code, though, and I think I might have made a mistake. What did I do wrong here?
And finally (thanks for your patience!), on a related note, I understand that the purpose of this if-loop in the while loop is to break out if the condition is fulfilled, and then to return local. Hence L6 and "nop" (which is basically saying nothing). However, I don't know how to replicate this in my program. I've tried "break", and I've tried returning local as you see here. I understand the functionality - I just don't know how to replicate it in C (short of using goto, but that kind of defeats the purpose of the exercise...).
Thank you for your time!
This is my guess:
int A1 (int *a, int value, int size)
{
int i = 0;
while (i<size)
{
if (a[i] <= value)
break;
}
return i;
}
Which, compiled back to assembly, gives me this code:
A1:
.LFB0:
pushl %ebp
movl %esp, %ebp
subl $16, %esp
movl $0, -4(%ebp)
jmp .L2
.L4:
movl -4(%ebp), %eax
leal 0(,%eax,4), %edx
movl 8(%ebp), %eax
addl %edx, %eax
movl (%eax), %eax
cmpl 12(%ebp), %eax
jg .L2
jmp .L3
.L2:
movl -4(%ebp), %eax
cmpl 16(%ebp), %eax
jl .L4
.L3:
movl -4(%ebp), %eax
leave
ret
This is very close to your original ASM code; the main difference is in the code starting at .L4. If we annotate both versions:
ORIGINAL
movl -4(%ebp), %eax ;EAX = local
sall $2, %eax ;EAX = EAX*4
addl 8(%ebp), %eax ;EAX = EAX+a, hence EAX=a+local*4
ASM-C-ASM
movl -4(%ebp), %eax ;EAX = i
leal 0(,%eax,4), %edx ;EDX = EAX*4
movl 8(%ebp), %eax ;EAX = a
addl %edx, %eax ;EAX = EAX+EDX, hence EAX=a+i*4
Both codes continue with
movl (%eax), %eax
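Because of this, I guess a is actually a pointer to some variable type that uses 4 bytes. From the comparison between the second argument and the value read from memory, I guess that type must be either int or long. I chose int solely for convenience.
Of course this also means that this code (and the original one) does not make much sense: it lacks the i++ part somewhere, so it loops forever whenever the first test does not leave the loop. If the i++ is indeed missing, then a is an array, and the third argument is the size of the array. I've named my local variable i to keep with the tradition of naming index variables like this.
As written, my version scans the array for a value that is less than or equal to value. Note, however, that the original listing leaves the loop on jg, that is, when a[i] > value, whereas my compiled version stays in the loop on jg. To match the original, the test should be if (a[i] > value) break;, which (once the i++ is restored) searches for the first element greater than value. If it finds one, the index of that element is returned. If not, the size of the array is returned.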
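If the missing i++ is put back, the intent becomes a linear search. A minimal sketch under that assumption follows; the name first_greater is hypothetical, and the test a[i] > value follows the original listing, which leaves the loop on jg.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical repaired version: index of the first element greater
 * than value, or size if there is none. The i++ is an assumption; it
 * does not appear in the original assembly. */
int first_greater(const int *a, int value, int size)
{
    assert(a != NULL || size <= 0);

    int i = 0;                   /* movl $0, -4(%ebp) */
    while (i < size) {           /* cmpl 16(%ebp), %eax ; jl .L4 */
        if (a[i] > value)        /* cmpl 12(%ebp), %eax ; jg .L6 */
            break;
        i++;                     /* assumed: missing from the listing */
    }
    return i;                    /* movl -4(%ebp), %eax */
}
```

With a = {1, 4, 2, 9, 3} and value = 3 this returns 1; with value = 9 nothing qualifies and it returns the size, 5.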
I have this IA32 assembly language code I'm trying to convert into regular C code.
.globl fn
.type fn, #function
fn:
pushl %ebp #setup
movl $1, %eax #setup 1 is in A
movl %esp, %ebp #setup
movl 8(%ebp), %edx # pointer X is in D
cmpl $1, %edx # (*x > 1)
jle .L4
.L5:
imull %edx, %eax
subl $1, %edx
cmpl $1, %edx
jne .L5
.L4:
popl %ebp
ret
The trouble I'm having is deciding what type of comparison is going on. I don't get how the program gets to the .L5 label. .L5 seems to be a loop since there's a comparison within it. I'm also unsure of what is being returned, because it seems like most of the work is done in the %edx register but never goes back to %eax for returning.
What I have so far:
int fn(int x)
{
}
It looks to me like it's computing a factorial. Ignoring the stack frame manipulation and such, we're left with:
movl $1, %eax #setup 1 is in A
Puts 1 into eax.
movl 8(%ebp), %edx # pointer X is in D
Retrieves a parameter into edx.
imull %edx, %eax
Multiplies eax by edx, putting the result into eax.
subl $1, %edx
cmpl $1, %edx
jne .L5
Decrements edx and repeats if edx != 1.
In other words, this is roughly equivalent to:
unsigned fact(unsigned input) {
unsigned retval = 1;
for ( ; input != 1; --input)
retval *= input;
return retval;
}
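A slightly closer sketch keeps the initial cmpl $1, %edx / jle .L4 guard, which skips the loop entirely for arguments of 1 or less. Since jle is a signed comparison, the parameter is taken to be int here; that choice, and the assumption of a 32-bit int whose range the result must fit, are mine rather than something the listing states.

```c
#include <assert.h>

int fn(int x)
{
    int result = 1;          /* movl $1, %eax */

    if (x > 1) {             /* cmpl $1, %edx ; jle .L4 */
        assert(x <= 12);     /* assumes 32-bit int: 13! does not fit */
        do {
            result *= x;     /* imull %edx, %eax */
            x--;             /* subl $1, %edx */
        } while (x != 1);    /* cmpl $1, %edx ; jne .L5 */
    }
    return result;           /* the product is already in %eax */
}
```

This also shows where the return value comes from: %edx only counts down, while every multiplication accumulates into %eax, so nothing needs to be copied back before ret. For example, fn(5) is 120, and fn(0) and fn(-4) both return 1 because the guard skips the loop.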