Storing Local variables in Assembly - c

So I've been working on a problem (and before you ask, yes, it is homework, but I've been putting in faithful effort!) where I have some assembly code and want to be able to convert it (as faithfully as possible) to C.
Here is the assembly code:
A1:
pushl %ebp
movl %esp, %ebp
subl $16, %esp
movl $0, -4(%ebp)
jmp .L2
.L4:
movl -4(%ebp), %eax
sall $2, %eax
addl 8(%ebp), %eax
movl (%eax), %eax
cmpl 12(%ebp), %eax
jg .L6
.L2:
movl -4(%ebp), %eax
cmpl 16(%ebp), %eax
jl .L4
jmp .L3
.L6:
nop
.L3:
movl -4(%ebp), %eax
leave
ret
And here's some of the C code I wrote to mimic it:
int A1(int a, int b, int c) {
int local = 0;
while(local < c) {
if(b > (int*)((local << 2) + a)) {
return local;
}
}
return local;
}
I have a few questions about how assembly works.
First, I notice that in L4, the body of the while loop, nothing is ever assigned to local. It's initialized to be 0 at the start of the function, and then never modified again. Looking at the C code I made for it, though, that seems odd, considering that the loop will go on indefinitely if the if-condition fails. Am I missing something there? I was under the impression that you'd need a snippet of code like:
movl %eax, -4(%ebp)
in order to actually assign anything to the local variable, and I don't see anything like that in the body of the while loop.
Secondly, you'll see that in the assembly code, the only local variable that's declared is "local". Hence, I have to use a snippet of code like:
if(b > (int*)((local << 2) + a))
The output of this line doesn't look much like the assembly code, though, and I think I might have made a mistake. What did I do wrong here?
And finally (thanks for your patience!), on a related note, I understand that the purpose of this if-loop in the while loop is to break out if the condition is fulfilled, and then to return local. Hence L6 and "nop" (which is basically saying nothing). However, I don't know how to replicate this in my program. I've tried "break", and I've tried returning local as you see here. I understand the functionality - I just don't know how to replicate it in C (short of using goto, but that kind of defeats the purpose of the exercise...).
Thank you for your time!

This is my guess:
int A1 (int *a, int value, int size)
{
int i = 0;
while (i<size)
{
if (a[i] <= value)
break;
}
return i;
}
Which, compiled back to assembly, gives me this code:
A1:
.LFB0:
pushl %ebp
movl %esp, %ebp
subl $16, %esp
movl $0, -4(%ebp)
jmp .L2
.L4:
movl -4(%ebp), %eax
leal 0(,%eax,4), %edx
movl 8(%ebp), %eax
addl %edx, %eax
movl (%eax), %eax
cmpl 12(%ebp), %eax
jg .L2
jmp .L3
.L2:
movl -4(%ebp), %eax
cmpl 16(%ebp), %eax
jl .L4
.L3:
movl -4(%ebp), %eax
leave
ret
Now this seems to be identical to your original ASM code, just the code starting at L4 is not the same, but if we anotate both codes:
ORIGINAL
movl -4(%ebp), %eax ;EAX = local
sall $2, %eax ;EAX = EAX*4
addl 8(%ebp), %eax ;EAX = EAX+a, hence EAX=a+local*4
ASM-C-ASM
movl -4(%ebp), %eax ;EAX = i
leal 0(,%eax,4), %edx ;EDX = EAX*4
movl 8(%ebp), %eax ;EAX = a
addl %edx, %eax ;EAX = EAX+EDX, hence EAX=a+i*4
Both codes continue with
movl (%eax), %eax
Because of this, I guess a is actually a pointer to some variable type that uses 4 bytes. By the comparison between the second argument and the value read from memory, I guess that type must be either int or long. I choose int solely by convenience.
Of course this also means that this code (and the original one) does not make any sense. It lacks the i++ part somewhere. If this is so, then a is an array, and the third argument is the size of the array. I've named my local variable i to keep with the tradition of naming index variables like this.
This code would scan the array searching for a value inside it that is equal or less than value. If it finds it, the index to that value is returned. If not, the size of the array is returned.

Related

Update array in assembly

Just starting to learn assembly, so I am having trouble with some basic. For example, I am trying to modify a file that normally runs to find the maximum value.
What I am trying to do is iterate through the data_items and modify it by changing the list item, while still maintaining its ability to execute to find the maximum value.
I have tried modifying the start_loop to include an addl data_items, 0x01 but it errors out since it only pulls the first value.
I then adjusted it to do something like addl data_items(,%esi,4), 0x01 and iterate through adjusting it but that didn't work either. I know this is probably very simplistic, but I can't seem to find a method that works.
#
.section .data
data_items: #These are the data items trying to modify and store
.long 3,67,34,222,45,75,54,34,44,33,22,11,66,0
.section .text
.globl _start
_start:
movl $0, %edi
movl data_items(,%edi,4), %eax
movl %eax, %ebx
start_loop:
cmpl $0, %eax
je loop_exit
incl %edi
movl data_items(,%edi,4), %eax
cmpl %ebx, %eax
jle start_loop
movl %eax, %ebx
jmp start_loop
loop_exit:
movl $1, %eax
int $0x80
Once you have an array element in %eax just increment the memory that was last used to read %eax from:
start_loop:
cmpl $0, %eax ; Not actual element if this is 0
je loop_exit
incl data_items(,%edi,4) <<<<<<<<<<<<<<<<<
incl %edi
movl data_items(,%edi,4), %eax
This does not mess with the current finding of the max value. Only next time will the max value be 1 greater.

Calling C function from the x86 assembler

I am trying to write a function that converts decimal numbers into binary in assembler. Since printing is so troublesome in there, I have decided to make a separate function in C that just prints the numbers. But when I run the code, it always prints '0110101110110100'
Heres the C function (both print and conversion):
void printBin(int x) {
printf("%d", x);
}
void DecToBin(int n)
{
// Size of an integer is assumed to be 16 bits
for (int i = 15; i >= 0; i--) {
int k = n >> i;
printBin(k & 1);
}
heres the code in asm:
.globl _DecToBin
.extern _printBin
_DecToBin:
pushl %ebp
movl %esp, %ebp
movl 8(%ebp),%eax
movl $15, %ebx
cmpl $0, %ebx
jl end
start:
movl %ebx, %ecx
movl %eax, %edx
shrl %cl, %eax
andl $1, %eax
pushl %eax
call _printBin
movl %edx, %eax
dec %ebx
cmpl $0, %ebx
jge start
end:
movl %ebp, %esp
popl %ebp
ret
Cant figure out where the mistake is. Any help would be appreciated
disassembled code using online program
Your principle problem is that it is very unlikely that %edx is preserved across the function call to printBin.
Also:
%ebx is not a volatile register in most (any?) C calling convention rules. You need to check your compilers documentation and conform to it.
If you are going to use ebx, you need to save and restore it.
The stack pointer needs to be kept aligned to 16 bytes. On my machine (macos), it SEGVs under printBin if you don’t.

Loop Assembly x86

I am a beginner in Assembly(x86 ATT Syntax). I am working on an assignment where I have to go through each index of a 2d array and find the number of 1's. The method takes in 2d int array,int w, int h. How do I implement the loop to go go from 0 to w and bounce out with loop instruction. I know how to do it with a jmp instruction but loop just gives me errors/segFaults. This is my attempt with a jump statement and it works. How would I go about converting the inner loop using a loop instruction?
pushl %ebp
movl %esp, %ebp
movl $0, -4(%ebp)
movl $0, -8(%ebp)
movl $0, -12(%ebp)
movl $0, -4(%ebp) #outside loop
jmp .L10
.L14: #Inner Loop
movl $0, -8(%ebp)
jmp .L11
.L13:
movl -4(%ebp), %eax
leal 0(,%eax,4), %edx
movl 8(%ebp), %eax
addl %edx, %eax
movl (%eax), %eax
movl -8(%ebp), %edx
sall $2, %edx
addl %edx, %eax
movl (%eax), %eax
cmpl $1, %eax
jne .L12
addl $1, -12(%ebp)
.L12:
addl $1, -8(%ebp)
.L11: #check inner loop
movl -8(%ebp), %eax
cmpl 12(%ebp), %eax
jl .L13
addl $1, -4(%ebp)
.L10: #check outside loop
movl -4(%ebp), %eax
cmpl 16(%ebp), %eax
jl .L14
movl -12(%ebp), %eax
leave
ret
Generally using loop has no advantages except maybe smaller code. It's usually slower and less flexible, thus not recommended.
That said, if you still want to use it you should know that it employs the ecx register for counting down to zero. So you need to restructure your code to accommodate that. In your case, that means loading ecx with the value of w and letting it count down. You will also need to apply an offset of -1 during indexing, since your current loop variable goes from 0 to w-1, but ecx will go from w down to 1 (inclusive).
Furthermore, the loop instruction is used after the loop body, that is it implements a do-while loop. To skip the loop body if the count is zero a companion instruction, JECXZ, can be used.
You can use lodsl (which moves %esi up by default) and loop (which moves %ecx down) in tandem. I am not sure if it is more efficient than what gcc generates from c, which is basically your code, but it looks prettier.
What I have done here doesn't answer your question precisely -- rather than use loop for the inner loop I have assumed the whole array is stored contiguously and then there is only a single loop to worry about. When compiling from c on my machine, it is stored contiguously, but I'm not sure you should rely on that. Hopefully what I have done gives you enough to understand how loop and lodsl work, and you can modify your code to use them just in the inner loop.
.data
x:
.long 6
y:
.long 5
array:
.long 1,1,1,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,1,1
.text
.global _start
_start:
# set up for procedure call
push x
push y
push $array
# call and cleanup
call cnt
add $0xc, %esp
# put result in %ebx and finish up
#(echo $? gives value in bash if <256)
mov %eax, %ebx
mov $1, %eax
int $0x80
# %ebx will hold the count of 1s
# %ecx will hold the number of elements to check
# %esi will hold the address of the first element
# Assumes elements are stored contiguously in memory
cnt:
# do progogue
enter $0, $1
# set %ebx to 0
xorl %ebx, %ebx
# grab x and y parameters from stack and
# multiply together to get the number of elements
# in the array
movl 0x10(%ebp), %eax
movl 0xc(%ebp), %ecx
mul %ecx
movl %eax, %ecx
# get address of first element in array
movl 0x8(%ebp), %esi
getel:
# grab the value at the address in %esi and increment %esi
# it is put in %eax
lodsl
# if the value in %eax is 1, increment %ebx
cmpl $1, %eax
jne ne
incl %ebx
ne:
# decrement %ecx and if it is greater than 0, keep going
loop getel
# %ecx is zero so we are done. Put the count in %eax
movl %ebx, %eax
# do epilogue
leave
ret
It really is prettier without all the comments.
.data
x:
.long 6
y:
.long 5
array:
.long 1,1,1,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,1,1
.text
.global _start
_start:
push x
push y
push $array
call cnt
add $0xc, %esp
mov %eax, %ebx
mov $1, %eax
int $0x80
cnt:
enter $0, $1
xorl %ebx, %ebx
movl 0x10(%ebp), %eax
movl 0xc(%ebp), %ecx
mul %ecx
movl %eax, %ecx
movl 0x8(%ebp), %esi
getel:
lodsl
cmpl $1, %eax
jne ne
incl %ebx
ne:
loop getel
movl %ebx, %eax
leave
ret

I'm trying to interpret this IA32 assembly language code

I have this IA32 assembly language code I'm trying to convert into regular C code.
.globl fn
.type fn, #function
fn:
pushl %ebp #setup
movl $1, %eax #setup 1 is in A
movl %esp, %ebp #setup
movl 8(%ebp), %edx # pointer X is in D
cmpl $1, %edx # (*x > 1)
jle .L4
.L5:
imull %edx, %eax
subl $1, %edx
cmpl $1, %edx
jne .L5
.L4:
popl %ebp
ret
The trouble I'm having is deciding what type of comparison is going on. I don't get how the program gets to the L5 cache. L5 seems to be a loop since there's a comparison within it. I'm also unsure of what is being returned because it seems like most of the work is done is the %edx register, but doesn't go back to %eax for returning.
What I have so far:
int fn(int x)
{
}
It looks to me like it's computing a factorial. Ignoring the stack frame manipulation and such, we're left with:
movl $1, %eax #setup 1 is in A
Puts 1 into eax.
movl 8(%ebp), %edx # pointer X is in D
Retrieves a parameter into edx
imull %edx, %eax
Multiplies eax by edx, putting the result into eax.
subl $1, %edx
cmpl $1, %edx
jne .L5
Decrements edx and repeats if edx != 1.
In other words, this is roughly equivalent to:
unsigned fact(unsigned input) {
unsigned retval = 1;
for ( ; input != 1; --input)
retval *= input;
return retval;
}

Deeper function profiling/emulation

Hello everyone
Merry Christmas
I need an advice I have the following code:
int main()
{
int k=5000000;
int p;
int sum=0;
for (p=0;p<k;p++)
{
sum+=p;
}
return 0;
}
When I assemble it I get
main:
pushl %ebp
movl %esp, %ebp
subl $16, %esp
movl $5000000, -4(%ebp)
movl $0, -12(%ebp)
movl $0, -8(%ebp)
jmp .L2
.L3:
movl -8(%ebp), %eax
addl %eax, -12(%ebp)
addl $1, -8(%ebp)
.L2:
movl -8(%ebp), %eax
cmpl -4(%ebp), %eax
jl .L3
movl $0, %eax
leave
ret
If I run it through gprof I get that main executed the most, which is quite obvious!
Yet I want to go a step further and be able to know if L2, or L3 executed the most. here it is obvious that L3 executed the most. yet is there some kind of profiler, emulator that can give me that data for an entire code?
Well, if you don't mind low-tech, you can answer any such question either by single-stepping, or by this way.
For the latter method, it doesn't matter how big or complicated your program is.

Resources