ATT Assembly to C


I'm learning assembly and I have to translate the following code to C:
pushl %ebp
movl %esp, %ebp
movl 12(%ebp), %eax
imull $886836204, %eax, %edx
movl 8(%ebp), %eax
addl %edx, %eax
addl $629084528, %eax
popl %ebp
ret
I know that it takes two arguments and is in the format int func(int1, int2) {} and it returns something from the addition and multiplication lines. Other than that I'm lost. What does this look like in C?

I'd say int func(int a, int b){return b*886836204+a+629084528;}

I'm pretty sure it's a one-line return because it doesn't allocate stack space for local variables, but the logic seems to be like this:
int function(int x, int y){
int eax;
int edx;
eax = y; /* movl 12(%ebp), %eax: second argument */
edx = 886836204 * eax;
eax = x; /* movl 8(%ebp), %eax: first argument */
eax = eax + edx;
eax = eax + 629084528;
return eax;
}
Something like this:
return y * 886836204 + x + 629084528;
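One way to sanity-check the translation is to mirror the register arithmetic in C. The sketch below assumes a 32-bit int and uses unsigned arithmetic, because imull and addl wrap modulo 2^32 while signed overflow in C is undefined. The name func_wrapping is made up for illustration.

```c
#include <stdint.h>

/* Mirrors the assembly: 8(%ebp) is the first argument, 12(%ebp) the second.
   Unsigned arithmetic wraps modulo 2^32, like imull/addl do. */
int32_t func_wrapping(int32_t a, int32_t b)
{
    uint32_t edx = (uint32_t)b * 886836204u;  /* imull $886836204, %eax, %edx */
    uint32_t eax = (uint32_t)a + edx;         /* addl %edx, %eax */
    eax += 629084528u;                        /* addl $629084528, %eax */
    /* Converting an out-of-range value back to int32_t is
       implementation-defined; GCC and Clang wrap. */
    return (int32_t)eax;
}
```

For small inputs this agrees with the one-line return, for example func_wrapping(0, 0) gives 629084528.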

Related

Sign extension in assembly IA32

I'm new to assembly and I'm using IA32 architecture.
I'm trying code a .s function that produces the following operation: C + A - D + B
A is an 8-bit variable
B is a 16-bit variable
C and D are both 32-bit variables
The function should return a 64-bit value that must be printed in C and I can't figure this out.
I am trying this and it works for the tests of positive numbers, but when I'm working with negative numbers it doesn't work and I can't figure out the reason.
My function sum_and_subtract.s:
.section .data
.global A
.global B
.global C
.global D
.section .text
.global sum_and_subtract
# short sum_and_subtract(void)
sum_and_subtract:
#prologue
pushl %ebp
movl %esp, %ebp
pushl %ebx
#body of the function
movl $0, %eax # clear eax
movl C, %eax
movl $0, %ecx
movb A, %cl
addl %ecx, %eax
movl $0, %edx
movl D, %edx
subl %edx, %eax
movl $0, %ebx
movw B, %bx
addl %ebx, %eax
movl $0, %edx
adcl $0, %edx
cdq
#epilogue
fim:
popl %ebx
movl %ebp, %esp
popl %ebp
ret
A correct example is:
A = 0, B = 1, C = 0, D = 0; Expected = 1 -> Result = 1 and It works for this example
The error appears when:
A = 0, B = 0, C = 0, D = 1; Expected = -1 -> Result = 256
After seeing your comments, I realized I forgot to include my main code, where I print my long long result.
main.c :
#include <stdio.h>
#include "sum_and_subtract.h"
char A = 0;
short B = 0;
long C = 0;
long D = 1;
int main(void) {
printf("A = %d\n", A);
printf("B = %hd\n", B);
printf("C = %ld\n", C);
printf("D = %ld\n", D);
long long result = sum_and_subtract();
printf("Result = %lld\n", result);
return 0;
}
Here it is.
I have this other file sum_and_subtract.h
long long sum_and_subtract(void);

I would follow what the C compiler generates (this is x86-64 output, shown first in AT&T syntax):
doit:
movsbl A(%rip), %eax
movswl B(%rip), %edx
addl C(%rip), %eax
subl D(%rip), %eax
addl %edx, %eax
cltq
ret
The same code in Intel syntax:
movsx eax, BYTE PTR A[rip]
movsx edx, WORD PTR B[rip]
add eax, DWORD PTR C[rip]
sub eax, DWORD PTR D[rip]
add eax, edx
cdqe
ret
It "prints" -1
This is more of a comment than an answer, so please do not upvote it. Feel free to downvote.
The complete code with print: https://godbolt.org/z/3a9YMo
And with the printing code: https://godbolt.org/z/KT8YWT
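The difference between the movsbl/movswl loads above and a plain movb/movw into a cleared register can be sketched in C. This is only an illustration of sign extension versus zero extension, not a diagnosis of the 256 result; the function names are hypothetical, and both versions widen to 64 bits before adding, which the compiler output above does not do.

```c
#include <stdint.h>

/* Sign-extending loads (movsbl / movswl): narrow values keep their sign. */
int64_t sum_sign_extended(int8_t a, int16_t b, int32_t c, int32_t d)
{
    return (int64_t)c + a - d + b;
}

/* Zero-extending loads (clear the register, then movb / movw):
   negative narrow values turn into large positive ones. */
int64_t sum_zero_extended(int8_t a, int16_t b, int32_t c, int32_t d)
{
    return (int64_t)c + (uint8_t)a - d + (uint16_t)b;
}
```

With A = -1 and everything else 0, the first function returns -1 and the second returns 255, which is the kind of mismatch that only shows up with negative inputs.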

Explanation of array accessing in X86 assembly

I have the following C function:
int sum_arr(int b[], int size){
int counter = size-1;
int res = 0;
while(counter >= 0){
res = res + b[counter];
counter = counter - 1;
}
return res;
}
From which I generated the following assembly code using:
gcc -Og -S file.c
Out came the following assembly code (I have included the parts of interest only):
sum_arr:
.LFB41:
.cfi_startproc
subl $1, %esi
movl $0, %eax
jmp .L2
.L3:
movslq %esi, %rdx
addl (%rdi,%rdx,4), %eax
subl $1, %esi
.L2:
testl %esi, %esi
jns .L3
rep ret
.cfi_endproc
I am having some trouble with .L3. The way I understand it is that it starts off by moving the int counter from a 32 bit register %esi into a 64 bit register %rdx. Then I don't understand the following line:
addl (%rdi,%rdx,4), %eax
in particular the (%rdi,%rdx,4) part, which gets added to the value in the %eax register.
On the last line it decrements the counter by 1.
Could someone help me out with that part?

.L3:
movslq %esi, %rdx /* sign extend counter<%esi> to 64bit %rdx */
addl (%rdi,%rdx,4), %eax /* res<%eax> += b<%rdi>[counter<%rdx>]; */
subl $1, %esi /* counter<%esi> -= 1 */
.L2:
testl %esi, %esi /* counter<%esi> & counter<%esi>: sets the flags, result is discarded */
jns .L3 /* if the sign flag is clear (counter >= 0), jump to .L3 */
Basically, addl (%rdi,%rdx,4), %eax is where you access the array (%rdi) at index counter (%rdx) and add the value of that element to res (%eax). The 4 is the scale factor applied to counter (%rdx) for the memory access, since each element of the int array occupies 4 bytes on your system.
The line basically says res += MEMORY[addressOf(b) + counter*4]
BTW, I believe you want to check that size > 0 before the line int counter = size-1;. Also, as P__J__ mentioned in his answer, res can overflow, since it has the same type as the elements you are summing.
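The address calculation can also be written out by hand in C. This is a sketch assuming sizeof(int) is 4, as on the asker's system; load_element and sum_arr_explicit are illustrative names, not part of the original code.

```c
/* Computes b[counter] the way the addressing mode (%rdi,%rdx,4) does:
   base address + index * element size, counted in bytes. */
int load_element(const int *b, long counter)
{
    const char *base = (const char *)b;                     /* %rdi */
    const char *addr = base + counter * (long)sizeof(int);  /* + %rdx * 4 */
    return *(const int *)addr;
}

/* Same loop as sum_arr, with the element load made explicit. */
int sum_arr_explicit(const int *b, int size)
{
    int res = 0;
    for (int counter = size - 1; counter >= 0; counter--)
        res += load_element(b, counter);
    return res;
}
```

Calling sum_arr_explicit on an array holding 1, 2, 3, 4 with size 4 gives the same result as sum_arr.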

In this form (Intel syntax) it is easier to understand:
sum_arr:
sub esi, 1
js .L4
movsx rsi, esi
mov eax, 0
.L3:
add eax, DWORD PTR [rdi+rsi*4]
sub rsi, 1
test esi, esi
jns .L3
ret
.L4:
mov eax, 0
ret
Two remarks:
The int sum is very likely to overflow, so you should use long long for the accumulator and the return value. The function can be shortened as well:
#include <stddef.h> /* size_t */
long long sum_arr(const int *b, size_t size){
long long res = 0;
while(size--){
res = res + *b++;
}
return res;
}

Why does GCC move variables to a temporary location before assigning them?

When looking at some decompiled C code I saw this:
movl -0xc(%rbp), %esi
movl %esi, -0x8(%rbp)
This corresponds to this C code:
x = y;
This got me thinking: why does gcc move y to %esi and then move %esi to x, instead of just moving y to x directly?
This is the entire C and decompiled code, if it matters:
C
int main(void) {
int x, y, z;
while(1) {
x = 0;
y = 1;
do {
printf("%d\n", x);
z = x + y;
x = y;
y = z;
} while(x < 255);
}
}
Decompiled
pushq %rbp
movq %rsp, %rbp
subq $0x20, %rsp
movl $0x0, -0x4(%rbp)
movl $0x0, -0x8(%rbp) ; x = 0
movl $0x1, -0xc(%rbp) ; y = 1
; printf
leaq 0x56(%rip), %rdi
movl -0x8(%rbp), %esi
movb $0x0, %al
callq 0x100000f78
; z = x + y
movl -0x8(%rbp), %esi ; x -> esi
addl -0xc(%rbp), %esi ; y + esi
movl %esi, -0x10(%rbp) ; z = esi
; x = y
movl -0xc(%rbp), %esi
movl %esi, -0x8(%rbp)
; y = z
movl -0x10(%rbp), %esi
movl %esi, -0xc(%rbp)
movl %eax, -0x14(%rbp) ; not sure... I believe printf return value?
cmpl $0xff, -0x8(%rbp) ; x < 255
jl 0x100000f3d ; do...while(x < 255)
jmp 0x100000f2f ; while(1)

Most x86 instructions (other than some specialized instructions such as movsb) can only access one memory location. Therefore a move from memory to memory requires going through a register with two mov instructions.
The mov instruction can be used in the following ways (written destination first, as in Intel syntax):
mov mem, reg
mov reg, mem
mov reg, reg
mov reg, imm
mov mem, imm
There is no mov mem, mem.
Note that if you had compiled with optimizations, the variables would be placed in registers so this wouldn't be an issue.
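In C terms, the two instructions correspond to a load into a temporary followed by a store. A minimal sketch, with copy_int as a made-up name:

```c
/* x = y, spelled out as the two steps the CPU performs. */
void copy_int(int *x, const int *y)
{
    int tmp = *y;  /* movl -0xc(%rbp), %esi : memory -> register */
    *x = tmp;      /* movl %esi, -0x8(%rbp) : register -> memory */
}
```

The temporary plays the role of %esi; the assignment x = y hides it in the source code.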

Factorial Function in Assembly

So I'm trying to create a factorial function in assembly.
In C:
#include<stdio.h>
int fat (int n)
{
if (n==0) return 1;
else return n*fat(n-1);
}
int main (void){
printf("%d\n", fat(4));
return 0;
}
In Assembly:
.text
.global fat
fat:push %ebp
mov %esp, %ebp
movl $1,%eax
movl 4(%ebp),%edx
LOOP:cmp $0,%edx
je FIM
sub $1,%edx
push %edx
call fat
imul %edx,%eax
FIM:mov %ebp, %esp
pop %ebp
ret
I keep getting a segmentation fault and I don't know why. Can someone help me?

The offset is probably wrong in this line:
movl 4(%ebp),%edx
The stack has the previous value of %ebp and the return address already, so your offset is going to have to be more than 4.
I recommend stepping through the assembly code with a debugger and making sure that all the register values are exactly what you expect them to be. You will also have problems with the %edx register across calls unless you save and restore its value.

fat:push %ebp
mov %esp, %ebp
movl $1,%eax
movl 4(%ebp),%edx /* Must be 8(%ebp) because of the return address! */
LOOP:cmp $0,%edx
je FIM
sub $1,%edx
push %edx
call fat /* The call to fat() just trashed edx, oops. Gotta save/restore it! */
imul %edx,%eax /* AT&T order: eax = eax * edx, so the result is in eax, but edx must still hold n here! */
/* Why isn't "push %edx" compensated here with "pop" or "addl $4,%esp"??? */
FIM:mov %ebp, %esp
pop %ebp
ret
Rewriting your C function, assemblyish style, may be helpful:
int fat (int n)
{
int eax, edx, savedEdx;
eax = 1;
edx = n; /* n = 8(%ebp) */
if (edx == 0)
goto done;
savedEdx = edx; /* can do this with pushl %edx */
--edx;
eax = fat(edx); /* pushl %edx; call fat; addl $4, %esp or popl %edx */
edx = savedEdx; /* popl %edx */
eax *= edx; /* can do this with imul %edx */
done:
return eax;
}

Interpret Assembly Code

I found the following assembly code and I have no idea what it is supposed to be doing (mainly because cmovg follows the movl instruction ):
pushl %ebp
movl %esp, %ebp
movl 8(%ebp), %edx
movl %edx, %eax
sarl $31, %eax
testl %edx, %edx
movl $1, %edx
cmovg %edx, %eax
popl %ebp
ret
So here is how I have interpreted it so far:
pushes onto stack
a new pointer (stack pointer) creates to point at the same location as base pointer
gets the input (let's call it x)
copies x into register %eax (res = x)
res = res >> 31 sign extension
tests x
sets x = 1
if >, res = x
restores pointer
returns res
However, I am not sure what the significance of this subroutine is. To me it seems useless. I would appreciate it if you could point out what is being done here.

This code returns the sign of x. In C:
int sign(int x) {
if (x>0)
return 1;
else if (x==0)
return 0;
else
return -1;
}
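The instruction sarl $31, %eax will put -1 in eax if x was negative, or 0 otherwise. Then the cmovg instruction will replace this value with 1 if x was positive. The movl $1, %edx in between does not modify the flags, so cmovg still acts on the result of testl %edx, %edx.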
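The same steps can be written in C in the order the assembly performs them. This is a sketch: sign_steps is an illustrative name, and the conditional expression stands in for the arithmetic shift, since right-shifting a negative int is implementation-defined in C.

```c
/* Step-by-step version of the assembly above. */
int sign_steps(int x)
{
    int res = (x < 0) ? -1 : 0;  /* sarl $31, %eax: all ones if negative, else 0 */
    if (x > 0)                   /* testl %edx, %edx sets the flags from x */
        res = 1;                 /* cmovg %edx, %eax, with edx already set to 1 */
    return res;
}
```

For every input this returns the same value as the three-branch sign function.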