I'm trying to learn some assembly.
My goal is to write an external assembly function that reads an array of char, casts the values to int and then performs various operations on them, just to learn something.
I've made many attempts, but I think I'm missing the point.
code:
#include <stdio.h>
#define SIZE 5
extern int foo(char array[]);
int main(void){
char array[SIZE]={'0','1','1','0','1'};
printf("GAS said: %c\n", foo(array));
return 0;
}
assembly:
.data
.text
.global foo
foo:
pushl %ebp
movl %esp, %ebp
movl 8(%esp), %eax #saving in eax the pointer of the array
movl (%eax), %eax #saving in eax the first char of the array
popl %ebp
ret
What puzzles me is this. When I use, as in this case,
printf("GAS said: %c\n", foo(array));
the output is, as expected, GAS said: 0
Based on this, I expected that changing it to
printf("GAS said: %i\n", foo(array));
would output GAS said: 48, but instead I get what looks like a random address.
Also, in the assembly file, I can't explain why, if I try
cmpl $48, %eax
je LABEL
the jump never happens.
The only thing I can think of is a size problem, since an int takes 4 bytes and a char only 1 byte, but I'm not sure.
So, how can I do the comparison and return an int to main in this case?
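The size suspicion above can be checked from the C side. The sketch below is a hedged illustration, not the asker's code: load_dword and load_byte_zx are hypothetical helper names that mimic a 4-byte movl (%eax), %eax load and a 1-byte zero-extending load (movzbl (%eax), %eax in GAS syntax) applied to the same array.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: mimics `movl (%eax), %eax`, which reads 4 bytes. */
static uint32_t load_dword(const char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);   /* copies array[0..3] into one 32-bit value */
    return v;
}

/* Hypothetical helper: mimics `movzbl (%eax), %eax`, which reads 1 byte
   and zero-extends it to 32 bits. */
static uint32_t load_byte_zx(const char *p)
{
    return (unsigned char)p[0];
}
```

With the array {'0','1','1','0','1'}, the 4-byte load yields 0x30313130. Its low byte is 0x30, which is why %c still prints 0, but the full value is not 48, so %i prints a large number and cmpl $48, %eax never matches. The 1-byte load yields 48, which suggests that a byte-sized load is what the assembly function needs.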
I am writing a C program that calls an x86 Assembly function which adds two numbers. Below are the contents of my C program (CallAssemblyFromC.c):
#include <stdio.h>
#include <stdlib.h>
int addition(int a, int b);
int main(void) {
int sum = addition(3, 4);
printf("%d", sum);
return EXIT_SUCCESS;
}
Below is the code of the Assembly function (my idea is to code from scratch the stack frame prologue and epilogue, I have added comments to explain the logic of my code) (addition.s):
.text
# Here, we define a function addition
.global addition
addition:
# Prologue:
# Push the current EBP (base pointer) to the stack, so that we
# can reset the EBP to its original state after the function's
# execution
push %ebp
# Move the EBP (base pointer) to the current position of the ESP
# register
movl %esp, %ebp
# Read in the parameters of the addition function
# addition(a, b)
#
# Since we are pushing to the stack, we need to obtain the parameters
# in reverse order:
# EBP (return address) | EBP + 4 (return value) | EBP + 8 (b) | EBP + 4 (a)
#
# Utilize advanced indexing in order to obtain the parameters, and
# store them in the CPU's registers
movzbl 8(%ebp), %ebx
movzbl 12(%ebp), %ecx
# Clear the EAX register to store the sum
xorl %eax, %eax
# Add the values into the section of memory storing the return value
addl %ebx, %eax
addl %ecx, %eax
I am getting a segmentation fault, which seems strange considering that I think I am allocating memory in accordance with the x86 calling conventions (e.g. allocating the correct memory sections to the function's parameters). Furthermore, if any of you have a solution, it would be greatly appreciated if you could provide some advice on how to debug an assembly function called from C (I have been using the GDB debugger, but it simply points to the line of the C program where the segmentation fault happens instead of the line in the assembly code).
Your function has no epilogue. You need to restore %ebp and pop the stack back to where it was, and then ret. If that's really missing from your code, then that explains your segfault: the CPU will go on executing whatever garbage happens to be after the end of your code in memory.
You clobber (i.e. overwrite) the %ebx register which is supposed to be callee-saved. (You mention following the x86 calling conventions, but you seem to have missed that detail.) That would be the cause of your next segfault, after you fixed the first one. If you use %ebx, you need to save and restore it, e.g. with push %ebx after your prologue and pop %ebx before your epilogue. But in this case it is better to rewrite your code so as not to use it at all; see below.
movzbl loads an 8-bit value from memory and zero-extends it into a 32-bit register. Here the parameters are int so they are already 32 bits, so plain movl is correct. As it stands your function would give incorrect results for any arguments which are negative or larger than 255.
You're using an unnecessary number of registers. You could move the first operand for the addition directly into %eax rather than putting it into %ebx and adding it to zero. And on x86 it is not necessary to get both operands into registers before adding; arithmetic instructions have a mem, reg form where one operand can be loaded directly from memory. With this approach we don't need any registers other than %eax itself, and in particular we don't have to worry about %ebx anymore.
I would write:
.text
# Here, we define a function addition
.global addition
addition:
# Prologue:
push %ebp
movl %esp, %ebp
# load first argument
movl 8(%ebp), %eax
# add second argument
addl 12(%ebp), %eax
# epilogue
movl %ebp, %esp # redundant since we haven't touched esp, but will be needed in more complex functions
pop %ebp
ret
In fact, you don't need a stack frame for this function at all, though I understand if you want to include it for educational value. But if you omit it, the function can be reduced to
.text
.global addition
addition:
movl 4(%esp), %eax
addl 8(%esp), %eax
ret
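The movzbl point made above can be reproduced in plain C. This is a sketch under assumptions: add_low_bytes and add_full are hypothetical names. The first models what movzbl 8(%ebp) and movzbl 12(%ebp) followed by addl compute (only the low byte of each int is read, assuming a little-endian x86 target); the second models the corrected movl/addl version.

```c
#include <stdint.h>

/* Hypothetical model of the original code: movzbl keeps only the low
   8 bits of each argument and zero-extends them before the addition. */
static int add_low_bytes(int a, int b)
{
    return (int)((uint8_t)a + (uint8_t)b);
}

/* Hypothetical model of the corrected code: full 32-bit load and add. */
static int add_full(int a, int b)
{
    return a + b;
}
```

For addition(3, 4) both versions give 7, which is why the bug is easy to miss. The difference only shows up outside the 0 to 255 range: add_low_bytes(300, 4) gives 48 and add_low_bytes(-1, 4) gives 259, while add_full gives 304 and 3.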
You are corrupting the stack here:
movb %al, 4(%ebp)
To return the value, simply put it in eax. Also, why do you need to clear eax? That's inefficient, as you can load the first value directly into eax and then add to it.
Also, EBX must be saved if you intend to use it, but you don't really need it anyway.
I compiled some simple code with the Intel icc compiler, and I noticed that there are some numbers at the end of each line. I'd like to know what they mean.
For example, #3.12 in the following code.
#include <stdio.h>
int main() {
int a = 3, b;
scanf("%d", &b);
a = a + b;
printf("Hello, world! I am %d\n", a);
return 0;
}
...
main:
..B1.1: # Preds ..B1.0
# Execution count [1.00e+00]
..L1:
#3.12
pushl %ebp #3.12
movl %esp, %ebp #3.12
andl $-128, %esp #3.12
...
It is indeed the line and column of the corresponding source code. #3.12 is the opening { of the main function which makes sense since the shown statements are consistent with the start of a function.
If you insert an extra space before the {, you will see that the output changes to #3.13; likewise, the 3 changes to 4 if you insert an empty line before the main() function.
This is the code that prepares the start of a function, also called the function prologue. It saves the caller's frame pointer on the stack, sets up a new frame, and reserves stack space for the function to work with. Note that the reverse process happens at the end of the function (the epilogue). Here is an example of the same thing from another compiler:
push ebp
mov ebp, esp
sub esp, 8
...
mov esp, ebp
pop ebp
ret 0
I'm new to assembly programming and, as part of a bigger program, I need to pass floating-point values to another C function. My test program calls my assembly function, which only pushes the parameters onto the stack and calls a second C function.
My C test function:
extern void ext_func(char *result, double d); // C function
extern double tester(char *str, float d);
double a = tester(str, 3.14);
printf("%s\n", str); // Resulting in '0.000000'
// doing some fancy stuff with the float value and puts in result
ext_func(str, 3.14); // gives str = "3.140000"
x86, gcc -m32:
.globl tester
tester:
pushl %ebp # Standard
movl %esp, %ebp #
flds 12(%ebp) # Push second parameter on stack
pushl 8(%ebp)
call ext_func
addl $4, %esp
leave
ret
I think there's a problem with me only pushing 32 bits when ext_func expects a double. But I also tried fldl, fld1, fildl, fldl with 12 and 16(%ebp), and some of the others for "fun".
My first question is: is ext_func missing some data on the float stack (ST), and therefore unable to build the float value? (I understand you don't have the callee function, but does it matter what the function does?)
Second, does the compiler always go to the FPU stack to get float values if it expects them, or is it possible to read them from the memory stack?
Third, is there something else I'm missing here? If I
printf("%f", a); //3.140000
printf("%f", str); //3.140000
but the other way around, a gives a big negative number (100 digits or so) ending in 000000.
The 32 bit convention uses the cpu stack to pass floating point arguments. It only uses the fpu stack for returning them. Yes, you should convert your 32 bit float to a 64 bit double, as per the prototypes you provided.
Note that ext_func is void, that is it doesn't return anything, but you declared tester as returning double ... it's unclear what you want returned, I will assume you want the original d back (for whatever reason).
As such, a possible implementation could be:
.globl tester
tester:
subl $12, %esp # allocate space for outgoing arguments
movl 16(%esp), %eax # fetch our first argument (str)
movl %eax, (%esp) # store as first outgoing argument
flds 20(%esp) # Fetch our second argument as float
fstpl 4(%esp) # store it as second outgoing argument as double
call ext_func
flds 20(%esp) # load d as return value
addl $12, %esp # cleanup stack
ret
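For comparison, here is roughly what the assembly above does, written in C. It is only a sketch: the real ext_func is not shown in the question, so the version below is a hypothetical stand-in that formats the value with "%f" and assumes the buffer holds at least 32 bytes.

```c
#include <stdio.h>

/* Hypothetical stand-in for the real ext_func, which is not shown in the
   question; it assumes `result` points to at least 32 bytes. */
static void ext_func(char *result, double d)
{
    snprintf(result, 32, "%f", d);
}

/* C equivalent of the assembly: widen the float to double, pass it on,
   then return the original d (itself widened to double on return). */
static double tester(char *str, float d)
{
    ext_func(str, (double)d);
    return d;
}
```

The explicit (double)d cast corresponds to the flds/fstpl pair: the 4-byte float is loaded and then stored as an 8-byte double in the outgoing argument area. Compiling a C version like this with gcc -m32 -S is one way to see what the compiler generates for the same job.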
I need to pass an address to an assembly function, but it seems I'm not able to do that.
Here's the C file:
int asm_func(void *arg);
struct foo {
int len;
char *buf;
};
int bar(int size, char *buf){
struct foo arg_to_asm_function;
arg_to_asm_function.len = size;
arg_to_asm_function.buf = buf;
return asm_func(&arg_to_asm_function);
}
Here's the assembly:
.global asm_func
asm_func:
pushl %esi
movl 8(%ebp), %esi
/* do something with &arg_to_asm_function, which is in esi */
popl %esi
ret
If I invoke the C function bar with the arguments bar(5, "hello world"), and I stepi to the instruction
movl 8(%ebp), %esi
I get the value 5 in %esi (value of first field in the struct foo).
The expected value in %esi is the pointer to the struct foo that I declared, i.e. &arg_to_asm_function, not the value inside that address.
Why is this happening? Does the compiler automatically dereference the pointer for me? How would I pass in the address of the struct into %esi?
You didn't set up the stack frame in the assembly function, so 8(%ebp) won't give you the correct value. Because ebp still has the value from your C function, you're seeing the value of the first argument passed to that function instead.
You need to set up the stack frame with
push %ebp
mov %esp, %ebp
...
pop %ebp
This is assuming that the calling convention passes the function parameters on the stack - otherwise you'll need to get the parameter value from a register.
I'm writing a function in ASM which is supposed to copy the (constant) value 2 into every index of an array declared in .data. My code compiles, but I don't get any output through my C program. Here's the code:
.globl my_func
.globl _my_func
my_func:
_my_func:
movl %esp,%ebp
pushl %ebp
movl $0,%ecx
leal array,%eax
jmp continue
continue:
_continue:
movl $2,array(%ecx,4)
cmpl $1024,%ecx
jne incr
je finish
incr:
_incr:
addl $4,%ecx
jmp continue
finish:
_finish:
popl %ebp
ret
.data
.align 4
array: .fill 1024
It is called from here:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
extern int* my_func();
int main(int argc, const char * argv[])
{
int i = 0;
int* a = my_func();
for(i = 0; i < 1024/4; i++){
printf("%d\n", a[i]);
}
return 0;
}
As mentioned, the program does compile and run, but the main function does not output anything to the terminal. And yes, I know the code isn't optimal -- I'm currently following an introductory course in computer architecture and ASM, and I'm just checking out instructions and data.
I am assembling the code for IA32 on an Intel Mac with OSX10.9, using LLVM5.1
Thanks in advance.
The function prologue where you save the previous frame pointer and set it up for the new stack frame should be:
pushl %ebp
movl %esp,%ebp
Yours is in the opposite order, so when your function returns the caller's frame pointer will be incorrect.
Return values are normally passed in eax, so in finish, before returning, you need to set eax to the start address of the memory you want to return.
FYI: you shouldn't need to declare each label twice; the leading underscore is only needed for public functions you want to access from C.
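As a cross-check for the assembly, the intended behaviour can be written in C and its output compared element by element. This is a sketch of how I read the question, not the asker's code: my_func_reference, ARRAY_BYTES and ARRAY_INTS are hypothetical names, and the array size assumes 4-byte ints as on IA-32.

```c
#include <stddef.h>

#define ARRAY_BYTES 1024                      /* matches `.fill 1024` */
#define ARRAY_INTS  (ARRAY_BYTES / sizeof(int))

static int array[ARRAY_INTS];                 /* plays the role of `array` in .data */

/* Hypothetical C reference for my_func: store 2 in every element and
   return the start address, which is what must end up in %eax. */
int *my_func_reference(void)
{
    for (size_t i = 0; i < ARRAY_INTS; i++)
        array[i] = 2;
    return array;
}
```

The return statement is the part that corresponds to the second point above: main dereferences whatever comes back in eax, so the assembly version has to leave the address of array there.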