I'm new to assembly programming and, as a part of a bigger program I have need to pass floating point values to another C-function. I have a call from my test program to my assembly function, that only pushes the parameters on the right stack, and calls a second C function.
My C test function:
extern void ext_func(char *result, double d); // C function
extern double tester(char *str, float d);
double a = tester(str, 3.14)
printf("%s\n", str); // Resulting in '0.000000'
// doing some fancy stuff with the float value and puts in result
ext_func(str, 3.14); // gives str = "3.140000"
x86, gcc -m32:
.globl tester
tester:
pushl %ebp # Standard
movl %esp, %ebp #
flds 12(%ebp) # Push second parameter on stack
pushl 8(%ebp)
call ext_func
addl $4, %esp
leave
ret
I think theres a problem with me only pushing 32 bit when ext_funct expecting double. But I tried to the fldl, fld1, fildl, fldl 12 and 16(%ebp), and some of the other for "fun".
My first question is, are ext_func missing some data on the float stack(ST), and is therefore not able to make the float value?(I understand you dont have the callee function, but doesnt matter what the function does?)
Second, does the compiler allways go to to the f-stack to get float values if it expects them, or is it possible to read them from the memorystack?
Third, is there seomething else I'm missing here? If I
printf("%f", a); //3.140000
printf("%f", str); //3.140000
but the other way a gives big negativ number(100 digits or so) ended by 000000.
The 32 bit convention uses the cpu stack to pass floating point arguments. It only uses the fpu stack for returning them. Yes, you should convert your 32 bit float to a 64 bit double, as per the prototypes you provided.
Note that ext_func is void, that is it doesn't return anything, but you declared tester as returning double ... it's unclear what you want returned, I will assume you want the original d back (for whatever reason).
As such, a possible implementation could be:
.globl tester
tester:
subl $12, %esp # allocate space for outgoing arguments
movl 16(%esp), %eax # fetch our first argument (str)
movl %eax, (%esp) # store as first outgoing argument
flds 20(%esp) # Fetch our second argument as float
fstpl 4(%esp) # store it as second outgoing argument as double
call ext_func
flds 20(%esp) # load d as return value
addl $12, %esp # cleanup stack
ret
Related
I am writing a C program that calls an x86 Assembly function which adds two numbers. Below are the contents of my C program (CallAssemblyFromC.c):
#include <stdio.h>
#include <stdlib.h>
int addition(int a, int b);
int main(void) {
int sum = addition(3, 4);
printf("%d", sum);
return EXIT_SUCCESS;
}
Below is the code of the Assembly function (my idea is to code from scratch the stack frame prologue and epilogue, I have added comments to explain the logic of my code) (addition.s):
.text
# Here, we define a function addition
.global addition
addition:
# Prologue:
# Push the current EBP (base pointer) to the stack, so that we
# can reset the EBP to its original state after the function's
# execution
push %ebp
# Move the EBP (base pointer) to the current position of the ESP
# register
movl %esp, %ebp
# Read in the parameters of the addition function
# addition(a, b)
#
# Since we are pushing to the stack, we need to obtain the parameters
# in reverse order:
# EBP (return address) | EBP + 4 (return value) | EBP + 8 (b) | EBP + 4 (a)
#
# Utilize advanced indexing in order to obtain the parameters, and
# store them in the CPU's registers
movzbl 8(%ebp), %ebx
movzbl 12(%ebp), %ecx
# Clear the EAX register to store the sum
xorl %eax, %eax
# Add the values into the section of memory storing the return value
addl %ebx, %eax
addl %ecx, %eax
I am getting a segmentation fault error, which seems strange considering that I think I am allocating memory in accordance with the x86 calling conventions (e.x. allocating the correct memory sections to the function's parameters). Furthermore, if any of you have a solution, it would be greatly appreciated if you could provide some advice as to how to debug an Assembly program embedded with C (I have been using the GDB debugger but it simply points to the line of the C program where the segmentation fault happens instead of the line in the Assembly program).
Your function has no epilogue. You need to restore %ebp and pop the stack back to where it was, and then ret. If that's really missing from your code, then that explains your segfault: the CPU will go on executing whatever garbage happens to be after the end of your code in memory.
You clobber (i.e. overwrite) the %ebx register which is supposed to be callee-saved. (You mention following the x86 calling conventions, but you seem to have missed that detail.) That would be the cause of your next segfault, after you fixed the first one. If you use %ebx, you need to save and restore it, e.g. with push %ebx after your prologue and pop %ebx before your epilogue. But in this case it is better to rewrite your code so as not to use it at all; see below.
movzbl loads an 8-bit value from memory and zero-extends it into a 32-bit register. Here the parameters are int so they are already 32 bits, so plain movl is correct. As it stands your function would give incorrect results for any arguments which are negative or larger than 255.
You're using an unnecessary number of registers. You could move the first operand for the addition directly into %eax rather than putting it into %ebx and adding it to zero. And on x86 it is not necessary to get both operands into registers before adding; arithmetic instructions have a mem, reg form where one operand can be loaded directly from memory. With this approach we don't need any registers other than %eax itself, and in particular we don't have to worry about %ebx anymore.
I would write:
.text
# Here, we define a function addition
.global addition
addition:
# Prologue:
push %ebp
movl %esp, %ebp
# load first argument
movl 8(%ebp), %eax
# add second argument
addl 12(%ebp), %eax
# epilogue
movl %ebp, %esp # redundant since we haven't touched esp, but will be needed in more complex functions
pop %ebp
ret
In fact, you don't need a stack frame for this function at all, though I understand if you want to include it for educational value. But if you omit it, the function can be reduced to
.text
.global addition
addition:
movl 4(%esp), %eax
addl 8(%esp), %eax
ret
You are corrupting the stacke here:
movb %al, 4(%ebp)
To return the value, simply put it in eax. Also why do you need to clear eax? that's inefficient as you can load the first value directly into eax and then add to it.
Also EBX must be saved if you intend to use it, but you don't really need it anyway.
I'm trying to learn some assembly.
My goal is to create an external assembly function that is able to read an array of char, cast to int and then execute various operation, just to learn something.
I've done many proofs but i think i'm missing the point
code:
#include <stdio.h>
#define SIZE 5
extern int foo(char array[]);
int main(void){
char array[SIZE]={'0','1','1','0','1'};
printf("GAS said: %c\n", foo(array));
return 0;
}
assembly:
.data
.text
.global foo
foo:
pushl %ebp
movl %esp, %ebp
movl 8(%esp), %eax #saving in eax the pointer of the array
movl (%eax), %eax #saving in eax the first char of the array
popl %ebp
ret
The strange thing for me is here:
when i use, like in this case
printf("GAS said: %c\n", foo(array));
The output is, as expected, GAS said: 0
Based on this, i was expecting also that changing with:
printf("GAS said: %i\n", foo(array));
will output GAS said: 48 but instead i get in return some random address.
Also, in the assembly file, i can't explain why if i try to
cmpl $48, %eax
je LABEL
the jump will never happen.
The only thing i can think of is that there is a problem with the size, since int takes 4B and char only 1B but i'm not so sure.
So, how can i use compare and return an int to main in this case?
I'm working on a small compiler project, and I can't seem to figure out how to push the address of a stack location instead of the value at that stack location. My goal is to push a stack location address, that holds an integer value, as a void pointer to a C function that prints it. My ultimate goal is to do some pointer-integer arithmetic in the function. I am successful in calling the C function from a runtime library extension, but the issue is just figuring out how to push the address in assembly.
My C function.
void print(void* ptr){
int* int_ptr = (int*)ptr;
printf("*int_ptr is %d\n",*int_ptr);
}
My Assembly
.globl main
main:
pushl %ebp
movl %esp, %ebp
subl $4, %esp
movl $42, %eax
movl %eax, -4(%ebp)
pushl -4(%ebp)
//Instead of the value at -4(%ebp), I would like the address of -4(%ebp)
call print
addl $8, %esp
leave
ret
As for what I have now, it'll crash since I'm casting the value 42 to an address. Can someone please direct me to some reference or resources to learn more?
In general you can get the address of a stack based value by using the LEA instruction to get the effective address of -4(%ebp) and place it in a register. You can then push that register to the stack. The LEA instruction is described in the instruction set reference this way:
Computes the effective address of the second operand (the source operand) and stores it in the first operand (destination operand). The source operand is a memory address (offset part) specified with one of the processors addressing modes; the destination operand is a general-purpose register.
In your code something like this would have worked:
lea -4(%ebp), %eax
push %eax
This should effectively pass the address of -4(%ebp) on the stack for use by your function print.
It seems state-of-art compilers treat arguments passed by stack as read-only. Note that in the x86 calling convention, the caller pushes arguments onto the stack and the callee uses the arguments in the stack. For example, the following C code:
extern int goo(int *x);
int foo(int x, int y) {
goo(&x);
return x;
}
is compiled by clang -O3 -c g.c -S -m32 in OS X 10.10 into:
.section __TEXT,__text,regular,pure_instructions
.macosx_version_min 10, 10
.globl _foo
.align 4, 0x90
_foo: ## #foo
## BB#0:
pushl %ebp
movl %esp, %ebp
subl $8, %esp
movl 8(%ebp), %eax
movl %eax, -4(%ebp)
leal -4(%ebp), %eax
movl %eax, (%esp)
calll _goo
movl -4(%ebp), %eax
addl $8, %esp
popl %ebp
retl
.subsections_via_symbols
Here, the parameter x(8(%ebp)) is first loaded into %eax; and then stored in -4(%ebp); and the address -4(%ebp) is stored in %eax; and %eax is passed to the function goo.
I wonder why Clang generates code that copy the value stored in 8(%ebp) to -4(%ebp), rather than just passing the address 8(%ebp) to the function goo. It would save memory operations and result in a better performance. I observed a similar behaviour in GCC too (under OS X). To be more specific, I wonder why compilers do not generate:
.section __TEXT,__text,regular,pure_instructions
.macosx_version_min 10, 10
.globl _foo
.align 4, 0x90
_foo: ## #foo
## BB#0:
pushl %ebp
movl %esp, %ebp
subl $8, %esp
leal 8(%ebp), %eax
movl %eax, (%esp)
calll _goo
movl 8(%ebp), %eax
addl $8, %esp
popl %ebp
retl
.subsections_via_symbols
I searched for documents if the x86 calling convention demands the passed arguments to be read-only, but I couldn't find anything on the issue. Does anybody have any thought on this issue?
The rules for C are that parameters must be passed by value. A compiler converts from one language (with one set of rules) to a different language (potentially with a completely different set of rules). The only limitation is that the behaviour remains the same. The rules of the C language do not apply to the target language (e.g. assembly).
What this means is that if a compiler feels like generating assembly language where parameters are passed by reference and are not passed by value; then this is perfectly legal (as long as the behaviour remains the same).
The real limitation has nothing to do with C at all. The real limitation is linking. So that different object files can be linked together, standards are needed to ensure that whatever the caller in one object file expects matches whatever the callee in another object file provides. This is what's known as the ABI. In some cases (e.g. 64-bit 80x86) there are multiple different ABIs for the exact same architecture.
You can even invent your own ABI that's radically different (and implement your own tools that support your own radically different ABI) and that's perfectly legal as far as the C standards go; even if your ABI requires "pass by reference" for everything (as long as the behaviour remains the same).
Actually, I just compiled this function using GCC:
int foo(int x)
{
goo(&x);
return x;
}
And it generated this code:
_foo:
pushl %ebp
movl %esp, %ebp
subl $24, %esp
leal 8(%ebp), %eax
movl %eax, (%esp)
call _goo
movl 8(%ebp), %eax
leave
ret
This is using GCC 4.9.2 (on 32-bit cygwin if it matters), no optimizations. So in fact, GCC did exactly what you thought it should do and used the argument directly from where the caller pushed it on the stack.
The C programming language mandates that arguments are passed by value. So any modification of an argument (like an x++; as the first statement of your foo) is local to the function and does not propagate to the caller.
Hence, a general calling convention should require copying of arguments at every call site. Calling conventions should be general enough for unknown calls, e.g. thru a function pointer!
Of course, if you pass an address to some memory zone, the called function is free to dereference that pointer, e.g. as in
int goo(int *x) {
static int count;
*x = count++;
return count % 3;
}
BTW, you might use link-time optimizations (by compiling and linking with clang -flto -O2 or gcc -flto -O2) to perhaps enable the compiler to improve or inline some calls between translation units.
Notice that both Clang/LLVM and GCC are free software compilers. Feel free to propose an improvement patch to them if you want to (but since both are very complex pieces of software, you'll need to work some months to make that patch).
NB. When looking into produced assembly code, pass -fverbose-asm to your compiler!
why is the output 44 instead of 12, &x,eip,ebp,&j should be the stack layout for the code
given below i guess, if so then it must have 12 as output, but i am getting it 44?
so help me understand the concept,altough the base pointer is relocated for every instant of execution, the relative must remain unchanged and should not it be 12?
int main() {
int x = 9;
// printf("%p is the address of x\n",&x);
fun(&x );
printf("%x is the address of x\n", (&x));
x = 3;
printf("%d x is \n",x);
return 0;
}
int fun(unsigned int *ptr) {
int j;
printf("the difference is %u\n",((unsigned int)ptr -(unsigned int) &j));
printf("the address of j is %x\n",&j);
return 0;
}
You're assuming that the compiler has packed everything (instead of putting things on specific alignment boundaries), and you're also assuming that the compiler hasn't inlined the function, or made any other optimisations or transformations.
In summary, you cannot make any assumptions about this sort of thing, or rely on any particular behaviour.
No, it should not be 12. It should not be anything. The ISO standard has very little to say about how things are arranged on the stack. An implementation has a great deal of leeway in moving things around and inserting padding for efficiency.
If you pass it through your compiler with an option to generate he assembler code (such as with gcc -S), it will become evident as to what's happening here.
fun:
pushl %ebp
movl %esp, %ebp
subl $40, %esp ;; this is quite large (see below).
movl 8(%ebp), %edx
leal -12(%ebp), %eax
movl %edx, %ecx
subl %eax, %ecx
movl %ecx, %eax
movl %eax, 4(%esp)
movl $.LC2, (%esp)
call printf
leal -12(%ebp), %eax
movl %eax, 4(%esp)
movl $.LC3, (%esp)
call printf
movl $0, %eax
leave
ret
It appears that gcc is pre-generating the next stack frame (the one going down to printf) as part of the prolog for the fun function. That's in this case. Your compiler may be doing something totally different. But the bottom line is: the implementation can do what it likes as long as it doesn't violate the standard.
That's the code from optimisation level 0, by the way, and it gives a difference of 48. When I use gcc's insane optimisation level 3, I get a difference of 4. Again, perfectly acceptable, gcc usually does some pretty impressive optimisations at that level.