Push address of stack location instead of the value at the address - c

I'm working on a small compiler project, and I can't seem to figure out how to push the address of a stack location instead of the value at that stack location. My goal is to push a stack location address, that holds an integer value, as a void pointer to a C function that prints it. My ultimate goal is to do some pointer-integer arithmetic in the function. I am successful in calling the C function from a runtime library extension, but the issue is just figuring out how to push the address in assembly.
My C function.
void print(void* ptr){
int* int_ptr = (int*)ptr;
printf("*int_ptr is %d\n",*int_ptr);
}
My Assembly
.globl main
main:
pushl %ebp
movl %esp, %ebp
subl $4, %esp
movl $42, %eax
movl %eax, -4(%ebp)
pushl -4(%ebp)
//Instead of the value at -4(%ebp), I would like the address of -4(%ebp)
call print
addl $8, %esp
leave
ret
As for what I have now, it'll crash since I'm casting the value 42 to an address. Can someone please direct me to some reference or resources to learn more?

In general you can get the address of a stack based value by using the LEA instruction to get the effective address of -4(%ebp) and place it in a register. You can then push that register to the stack. The LEA instruction is described in the instruction set reference this way:
Computes the effective address of the second operand (the source operand) and stores it in the first operand (destination operand). The source operand is a memory address (offset part) specified with one of the processors addressing modes; the destination operand is a general-purpose register.
In your code something like this would have worked:
lea -4(%ebp), %eax
push %eax
This should effectively pass the address of -4(%ebp) on the stack for use by your function print.

Related

Segmentation fault when calling x86 Assembly function from C program

I am writing a C program that calls an x86 Assembly function which adds two numbers. Below are the contents of my C program (CallAssemblyFromC.c):
#include <stdio.h>
#include <stdlib.h>
int addition(int a, int b);
int main(void) {
int sum = addition(3, 4);
printf("%d", sum);
return EXIT_SUCCESS;
}
Below is the code of the Assembly function (my idea is to code from scratch the stack frame prologue and epilogue, I have added comments to explain the logic of my code) (addition.s):
.text
# Here, we define a function addition
.global addition
addition:
# Prologue:
# Push the current EBP (base pointer) to the stack, so that we
# can reset the EBP to its original state after the function's
# execution
push %ebp
# Move the EBP (base pointer) to the current position of the ESP
# register
movl %esp, %ebp
# Read in the parameters of the addition function
# addition(a, b)
#
# Since we are pushing to the stack, we need to obtain the parameters
# in reverse order:
# EBP (return address) | EBP + 4 (return value) | EBP + 8 (b) | EBP + 4 (a)
#
# Utilize advanced indexing in order to obtain the parameters, and
# store them in the CPU's registers
movzbl 8(%ebp), %ebx
movzbl 12(%ebp), %ecx
# Clear the EAX register to store the sum
xorl %eax, %eax
# Add the values into the section of memory storing the return value
addl %ebx, %eax
addl %ecx, %eax
I am getting a segmentation fault error, which seems strange considering that I think I am allocating memory in accordance with the x86 calling conventions (e.x. allocating the correct memory sections to the function's parameters). Furthermore, if any of you have a solution, it would be greatly appreciated if you could provide some advice as to how to debug an Assembly program embedded with C (I have been using the GDB debugger but it simply points to the line of the C program where the segmentation fault happens instead of the line in the Assembly program).
Your function has no epilogue. You need to restore %ebp and pop the stack back to where it was, and then ret. If that's really missing from your code, then that explains your segfault: the CPU will go on executing whatever garbage happens to be after the end of your code in memory.
You clobber (i.e. overwrite) the %ebx register which is supposed to be callee-saved. (You mention following the x86 calling conventions, but you seem to have missed that detail.) That would be the cause of your next segfault, after you fixed the first one. If you use %ebx, you need to save and restore it, e.g. with push %ebx after your prologue and pop %ebx before your epilogue. But in this case it is better to rewrite your code so as not to use it at all; see below.
movzbl loads an 8-bit value from memory and zero-extends it into a 32-bit register. Here the parameters are int so they are already 32 bits, so plain movl is correct. As it stands your function would give incorrect results for any arguments which are negative or larger than 255.
You're using an unnecessary number of registers. You could move the first operand for the addition directly into %eax rather than putting it into %ebx and adding it to zero. And on x86 it is not necessary to get both operands into registers before adding; arithmetic instructions have a mem, reg form where one operand can be loaded directly from memory. With this approach we don't need any registers other than %eax itself, and in particular we don't have to worry about %ebx anymore.
I would write:
.text
# Here, we define a function addition
.global addition
addition:
# Prologue:
push %ebp
movl %esp, %ebp
# load first argument
movl 8(%ebp), %eax
# add second argument
addl 12(%ebp), %eax
# epilogue
movl %ebp, %esp # redundant since we haven't touched esp, but will be needed in more complex functions
pop %ebp
ret
In fact, you don't need a stack frame for this function at all, though I understand if you want to include it for educational value. But if you omit it, the function can be reduced to
.text
.global addition
addition:
movl 4(%esp), %eax
addl 8(%esp), %eax
ret
You are corrupting the stacke here:
movb %al, 4(%ebp)
To return the value, simply put it in eax. Also why do you need to clear eax? that's inefficient as you can load the first value directly into eax and then add to it.
Also EBX must be saved if you intend to use it, but you don't really need it anyway.

pointer to function points the jmp instruction in c

I viewed the disassembly of my c code, and found out that pointer to function actually point the jmp instruction, and doesn't point the real start of the function in memory (doesn't point push ebp instruction, that represents start of function's frame).
I have the followed function (that does basically nothing, it's just an example):
int func2(int a, int b)
{
return 1;
}
I tried to print the address of the function- printf("%p", &func2);
I looked at the disassembly of my code, and found out that the address that is printed is the address of the jmp instuction in assembly code. I would like to get the address that represents the start of function's frame. Is there any way to calculate it from the given address of the jmp instruction?
Moreover, I have the bytes that represents the jmp instruction.
011A11EF E9 CC 08 00 00 jmp func2 (011A1AC0h)
How can I get the address that represents the start of function's frame in memory (011A1AC0h in that case), only from the address of the jmp instruction and from the bytes that represents the jmp instruction itself? I read some information about that, and I found out that it is relative jmp, which means that I need to add the value that jmp holds to the address of the jmp instruction itself. Not sure if that's a good direction for the solution, and if it is, how can I get the value that jmp holds?
E916 is the Intel 64 and IA-32 opcode for a jmp instruction with a rel32 offset. The next four bytes contain the offset. Your disassembler shows them as “CC 08 00 00”, but this is reversed; the offset is 000008CC16, which is 225210. The offset is a signed 32-bit value that is added to the EIP register to obtain the address of the jump target. The EIP contains the address of the next instruction to be executed.
So, in this specific case, take the address of the byte just beyond the jump instruction and add the 32-bit offset.
However:
I count 11 forms of jmp instruction in Intel 64 and IA-32 manual. Who knows what the compiler may use when you make a slight change to source or compiler switches and recompile? You would need to be prepared to decode any form of the jmp instruction, or perhaps other instructions the compiler might use.
Intel has some legacy segment features in its architecture. The code segment on your system might be one big thing so you do not have to worry about that, but I cannot provide assurance.
Your compiler might have used this jmp instruction as a convenient way to create a value for the pointer rather than using the routine’s entry point (the proper term for the instruction where function execution normally begins, not frame) because it makes the linker do the relocation work instead of requiring the compiler to insert instructions to do that work at run-time (specifically, at the time the function address must be evaluated so it can be assigned to the pointer). This is somewhat of a guess, but the compiler might do something else next time. You are treading significantly outside normal computing.
I'm not sure to get your question, but take this sample:
#include <stdio.h>
int foo(int x)
{
return x+1;
}
int main(int argc, char** argv)
{
printf("foo = %p\n", foo);
return 0;
}
Which produces the following disassembly:
foo(int):
pushq %rbp
movq %rsp, %rbp
movl %edi, -4(%rbp)
movl -4(%rbp), %eax
addl $1, %eax
popq %rbp
ret
.LC0:
.string "foo = %p\n"
main:
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
movl %edi, -4(%rbp)
movq %rsi, -16(%rbp)
movl foo(int), %esi # pass the label argument (2) to printf
movl $.LC0, %edi # pass the format argument (1) to printf
movl $0, %eax
call printf
movl $0, %eax
leave
ret
As you can see, only the label is passed to printf. This label is resolved as an address by the compiler.
Also notice that it will be hard for you to get an absolute address of a running binary: the ASLR (Address Space Layout Randomization will choose a random base address for the binary. The offsets inside the binary still holds, hence relative calls.
On X86 machines E9 is the opcode for JMP rel16/32. So the cpu is going to use the value 0x000008CC as jump offset. The base address is the address of the instruction following the JMP instruction.

Assembly x86 - "leave" Instruction

It's said that the "leave" instruction is similar to:
movl %ebp, %esp
popl %ebp
I understand the movl %ebp, %esp part, and that it acts to release stored up memory (as discussed in this question).
But what is the purpose of the popl %ebp code?
LEAVE is the counterpart to ENTER. The ENTER instruction sets up a stack frame by first pushing EBP onto the stack and then copies ESP into EBP, so LEAVE has to do the opposite, i.e. copy EBP to ESP and then restore the old EBP from the stack.
See the section named PROCEDURE CALLS FOR BLOCK-STRUCTURED LANGUAGES in Intel's Software Developer's Manual Vol 1 if you want to read more about how ENTER and LEAVE work.
enter n,0 is exactly equivalent to (and should be replaced with)
push %ebp
mov %esp, %ebp # ebp = esp, mov ebp,esp in Intel syntax
sub $n, %esp # allocate space on the stack. Omit if n=0
leave is exactly equivalent to
mov %ebp, %esp # esp = ebp, mov esp,ebp in Intel syntax
pop %ebp
enter is very slow and compilers don't use it, but leave is fine. (http://agner.org/optimize). Compilers do use leave if they make a stack frame at all (at least gcc does). But if esp is already equal to ebp, it's most efficient to just pop ebp.
The popl instruction restores the base pointer, and the movl instruction restores the stack pointer. The base pointer is the bottom of the stack, and the stack pointer is the top. Before the leave instruction, the stack looks like this:
----Bottom of Caller's stack----
...
Caller's
Variables
...
Function Parameters
----Top of Caller's Stack/Bottom of Callee's stack---- (%ebp)
...
Callee's
Variables
...
---Bottom of Callee's stack---- (%esp)
After the movl %ebp %esp, which deallocates the callee's stack, the stack looks like this:
----Bottom of Caller's stack----
...
Caller's
Variables
...
Function Parameters
----Top of Caller's Stack/Bottom of Callee's stack---- (%ebp) and (%esp)
After the popl %ebp, which restores the caller's stack, the stack looks like this:
----Bottom of Caller's stack---- (%ebp)
...
Caller's
Variables
...
Function Parameters
----Top of Caller's Stack---- (%esp)
The enter instruction saves the bottom of the caller's stack and sets the base pointer so that the callee can allocate their stack.
Also note, that, while most C compilers allocate the stack this way(at least with optimization turn'd off), if you write an assembly language function, you can just use the same stack frame if you want to, but you have to be sure to pop everything off the stack that you push on it or else you'll jump to a junk address when you return(this is because call <somewhere> means push <ret address>[or push %eip], jmp <somewhere>, and ret means jump to the address on the top of the stack[or pop %eip]. %eip is the register that holds the address of the current instruction).

C Code represented as Assembler Code - How to interpret?

I got this short C Code.
#include <stdint.h>
uint64_t multiply(uint32_t x, uint32_t y) {
uint64_t res;
res = x*y;
return res;
}
int main() {
uint32_t a = 3, b = 5, z;
z = multiply(a,b);
return 0;
}
There is also an Assembler Code for the given C code above.
I don't understand everything of that assembler code. I commented each line and you will find my question in the comments for each line.
The Assembler Code is:
.text
multiply:
pushl %ebp // stores the stack frame of the calling function on the stack
movl %esp, %ebp // takes the current stack pointer and uses it as the frame for the called function
subl $16, %esp // it leaves room on the stack, but why 16Bytes. sizeof(res) = 8Bytes
movl 8(%ebp), %eax // I don't know quite what "8(%ebp) mean? It has to do something with res, because
imull 12(%ebp), %eax // here is the multiplication done. And again "12(%ebp).
movl %eax, -8(%ebp) // Now, we got a negative number in front of. How to interpret this?
movl $0, -4(%ebp) // here as well
movl -8(%ebp), %eax // and here again.
movl -4(%ebp), %edx // also here
leave
ret
main:
pushl %ebp // stores the stack frame of the calling function on the stack
movl %esp, %ebp // // takes the current stack pointer and uses it as the frame for the called function
andl $-8, %esp // what happens here and why?
subl $24, %esp // here, it leaves room for local variables, but why 24 bytes? a, b, c: the size of each of them is 4 Bytes. So 3*4 = 12
movl $3, 20(%esp) // 3 gets pushed on the stack
movl $5, 16(%esp) // 5 also get pushed on the stack
movl 16(%esp), %eax // what does 16(%esp) mean and what happened with z?
movl %eax, 4(%esp) // we got the here as well
movl 20(%esp), %eax // and also here
movl %eax, (%esp) // what does happen in this line?
call multiply // thats clear, the function multiply gets called
movl %eax, 12(%esp) // it looks like the same as two lines before, except it contains the number 12
movl $0, %eax // I suppose, this line is because of "return 0;"
leave
ret
Negative references relative to %ebp are for local variables on the stack.
movl 8(%ebp), %eax // I don't know quite what "8(%ebp) mean? It has to do something with res, because`
%eax = x
imull 12(%ebp), %eax // here is the multiplication done. And again "12(%ebp).
%eax = %eax * y
movl %eax, -8(%ebp) // Now, we got a negative number in front of. How to interpret this?
(u_int32_t)res = %eax // sets low 32 bits of res
movl $0, -4(%ebp) // here as well
clears upper 32 bits of res to extend 32-bit multiplication result to uint64_t
movl -8(%ebp), %eax // and here again.
movl -4(%ebp), %edx // also here
return ret; //64-bit results are returned as a pair of 32-bit registers %edx:%eax
As for the main, see x86 calling convention which may help making sense of what happens.
andl $-8, %esp // what happens here and why?
stack boundary is aligned by 8. I believe it's ABI requirement
subl $24, %esp // here, it leaves room for local variables, but why 24 bytes? a, b, c: the size of each of them is 4 Bytes. So 3*4 = 12
Multiples of 8 (probably due to alignment requirements)
movl $3, 20(%esp) // 3 gets pushed on the stack
a = 3
movl $5, 16(%esp) // 5 also get pushed on the stack
b = 5
movl 16(%esp), %eax // what does 16(%esp) mean and what happened with z?
%eax = b
z is at 12(%esp) and is not used yet.
movl %eax, 4(%esp) // we got the here as well
put b on the stack (second argument to multiply())
movl 20(%esp), %eax // and also here
%eax = a
movl %eax, (%esp) // what does happen in this line?
put a on the stack (first argument to multiply())
call multiply // thats clear, the function multiply gets called
multiply returns 64-bit result in %edx:%eax
movl %eax, 12(%esp) // it looks like the same as two lines before, except it contains the number 12
z = (uint32_t) multiply()
movl $0, %eax // I suppose, this line is because of "return 0;"
yup. return 0;
Arguments are pushed onto the stack when the function is called. Inside the function, the stack pointer at that time is saved as the base pointer. (You got that much already.) The base pointer is used as a fixed location from which to reference arguments (which are above it, hence the positive offsets) and local variables (which are below it, hence the negative offsets).
The advantage of using a base pointer is that it is stable throughout the entire function, even when the stack pointer changes (due to function calls and new scopes).
So 8(%ebp) is one argument, and 12(%ebp) is the other.
The code is likely using more space on the stack than it needs to, because it is using temporary variables that could be optimized out of you had optimization turned on.
You might find this helpful: http://en.wikibooks.org/wiki/X86_Disassembly/Functions_and_Stack_Frames
I started typing this as a comment but it was getting too long to fit.
You can compile your example with -masm=intel so the assembly is more readable. Also, don't confuse the push and pop instructions with mov. push and pop always increments and decrements esp respectively before derefing the address whereas mov does not.
There are two ways to store values onto the stack. You can either push each item onto it one item at a time or you can allocate up-front the space required and then load each value onto the stackslot using mov + relative offset from either esp or ebp.
In your example, gcc chose the second method since that's usually faster because, unlike the first method, you're not constantly incrementing esp before saving the value onto the stack.
To address your other question in comment, x86 instruction set does not have a mov instruction for copying values from memory location a to another memory location b directly. It is not uncommon to see code like:
mov eax, [esp+16]
mov [esp+4], eax
mov eax, [esp+20]
mov [esp], eax
call multiply(unsigned int, unsigned int)
mov [esp+12], eax
Register eax is being used as an intermediate temporary variable to help copy data between the two stack locations. You can mentally translate the above as:
esp[4] = esp[16]; // argument 2
esp[0] = esp[20]; // argument 1
call multiply
esp[12] = eax; // eax has return value
Here's what the stack approximately looks like right before the call to multiply:
lower addr esp => uint32_t:a_copy = 3 <--. arg1 to 'multiply'
esp + 4 uint32_t:b_copy = 5 <--. arg2 to 'multiply'
^ esp + 8 ????
^ esp + 12 uint32_t:z = ? <--.
| esp + 16 uint32_t:b = 5 | local variables in 'main'
| esp + 20 uint32_t:a = 3 <--.
| ...
| ...
higher addr ebp previous frame

Help translating from assembly to C

I have some code from a function
subl $24, %esp
movl 8(%ebp), %eax
cmpl 12(%ebp), %eax
Before the code is just the 'ENTER' command and afterwards there's an if statement to return 1 if ebp > eax or 0 if it's less. I'm assuming cmpl means compare, but I can't tell what the concrete values are. Can anyone tell me what's happening?
Yes cmpl means compare (with 4-byte arguments). Suppose the piece of code is followed by a jg <addr>:
movl 8(%ebp), %eax
cmpl 12(%ebp), %eax
jg <addr>
Then the code is similar to
eax = ebp[8];
if (eax > ebp[12])
goto <addr>;
Your code fragment resembles the entry code used by some processors and compilers. The entry code is assembly code that a compiler issues when entering a function.
Entry code is responsible for saving function parameters and allocating space for local variables and optionally initializing them. The entry code uses pointers to the storage area of the variables. Some processors use a combination of the EBP and ESP registers to point to the location of the local variables (and function parameters).
Since the compiler knows where the variables (and function parameters) are stored, it drops the variable names and uses numerical indexing. For example, the line:
movl 8(%ebp), %eax
would either move the contents of the 8th local variable into the register EAX, or move the value at 8 bytes from the start of the local area (assuming the the EBP register pointers to the start of the local variable area).
The instruction:
subl $24, %esp
implies that the compiler is reserving 24 bytes on the stack. This could be to protect some information in the function calling convention. The function would be able to use the area after this for its own usage. This reserved area may contain function parameters.
The code fragment you supplied looks like it is comparing two local variables inside a function:
void Unknown_Function(long param1, long param2, long param3)
{
unsigned int local_variable_1;
unsigned int local_variable_2;
unsigned int local_variable_3;
if (local_variable_2 < local_variable_3)
{
//...
}
}
Try disassembling the above function and see how close it matches your code fragment.
This is a comparison between (EBP + 8) and (EBP + 12). Based on the comparison result, the cmpl instruction sets flags that are used by following jump instructions.
In Mac OS X 32 bit ABI EBP + 8 is the first function parameter, and EBP + 12 is the second parameter.

Resources