Assembly and callstack - c

I'm trying to get an understanding of assembly, but unfortunately I have trouble understanding how the following C code translates to assembly:
void test_function(int a, int b, int c, int d) {
    int flag;
    char buffer[10];
    flag = 31337;
    buffer[0] = 'A';
}
int main() {
    test_function(1, 2, 3, 4);
}
The assembly of main() looks like this:
push ebp
mov ebp, esp
sub esp,0x18
and esp,0xfffffff0
mov eax,0x0
sub esp,eax
mov DWORD PTR [esp+12], 0x4
mov DWORD PTR [esp+8], 0x3
mov DWORD PTR [esp+4], 0x2
mov DWORD PTR [esp], 0x1
call <test_function>
The assembly for test_function(...) looks like this:
push ebp
mov ebp, esp
sub esp,0x28
mov DWORD PTR [ebp-12], 0x7a69 ;this is 31337 in hexadecimal
mov BYTE PTR [ebp-40], 0x41 ;this is the 'A' in ASCII
leave
ret
What is hard for me to understand is:
and esp,0xfffffff0
mov eax,0x0
sub esp,eax
Why do we AND esp with 0xfffffff0?
And why do we move 0 into eax and then subtract the content of eax from esp?
Second:
With sub esp,0x28 we are allocating 40 bytes on the stack. Why 40? The integer and the 10 chars of the array together are only 14 bytes, aren't they?
And why do we move 0x7a69 to [ebp-12] and not to [ebp]? With mov ebp, esp I set ebp to the current esp, which points to the end of the stack. The last value I pushed onto the stack was the old ebp (push ebp), so ebp (= esp) points behind the saved ebp. So why couldn't I move 0x7a69 to [ebp], directly behind the saved ebp?
And why is the 'A' moved to [ebp-40]?

This looks like standard compiler-generated assembly.
and esp,0xfffffff0
mov eax,0x0
sub esp,eax
The and makes esp a multiple of 16, i.e. aligns it to a 16-byte boundary. Because the stack grows downward, clearing the low four bits can only lower esp, so it is essentially a subtraction, not an addition.
The next mov and sub reserve space for the local variables. In main there are no local variables, so their total is 0x0. test_function does have local variables, so there the compiler subtracts 0x28 from esp directly. That is more than the 14 bytes the variables need, probably because the compiler has also aligned this to some multiple. Lastly, [ebp-40] is the location in the reserved stack space that the compiler has assigned to buffer.

Related

Assembly retrieving buffer to c function parameter

I'm writing an assembly function that will read from IDE through ports.
I'm calling the parameters through x86 base pointer (EBP).
I debugged my kernel.bin (with gdb and qemu) and saw that when I pass my recv buffer to be printed, eax returns values like: 36h01h10h
IBM Char Table
My disk.asm is divided into read and write. Is it possible that I'm writing it wrong? Is it legal to move [ebp+16] directly to esi (to write)? And is it wrong to move [ebp+16] directly to edi in the read function? I'm using a register pointing to that address and making edi point to that register:
In my disk.asm, to read the disk I have this:
sub dx, 7 ;dx = 0x1f0
mov ecx, 256
mov edi, bufferrecv
rep insw
(...)
push ebx
mov ebx, [ebp+16]
mov [ebx], long word bufferrecv
pop ebx
mov esp, ebp
pop ebp
ret
And to write disk:
sub dx, 7 ;dx = 0x1f0
mov ecx, 256
mov esi, [ebp+16]
rep outsw
(...)
I'm declaring those functions this way:
Kernel.c
extern int _readd(int sector_count, int nmrsector, STRING in_msg);
extern int writed(int sector_count, int nmrsector, STRING out_msg);
The STRING type was declared inside my types.h as char*

Mixing C and Assembly

I'm writing a program in assembly to read a disk through ports (0x1f0-0x1f7) and I'm mixing it with C. I have a function in assembly that I call from my C main function. The function takes 1 parameter, the sectors to read:
Kernel.c
extern int _readd(int nmrsector);
(...)
int sector = 257;
int error = _readd(sector);
if(error == 0) PrintString("Error"); //It is declared on my screen.h file
disk.asm
global _readd
_readd:
push eax
push ebx
push ecx
push edx
push ebp
mov ebp, esp
mov eax, [ebp+8]
mov ecx, eax
cmp ecx, 256
jg short _fail
jne short _good
_fail:
xor eax, eax
leave
ret
_good:
xor eax, eax
mov eax, 12
leave
ret
It crashes when I run it in VirtualBox. Any ideas?
If you save CPU registers when you enter a function, you need to restore them when you are finished. Your PUSHes need to be matched with POPs.
Also, if you use a stack frame to access local variables and parameters, set up the frame (push ebp ; mov ebp, esp) before everything else, so you can more easily refer to them. Here [ebp+8] doesn't refer to a parameter, because you alter the stack before setting up the frame.

Initialize char[] fails, esi contains wrong value

I want to initialize a char array, but when I do this my program crashes. Here's my code:
void kernelEnteredMsg() {
char str[] = "Kernel successfully entered!";
}
Here's the disassembly:
push ebp
mov ebp,esp
push edi
push esi
push ebx
sub esp,byte +0x30
lea edx,[ebp-0x2d]
mov ebx,0x402000 ; load an address outside my data segment
mov eax,0x1d
mov edi,edx
mov esi,ebx ; move this address to esi
mov ecx,eax
rep movsb ; here the program crashes
add esp,byte +0x30
pop ebx
pop esi
pop edi
pop ebp
ret
I don't understand why it loads esi with 0x402000. But this seems to cause the error. Can somebody explain what happens here and how to fix it?
PS: "Kernel successful entered!" is at 0x1000 in binary file.
C code:
void kernelEnteredMsg();
void entryPoint() {
kernelEnteredMsg();
}
void kernelEnteredMsg() {
char str[] = "Kernel successfully entered!";
int size = 28;
}
Calling assembly code:
extern _entryPoint
global _main
section .text
_main: ; start of kernel
nop
; setup ds, es, ss and gs
mov ax, 16
mov ds, ax
mov es, ax
mov ss, ax
mov sp, 0x4000
mov ax, 24
mov gs, ax
mov [gs:0], dword 0x07690748 ; test graphics
call _entryPoint ; enter kernel C code
jmp $
This code copies the string from its section in the image onto the local stack, because the char array is not 'const'. If you do not need to modify the string, a simple solution may be to make it const char.
I don't understand why it loads esi with 0x402000.
ESI is the source of the string copy instruction 'rep movsb', EDI is the destination.
The address is constructed as IMAGE_BASE + SECTION_VIRTUAL_ADDRESS (IIRC) in the PE file (assuming it is PE).
Remember that the file has both a FILE_ALIGN and a SECTION_VIRTUAL_ADDRESS, so a section may be
at offset 0x1000 in the file (FILE_ALIGN) and at 0x2000 in memory (VIRTUAL_ADDRESS), resulting in IMAGE_BASE + VIRTUAL_ADDRESS = 0x402000.
You can use a PE explorer like CFF Explorer (http://www.ntcore.com/exsuite.php)
to display this (if it's a .bin file this may not be applicable, but it has to have some kind of format).
Another possibility is a wrong state of the DF flag, which would make the string copy instruction run in the wrong direction (this should not happen, because the compiler should take care of it).
Try inserting
__asm__ ("cld");
before the char str[] or in the _main procedure to set the string direction to 'UP'.

How an assembly language works?

I am learning assembly and I'm having a lot of trouble understanding this assembly code. Can someone clarify it?
Dump of assembler code for function main:
0x080483ed <+0>: push ebp
0x080483ee <+1>: mov ebp,esp
0x080483f0 <+3>: sub esp,0x10
0x080483f3 <+6>: mov DWORD PTR [ebp-0x8],0x0
0x080483fa <+13>: mov eax,DWORD PTR [ebp-0x8]
0x080483fd <+16>: add eax,0x1
0x08048400 <+19>: mov DWORD PTR [ebp-0x4],eax
0x08048403 <+22>: leave
0x08048404 <+23>: ret
So far, my understanding is the following:
Push something (don't know what) in ebp register. then move content of esp register into ebp (I think the data of ebp should be overwritten), then subtract 10 from the esp and store it in the esp (The function will take 10 byte, This reg is never used again, so no point of doing this operation). Now assign value 0 to the address pointed by 8 bytes less than ebp.
Now store that address into register eax. Now add 1 to the value pointed by eax (the previous value is lost). Now store the eax value on [ebp-0x4], then leave to the return address of main.
Here is my C code for the above program:
int main(){
int x=0;
int y = x+1;
}
Now, can someone tell me if I am wrong about anything? I also don't understand the mov at <+13>: it seems to add 1 to the address ebp-0x8, but that is the address of int x, so x would no longer contain 0. Where am I wrong?
First of all, push ebp and then mov ebp, esp are two instructions that are common at the beginning of a procedure. The ESP register points to the top of the stack, so it changes constantly as the stack grows or shrinks. EBP is a helper register here. First we push the content of ebp onto the stack, then we copy ESP (the current stack top address) to ebp. That is why, when we refer to other items on the stack, we use the constant value of ebp (and not the changing one of esp).
sub esp, 0x10 ; means we reserve 16 bytes on the stack (0x10 is 16 in hex)
now for the real fun:
mov DWORD PTR [ebp-0x8],0x0 ; remember ebp points to where the stack
; top was BEFORE reserving 16 bytes.
; DWORD PTR means "double-word pointer", i.e. a 32-bit operand.
; so the whole instruction means
; "move 0 to the 32 bits of the stack that
; start at the address ebp-8".
; this is our int x = 0
mov eax,DWORD PTR [ebp-0x8] ; copy x into the EAX register.
add eax,0x1 ; add 1 to the eax register
mov DWORD PTR [ebp-0x4],eax ; store the result (which is in eax) at the stack address
; [ebp-4]
leave ; clean up the stack (undo the "push ebp" / "mov ebp, esp" from above).
ret ; let's say this instruction returns to the caller (it's slightly more
; complicated than that)
Hope this helps! :)
0x080483ed <+0>: push ebp
0x080483ee <+1>: mov ebp,esp
Setting up the stack frame. Save the old base pointer and set the top of the stack as the new base pointer. This allows local variables and arguments within this function to be referenced relative to ebp (base pointer). The advantage of this is that its value is stable unlike esp which is affected by pushes and pops.
0x080483f0 <+3>: sub esp,0x10
On the x86 platform the stack 'grows' downwards. Generally speaking this means esp has a lower value (address in memory) than ebp. When ebp == esp, no memory has been reserved on the stack for local variables. This does not mean the stack is 'empty': a common access is [ebp+0x8], for instance. In that case the code is looking for something that was pushed onto the stack prior to the call (this could be arguments in the stdcall convention).
Here the stack is extended by 16 bytes. That is more space than the two variables need; the extra is reserved for alignment purposes.
0x080483f3 <+6>: mov DWORD PTR [ebp-0x8],0x0
The 4 bytes at [ebp-0x8] are initialised to the value 0. This is your x local variable.
0x080483fa <+13>: mov eax,DWORD PTR [ebp-0x8]
The 4 bytes at [ebp-0x8] are moved to a register. Arithmetic opcodes can not operate with two memory operands. Data needs to be moved to a register first before the arithmetic is performed. eax now holds the value of your x variable.
0x080483fd <+16>: add eax,0x1
The value of eax is increased so it now holds the value x + 1.
0x08048400 <+19>: mov DWORD PTR [ebp-0x4],eax
Stores the calculated value back on the stack. Notice the local variable is now [ebp-0x4] - this is your y variable.
0x08048403 <+22>: leave
Destroys the stack frame. It is equivalent to mov esp, ebp followed by pop ebp: it releases the space for the local variables and restores the old stack base pointer.
0x08048404 <+23>: ret
Pops the top of the stack, treating the value as a return address, and sets the instruction pointer (eip) to this value. The return address typically holds the address of the instruction directly after the call instruction that brought execution into this function.

cl.exe produces weird assembly code

I compiled this C code:
void foo() {
int i = 0;
i = 0;
i = 0;
}
and I got this:
push ebp
mov ebp,esp
push ecx
mov dword ptr ss:[ebp-4],0
mov dword ptr ss:[ebp-4],0
mov dword ptr ss:[ebp-4],0
mov esp,ebp
pop ebp
retn
My question is: why is there a push ecx, and how come there is no sub esp,4 or something similar to make space on the stack? No compiler options were used.
Either way will make 4 bytes of space available on the stack, and the push saves a couple of bytes over the sub. Maybe the compiler writer decided to optimize this case by pushing a register.

Resources