X86 Assembly Problems - c

I'm attempting to get through a book on X86 that was written using examples from Visual C++ and Visual Studio. I'm trying to convert the examples for use with gcc. After a number of problems, I finally wound up with code that would at least compile, but now I'm getting segfaults. Here's the code:
assembly.s:
.intel_syntax noprefix
.section .text
.globl CalcSum
.type CalcSum, @function
// extern "C" int CalcSum_(int a, int b, int c)
CalcSum:
// Initialize a stack frame pointer
pushq rbp
mov ebp,esp
// Load the argument values
mov eax,[ebp+8]
mov ecx,[ebp+12]
mov edx,[ebp+16]
// Calculate the sum
add eax, ecx
add eax, edx
// Restore the caller's stack frame pointer
popq rbp
ret
test.c:
#include <stdio.h>
extern int CalcSum(int a, int b, int c);
int main() {
    int sum = CalcSum(5,6,7);
    printf(" result: %d\n",sum);
    return 0;
}
I'm using gcc -o execute test.c assembly.s to compile. If I change all the 32-bit registers to their 64-bit counterparts (e.g. ebp to rbp), it runs but gives completely random output. Could anyone point out what I'm doing wrong here? Thanks!

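As hinted in the comments, it's a matter of calling convention. 32-bit C functions follow the cdecl calling convention on both Windows and Linux, which passes every argument on the stack; that is the layout the [ebp+8] loads in your code expect. On 64-bit Linux you have to follow the System V AMD64 ABI, which passes the first integer arguments in rdi, rsi, rdx, rcx, r8 and r9. The 64-bit Windows calling convention is different again and uses rcx, rdx, r8 and r9. Calling operating-system functions may involve further platform-specific rules.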
32-bit C (GCC; on a 64-bit system, build with -m32):
.intel_syntax noprefix
.section .text
.globl CalcSum
.type CalcSum, @function
// extern "C" int CalcSum_(int a, int b, int c)
CalcSum: // with underscore in Windows: _CalcSum
// Initialize a stack frame pointer
push ebp
mov ebp,esp
// Load the argument values
mov eax,[ebp+8]
mov ecx,[ebp+12]
mov edx,[ebp+16]
// Calculate the sum
add eax, ecx
add eax, edx
// Restore the caller's stack frame pointer
pop ebp
ret
64-bit Linux (GCC):
.intel_syntax noprefix
.section .text
.globl CalcSum
.type CalcSum, @function
// extern "C" int CalcSum_(int a, int b, int c)
CalcSum:
// Add the three int arguments (edi, esi, edx); the result is returned in eax
mov eax, edi
add eax, esi
add eax, edx
ret
64-bit Windows (MingW-GCC):
.intel_syntax noprefix
.section .text
.globl CalcSum
// .type CalcSum, @function
// extern "C" int CalcSum_(int a, int b, int c)
CalcSum:
// Add the three int arguments (ecx, edx, r8d); the result is returned in eax
mov eax, ecx
add eax, edx
add eax, r8d
ret
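To check which of the three variants applies to a given build, a small C sketch can help. Both function names below (calling_convention_name and calc_sum_ref) are hypothetical and not part of the original post; the predefined macros are the usual GCC/MSVC ones, and the sketch only distinguishes the cases discussed here.

```c
/* Hypothetical helper: names the calling convention that applies to the
   target this file is compiled for. Only the cases from the answer above
   are distinguished. */
const char *calling_convention_name(void) {
#if defined(_WIN64) && (defined(__x86_64__) || defined(_M_X64))
    return "Microsoft x64 (rcx, rdx, r8, r9)";
#elif defined(__x86_64__)
    return "System V AMD64 (rdi, rsi, rdx, rcx, r8, r9)";
#elif defined(__i386__) || defined(_M_IX86)
    return "cdecl (arguments on the stack)";
#else
    return "not an x86 target";
#endif
}

/* Hypothetical plain-C reference for CalcSum, useful for comparing
   against the result of a hand-written assembly version. */
int calc_sum_ref(int a, int b, int c) {
    return a + b + c;
}
```

Compiling only calc_sum_ref with gcc -S -masm=intel -O1 prints the compiler's own version of the routine, which shows the registers the target ABI actually uses.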

Related

Intel-x86, confused about the allocation of bytes in the esp register

I've been a bit stuck on this question. Given the following C code:
#include <stdio.h>
#define BUF_SIZE 13
int foo(){
    int i;
    int B[BUF_SIZE];
    for(i = 0; i < BUF_SIZE; i++)
        B[i] = 5;
    return i;
}
int main(){
    foo();
    return 0;
}
The following Intel-x86 assembly is generated:
1. .file "code.c"
2. .intel_syntax noprefix
3. .text
4. .globl foo
5. .type foo, @function
6. foo:
7. push ebp
8. mov ebp, esp
9. sub esp, 64
10. mov DWORD PTR [ebp-4], 0
11. jmp .L2
12. .L3:
13. mov eax, DWORD PTR [ebp-4]
14. mov DWORD PTR [ebp-56+eax*4], 5
15. add DWORD PTR [ebp-4], 1
16. .L2:
17. cmp DWORD PTR [ebp-4], 12
18. jle .L3
19. mov eax, DWORD PTR [ebp-4]
20. leave
21. ret
22. .size foo, .-foo
23. .globl main
24. .type main, @function
25. main:
26. push ebp
27. mov ebp, esp
28. call foo
29. mov eax, 0
30. pop ebp
31. ret
32. .size main, .-main
33. .ident "GCC: (Debian 6.3.0-18+deb9u1) 6.3.0 20170516"
34. .section .note.GNU-stack,"",@progbits
I'm a bit stuck trying to determine the meaning of line 9 in the assembly. My understanding is that we subtract from the stack register in order to allocate space on the stack for local variables. I know, then, that 52 bytes are being subtracted for the array B, and another 4 bytes for i. But I'm wondering where the other 8 bytes come from? Are those the return values of foo and main? Any help would be appreciated.
The number of bytes subtracted from esp is rounded up to keep the stack aligned. The extra 8 bytes are padding: the return value travels in eax and the return address is pushed by call, so neither is part of this allocation. Imagine that only 57 bytes were subtracted. Any function called from here would then have to realign the stack pointer before it could store a 4-byte integer. Everyone is spared that hassle if every function keeps the stack aligned.
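The arithmetic can be sketched in C. The 16-byte granularity is an assumption that happens to match the 64 seen in line 9; GCC's actual frame layout depends on the compiler version and flags. The helper names round_up and foo_locals_size are hypothetical.

```c
#include <stddef.h>

#define BUF_SIZE 13

/* Round size up to the next multiple of align.
   align must be a power of two. */
size_t round_up(size_t size, size_t align) {
    return (size + align - 1) & ~(align - 1);
}

/* Bytes foo() needs for its own locals: int B[BUF_SIZE] plus int i.
   With a 4-byte int this is 52 + 4 = 56. */
size_t foo_locals_size(void) {
    return sizeof(int[BUF_SIZE]) + sizeof(int);
}
```

Under that assumption, round_up(foo_locals_size(), 16) gives 64 when int is 4 bytes, leaving 8 bytes of padding.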

Unused Reserved Stack Space in Intel x86 Assembly

I am just beginning to learn Intel x86 assembly and compiled this simple "hello world" C program (without the cfi additions for simplicity):
#include <stdio.h>
int main(int argc, char* argv[]) {
    printf("hello world!");
    return 0;
}
The following x86 code came out:
.file "helloworld.c"
.intel_syntax noprefix
.section .rodata
.LC0:
.string "hello world!"
.text
.globl main
.type main, @function
main:
push rbp
mov rbp, rsp
sub rsp, 16
mov DWORD PTR -4[rbp], edi
mov QWORD PTR -16[rbp], rsi
lea rdi, .LC0[rip]
mov eax, 0
call printf@PLT
mov eax, 0
leave
ret
.size main, .-main
.ident "GCC: (Debian 7.2.0-19) 7.2.0"
.section .note.GNU-stack,"",@progbits
The question: Why are 16 bytes reserved on the stack for local variables when they don't appear to be used? The program behaves the same without those lines, so why were they generated?
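One reading of the listing, offered as a sketch and not a definitive answer: the two mov instructions that follow sub rsp, 16 store edi and rsi into that area, and under the System V AMD64 convention those registers carry argc and argv. The struct below is a hypothetical model of the 16 bytes, using fixed-width types so that its size does not depend on the build target.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the 16 bytes below rbp, lowest address first. */
struct main_frame_model {
    uint64_t argv_slot;  /* QWORD PTR -16[rbp], written from rsi */
    uint32_t padding;    /* bytes -8[rbp] to -5[rbp], never written */
    uint32_t argc_slot;  /* DWORD PTR -4[rbp], written from edi */
};

/* Offset of a member relative to rbp, given that the model starts at
   rbp minus the size of the whole area. */
long offset_from_rbp(size_t member_offset) {
    return (long)member_offset - (long)sizeof(struct main_frame_model);
}
```

In this model 12 of the 16 bytes are written, and the remaining 4 are padding between the two slots.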

Passing argument from C to Assembly

I'm trying to make a program in C that uses a function from Assembly. Below you can see the code:
sum_c.c
#include <stdio.h>
extern int _assemblySum(int x, int y);
int main(int argc, char *argv[]){
    int total;
    total = _assemblySum(4, 2);
    printf("%d\n", total);
    return 0;
}
assembly_sum.asm
SECTION .DATA
SECTION .TEXT
GLOBAL _assemblySum
_assemblySum:
push rbp
mov rbp, rsp
mov rax, [rbp+16]
mov rbx, [rbp+24]
add rax, rbx
pop rbp
ret
COMPILE
nasm -f elf64 assembly_sum.asm -o assembly_sum.o
gcc sum_c.c assembly_sum.o -o sum
./sum
When I run the program I just get random numbers like -1214984584 or 2046906200. I know that I need to use the registers rdi, rsi, rdx and rcx because the 64bit GNU/Linux compiler uses them (Passing arguments from C to 64bit linux Assembly). But how can I do that?
You may have confused which calling convention is being used.
Linux uses the 64-bit System V calling convention. Under this convention, registers are strongly preferred over the stack for the passing of INTEGER type parameters. The registers for integer passing are used in the following order:
%rdi
%rsi
%rdx
%rcx
%r8
%r9
If there are more than six integer parameters, the additional ones are passed on the stack. An integer return value goes in rax (eax for an int).
You can find detailed information in the System V ABI specification.
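As a minimal sketch of what reading the arguments from registers looks like, the routine can be written as file-scope assembly inside a C file. This assumes GCC or Clang on x86-64 Linux with the default AT&T syntax; the name assembly_sum_sysv is hypothetical, and a plain C fallback is included so the file still builds on other targets.

```c
/* Prototype as the C caller sees it. The name assembly_sum_sysv is
   hypothetical; the original post calls its routine _assemblySum. */
int assembly_sum_sysv(int x, int y);

#if defined(__x86_64__) && defined(__linux__) && defined(__GNUC__)
/* File-scope assembly in GCC's default AT&T syntax.
   System V AMD64: x arrives in edi, y in esi, the result leaves in eax.
   No stack frame is needed because nothing is read from the stack. */
__asm__(
    ".text\n"
    ".globl assembly_sum_sysv\n"
    ".type assembly_sum_sysv, @function\n"
    "assembly_sum_sysv:\n"
    "    movl %edi, %eax\n"
    "    addl %esi, %eax\n"
    "    ret\n"
    ".size assembly_sum_sysv, .-assembly_sum_sysv\n"
);
#else
/* Portable fallback for targets where the assembly above does not apply. */
int assembly_sum_sysv(int x, int y) {
    return x + y;
}
#endif
```

The NASM equivalent of the body would be mov eax, edi followed by add eax, esi and ret. It avoids rbx, which the System V ABI requires a function to preserve for its caller.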

making C wrapper from asm of nasm

This started as a C program that I converted to NASM, but I've had no luck putting it into shellcode format (segmentation fault).
global main
extern printf
SECTION .text align=4
main:
push rbp
mov rbp, rsp
mov eax, L_001
mov rdi, rax
mov eax, 0
call printf
mov eax, 0
leave
ret
SECTION .data align=4
SECTION .bss align=4
SECTION .rodata
L_001:
db 48H, 65H, 6CH, 6CH, 6FH, 20H, 74H, 68H
db 65H, 72H, 65H, 00H
This was written on an x86_64 CentOS machine.
C wrapper program (shellcode format):
char code[] = "\x48\x65\x6c\x6c\x6f\x20\x74\x68\x65\x72\x65\x00";
int main(int argc, char **argv)
{
    (*(void(*)())code)();
    return 0;
}
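One thing worth checking in the wrapper: the bytes in code[] are the same values as the db lines under L_001, which is the string data from .rodata and not the instructions of main. Jumping into them is a plausible cause of the fault, though not necessarily the only one. A small sketch confirms what the bytes spell; the names code_bytes and bytes_spell are hypothetical.

```c
#include <string.h>

/* The exact bytes from the code[] array in the wrapper above. */
static const char code_bytes[] =
    "\x48\x65\x6c\x6c\x6f\x20\x74\x68\x65\x72\x65\x00";

/* Returns 1 if the array reads as the given C string, 0 otherwise. */
int bytes_spell(const char *text) {
    return strcmp(code_bytes, text) == 0;
}
```

Here bytes_spell("Hello there") returns 1, so the array holds printable text where the wrapper expects machine code.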

Understanding GCC inline assembly with a Hello World program

A friend helped me come up with the following code to use inline assembly in GCC on a 64-bit Windows machine:
int main() {
    char* str = "Hello World";
    int ret;
    asm volatile(
        "call puts"
        : "=a" (ret), "+c" (str)
        :
        : "rdx", "rdi", "rsi", "r8", "r9", "r10", "r11");
    return 0;
}
After compiling with -S -masm=intel (I prefer Intel syntax), I get this assembly code:
.file "hello.c"
.intel_syntax noprefix
.def __main; .scl 2; .type 32; .endef
.section .rdata,"dr"
.LC0:
.ascii "Hello World\0"
.text
.globl main
.def main; .scl 2; .type 32; .endef
.seh_proc main
main:
push rbp
.seh_pushreg rbp
push rdi
.seh_pushreg rdi
push rsi
.seh_pushreg rsi
mov rbp, rsp
.seh_setframe rbp, 0
sub rsp, 48
.seh_stackalloc 48
.seh_endprologue
call __main
lea rax, .LC0[rip]
mov QWORD PTR -8[rbp], rax
mov rax, QWORD PTR -8[rbp]
mov rcx, rax
/APP
# 7 "hello.c" 1
call puts
# 0 "" 2
/NO_APP
mov DWORD PTR -12[rbp], eax
mov QWORD PTR -8[rbp], rcx
mov eax, 0
add rsp, 48
pop rsi
pop rdi
pop rbp
ret
.seh_endproc
.ident "GCC: (x86_64-posix-seh-rev1, Built by MinGW-W64 project) 4.9.2"
It works, but it sure looks messy with what appears to be superfluous code. Then again, my last experience with assembly was with the 65816 back in the 80s, and it wasn't inline. Anyway, I cleaned up the code, and the following accomplishes the exact same thing, as far as I can tell:
.intel_syntax noprefix
.data:
.ascii "Hello World\0"
.text
.globl main
main:
sub rsp, 48
lea rax, .data[rip]
mov rcx, rax
call puts
mov eax, 0
add rsp, 48
ret
Much simpler. What's all that extra stuff GCC added?
Edit: Not a duplicate because in addition to the structured exception handling, I'm also asking about the callee-saved registers, the call to __main, the explicit size directives, and the APP/NO_APP section.
