Segfault while calling C function (printf) from Assembly - c

I am using NASM on Linux to write a basic assembly program that calls a function from the C library (printf). Unfortunately, I get a segmentation fault when I do so. Commenting out the call to printf allows the program to run without error.
; Build using these commands:
; nasm -f elf64 -g -F stabs <filename>.asm
; gcc <filename>.o -o <filename>
;
SECTION .bss ; Section containing uninitialized data
SECTION .data ; Section containing initialized data
text db "hello world",10 ;
SECTION .text ; Section containing code
global main
extern printf
;-------------
;MAIN PROGRAM BEGINS HERE
;-------------
main:
push rbp
mov rbp,rsp
push rbx
push rsi
push rdi ;preserve registers
;****************
;code I wish to execute
push text ;pushing address of text onto the stack
;x86-64 uses registers for the first 6 args, thus this should have been:
;mov rdi,text (place address of text in rdi)
;mov rax,0 (AL = number of vector registers used by the varargs call; none here)
call printf ;calling printf from the C library
add rsp,8 ;resetting the stack to its state before "push text"
;**************
pop rdi ;restore registers
pop rsi
pop rbx
mov rsp,rbp
pop rbp
ret

x86_64 does not use the stack for the first 6 integer or pointer args. You need to load them into the proper registers. Those are, in order:
rdi, rsi, rdx, rcx, r8, r9
The trick I use to remember the first two is to imagine the function is memcpy implemented as rep movsb, which takes its destination in rdi and its source in rsi.
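As a minimal sketch of what the register-based call could look like, the following reuses the question's `text` label and assumes the x86-64 System V calling convention on Linux. The NUL terminator on the string, the RIP-relative `lea`, and the `wrt ..plt` call are additions that are not in the question; the last two are there so the object file also links when GCC builds position-independent executables by default.

```nasm
; Assumed build commands:
;   nasm -f elf64 hello.asm
;   gcc hello.o -o hello
        global  main
        extern  printf

        section .data
text:   db      "hello world", 10, 0    ; printf needs a NUL-terminated string

        section .text
main:
        push    rbp                     ; save RBP; this also realigns RSP to 16 bytes
        mov     rbp, rsp
        lea     rdi, [rel text]         ; 1st argument: address of the format string
        xor     eax, eax                ; AL = 0: no vector registers carry arguments
        call    printf wrt ..plt        ; call through the PLT
        xor     eax, eax                ; return 0 from main
        pop     rbp
        ret
```

Nothing is pushed for the argument, so there is no `add rsp,8` after the call. Registers such as rdi and rsi do not need to be preserved by main, because they are caller-saved in this convention.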

You're calling a varargs function -- printf expects a variable number of arguments and you have to account for that when you set up the call (on x86-64 System V, by setting AL to the number of vector registers used for arguments). See here: http://www.csee.umbc.edu/portal/help/nasm/sample.shtml#printf1

Related

Compiling C to 32-bit assembly with GCC doesn't match a book

I have been trying to compile this C program to assembly, but it hasn't been working as expected.
I am reading Dennis Yurichev's Reverse Engineering for Beginners, but I am not getting the same output. It's a simple hello world program. I am trying to get the 32-bit output.
#include <stdio.h>
int main()
{
printf("hello, world\n");
return 0;
}
Here is what the book says the output should be
main proc near
var_10 = dword ptr -10h
push ebp
mov ebp, esp
and esp, 0FFFFFFF0h
sub esp, 10h
mov eax, offset aHelloWorld ; "hello, world\n"
mov [esp+10h+var_10], eax
call _printf
mov eax, 0
leave
retn
main endp
Here are the steps:
Compile the program as 32-bit (I am currently running a 64-bit PC)
gcc -m32 hello_world.c -o hello_world
Use gdb to disassemble
gdb file
set disassembly-flavor intel
set architecture i386:intel
disassemble main
And I get:
lea ecx,[esp+0x4]
and esp,0xfffffff0
push DWORD PTR [ecx-0x4]
push ebp
mov ebp,esp
push ebx
push ecx
call 0x565561d5 <__x86.get_pc_thunk.ax>
add eax,0x2e53
sub esp,0xc
lea edx,[eax-0x1ff8]
push edx
mov ebx,eax
call 0x56556030 <puts@plt>
add esp,0x10
mov eax,0x0
lea esp,[ebp-0x8]
pop ecx
pop ebx
pop ebp
lea esp,[ecx-0x4]
ret
I have also used
objdump -D -M i386,intel hello_world> hello_world.txt
ndisasm -b32 hello_world > hello_world.txt
But none of those are working either. I just can't figure out what's wrong. I need some help. Looking at you, Peter Cordes ^^
The output from the book looks like MSVC, not GCC. GCC will definitely not ever emit main proc because that's MASM syntax, not valid GAS syntax. And it won't do stuff like var_10 = dword ptr -10h.
(And even if it did, you wouldn't see assemble-time constant definitions in disassembly, only in the compiler's asm output, which is what the book suggested you look at: the gcc -S -masm=intel output. See "How to remove "noise" from GCC/clang assembly output?")
So there are lots of differences because you're using a different compiler. Even modern versions of MSVC (on the Godbolt compiler explorer) make somewhat different asm, for example not bothering to align ESP by 16, perhaps because more modern Windows versions, or CRT startup code, already does that?
Also, your GCC is making PIE executables by default, so use -fno-pie -no-pie. 32-bit PIE sucks for efficiency and for ease of understanding. See "How do i get rid of call __x86.get_pc_thunk.ax". (Also see "32-bit absolute addresses no longer allowed in x86-64 Linux?" for more about PIE executables, mostly focused on 64-bit code.)
The extra clunky stack-alignment in main's prologue is something that GCC8 optimized for functions that don't also need alloca. But it seems even current GCC10 emits the full un-optimized version when you don't enable optimization :(.
See "Why is gcc generating an extra return address?" and "Trying to understand gcc's complicated stack-alignment at the top of main that copies the return address".
Optimizing printf to puts: see "How to get the gcc compiler to not optimize a standard library function call like printf?" and "-O2 optimizes printf("%s\n", str) to puts(str)". gcc -fno-builtin-printf would be one way to make that not happen, or just get used to it. GCC does a few optimizations even at -O0 that other compilers only do at higher optimization levels.
MSVC 19.10 compiles your function like this (on the Godbolt compiler explorer) with optimization disabled (the default, no compiler options).
_main PROC
push ebp
mov ebp, esp
push OFFSET $SG4501
call _printf
add esp, 4
xor eax, eax
pop ebp
ret 0
_main ENDP
_DATA SEGMENT
$SG4501 DB 'hello, world', 0aH, 00H
GCC10.2 still uses an over-complicated stack alignment dance in the prologue.
.LC0:
.string "hello, world"
main:
lea ecx, [esp+4]
and esp, -16
push DWORD PTR [ecx-4]
push ebp
mov ebp, esp
push ecx
sub esp, 4
# end of function prologue, I think.
sub esp, 12 # make sure arg will be 16-byte aligned
push OFFSET FLAT:.LC0 # push a pointer
call puts
add esp, 16 # pop the arg-passing space
mov eax, 0 # return 0
mov ecx, DWORD PTR [ebp-4] # undo stack alignment.
leave
lea esp, [ecx-4]
ret
Yes, this is super inefficient. If you called your function anything other than main, it would already assume ESP was aligned by 16 on function entry:
# GCC10.2 -m32 -O0
.LC0:
.string "hello, world"
foo:
push ebp
mov ebp, esp
sub esp, 8 # reach a 16-byte boundary, assuming ESP%16 = 12 on entry
#
sub esp, 12
push OFFSET FLAT:.LC0
call puts
add esp, 16
mov eax, 0
leave
ret
So it still doesn't combine the two sub instructions, but you did tell it not to optimize so braindead code is expected. See Why does clang produce inefficient asm with -O0 (for this simple floating point sum)? for example.
My GCC will very eagerly replace a call to printf with puts! I did not manage to find command-line options that would stop the compiler from doing this. That is, the program has the same external behaviour, but the machine code is that of
#include <stdio.h>
int main(void)
{
puts("hello, world");
}
Thus, you'll have a really hard time trying to get the exact same assembly as in the book, because the assembly from the book has a call to printf instead of puts!
First of all, you are compiling, not decompiling.
You get a lot of noise because you compile without optimizations. If you compile with optimizations, you will get much smaller code, almost identical to the one you have (to prevent the change from printf to puts you need to remove the '\n': https://godbolt.org/z/cs4qe9):
.LC0:
.string "hello, world"
main:
lea ecx, [esp+4]
and esp, -16
push DWORD PTR [ecx-4]
push ebp
mov ebp, esp
push ecx
sub esp, 16
push OFFSET FLAT:.LC0
call puts
mov ecx, DWORD PTR [ebp-4]
add esp, 16
xor eax, eax
leave
lea esp, [ecx-4]
ret
https://godbolt.org/z/xMqo33

Add two numbers in assembly

I'm just getting started with assembly and I wanted to create a simple program that adds two numbers and prints the result
This is what I have so far:
.globl main
.type main, @function
main:
movl $14, %eax
movl $10, %ebx
add %eax, %ebx
call printf
From my understanding here is what's happening line by line
Line 1: I'm creating a label main that can be accessed by the linker
Line 2: I'm specifying that the label main is a function
Line 3: I begin my definition of main
Line 4: I store the numeric value 14 into the general register eax
Line 5: I store the numeric value 10 into the general register ebx
Line 6: I add the values at eax and ebx and store the result in ebx
Line 7: I call the function printf(here's where I get confused)
How do I specify what value at which register gets printed?
Also, how do I complete this program? Currently when run, the program results in a segmentation fault.
SECTION .data
extern printf
global main
fmt:
db "%d", 10, 0
SECTION .text
main:
mov eax, 14
mov ebx, 10
add eax, ebx
push eax
push fmt
call printf
mov eax, 1
int 0x80
Unfortunately, I don't know which compiler/assembler you are using, and I'm not familiar with AT&T syntax, so I have given you a working example in Intel-style x86 for NASM.
$ nasm -f elf32 test.s -o test.o
$ gcc test.o -m32 -o test
$ ./test
24
In order to use printf you need to actually push the arguments for it onto the stack, I do this here in reverse order (push the last arguments first):
push eax
push fmt
EAX contains the result of add eax, ebx and the label 'fmt' is an array of chars: "%d\n\0" (%d format, newline, null terminator).
After calling printf you need to actually exit your program with the exit system call, otherwise (at least for me) the program will segfault AFTER printf even though it worked and you won't see the result.
So these two lines:
mov eax, 1
int 0x80
perform the sys_exit system call: they place the system call number of exit on 32-bit x86 (1) into EAX and then invoke interrupt 0x80, which exits the program cleanly.
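An alternative to the raw system call is to clean up the stack and return from main, as sketched below. The likely cause of the segfault after printf is that the two pushed arguments are still on the stack, so a plain return would not find main's return address. Returning through the C runtime also lets it flush stdio buffers, which a direct sys_exit skips. The alignment padding, the use of ecx instead of ebx (ebx is callee-saved in this convention), and the -no-pie flag are assumptions of this sketch, not part of the answer.

```nasm
; Assumed build commands (32-bit, non-PIE so that "push fmt" can use an absolute address):
;   nasm -f elf32 test.s -o test.o
;   gcc -m32 -no-pie test.o -o test
        global  main
        extern  printf

        section .data
fmt:    db      "%d", 10, 0

        section .text
main:
        sub     esp, 4          ; padding so ESP is 16-byte aligned at the call
        mov     eax, 14
        mov     ecx, 10         ; ECX is caller-saved, so main may clobber it freely
        add     eax, ecx        ; EAX = 24
        push    eax             ; 2nd argument: the value to print
        push    fmt             ; 1st argument: the format string
        call    printf
        add     esp, 12         ; remove both arguments and the padding
        xor     eax, eax        ; return 0 from main
        ret
```

With the stack balanced, `ret` returns into the C startup code, which then calls exit with main's return value.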

How can I change the value of a single byte using NASM?

Using NASM, I need to change a character in a string at a given index and print the string in its new form. Here is a simplified version of my code:
;test_code.asm
global main
extern printf
output_str: db "----------"
index: dq 7
main:
push rbp
mov rdi, output_str
mov rax, index
mov byte[rdi + rax], 'x'
xor rax, rax
call printf
pop rbp
ret
I then compile using:
nasm -felf64 test_code.asm && gcc test_code.o -lm
and get a seg fault. Would someone please point out the flaw here? I can't seem to find it myself.
Your string is in the .text section of the executable, which is read-only by default. Either allocate a buffer on the stack, copy the string there and modify the copy, or put the string in the .data section (which is read/write) using the section directive. In the latter case, note that the character replacement is persistent, i.e. the string stays modified for the rest of the program;
if you want to print that string with printf, it has to be NUL-terminated. Add a ,0 to the end of the db line;
that mov rax, index is wrong: index is the address of the qword you defined above, while you actually want to copy into rax the value stored there; you probably want mov rax, [index].
So, something like
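;test_code.asm
global main
extern printf
section .data
output_str:
db "----------",0 ; NUL-terminated so printf knows where the string ends
index:
dq 7 ; index of the character to replace
section .text
main:
push rbp ; also aligns the stack to 16 bytes for the call
mov rdi, output_str ; rdi = address of the string (1st argument to printf)
mov rax, [index] ; load the value stored at index, not its address
mov byte[rdi + rax], 'x' ; overwrite one character in the writable .data copy
xor rax, rax ; AL = 0: no vector registers used by this varargs call
call printf
pop rbp
ret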
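The other option mentioned above, copying the string into a stack buffer and modifying the copy, could look roughly like the sketch below. The names `template` and `template_len`, the .rodata placement, the 16-byte buffer size, and the hard-coded index 7 are assumptions made for illustration. The copy uses `rep movsb`, and the call goes through the PLT so the file also links as a position-independent executable.

```nasm
; Assumed build commands:
;   nasm -felf64 test_copy.asm && gcc test_copy.o
        global  main
        extern  printf

        section .rodata
template:       db "----------", 0      ; read-only original, 11 bytes with the NUL
template_len    equ $ - template

        section .text
main:
        push    rbp
        mov     rbp, rsp
        sub     rsp, 16                 ; 16-byte local buffer; RSP stays 16-byte aligned
        lea     rsi, [rel template]     ; source: the read-only string
        mov     rdi, rsp                ; destination: the stack buffer
        mov     ecx, template_len       ; byte count, including the NUL
        rep movsb                       ; copy RCX bytes from [RSI] to [RDI]
        mov     byte [rsp + 7], 'x'     ; modify only the private copy
        mov     rdi, rsp                ; 1st argument: the modified copy
        xor     eax, eax                ; AL = 0: no vector-register arguments
        call    printf wrt ..plt
        xor     eax, eax                ; return 0 from main
        leave                           ; free the buffer and restore RBP
        ret
```

Because the original stays untouched in .rodata, every call starts from a fresh copy, which avoids the persistence issue of the .data approach.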

Calling _printf in assembly loop, only outputting once

I'm learning assembly and I have a very basic loop here
segment .data
msg: db '%d',10,0
segment .text
global _asm_main
extern _printf
_asm_main:
push DWORD 5 ; Should loop 5 times
call dump_stack
add esp,4
ret
dump_stack:
push ebp
mov ebp, esp
mov ecx, 0
loop_start:
cmp ecx,[ebp+8] ;compare to the first param of dump_stack, (5)
jnle loop_end
push ecx ;push the value of my loop onto the stack
push DWORD msg ;push the msg (%d) should just print the value of my loop
call _printf
add esp, 8 ;clear the stack
inc ecx ;increment ecx
jmp loop_start ; go back to my loop start
loop_end:
mov esp, ebp
pop ebp
ret
My output looks something like this
program.exe
0
Just 0, then a newline. I tried to verify the loop was executing by moving my printf to the loop_end part, and it came out with ecx as 6, which is correct. So the loop is executing but printf is not... What am I doing wrong?
(Also, the function is called dump stack because it was initially supposed to dump the details of the stack, but that didn't work because of the same reason here)
And I am compiling with nasm -f win32 program.asm -o program.o
Then I have a cpp file that includes windows.h, and I compiled it with gcc -c include
and finally I linked them with gcc -o program program.o include.o
and I run program.exe
My guess is that printf() modifies ecx so that it becomes greater than [ebp+8], and your loop body therefore executes only once. If that's the case, you need to read up on the calling conventions used by your compiler and manually preserve and restore the so-called volatile registers (which a called function can freely modify without restoring; in the common 32-bit cdecl convention these are eax, ecx and edx). Note that there may be several different calling conventions. Use the debugger!
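If that guess is right, one way to fix the loop is to keep the counter in a callee-saved register such as ebx, which _printf has to preserve under cdecl. The sketch below assumes the same win32 build as the question. It also swaps `jnle` for `jge` so that the body runs exactly as many times as the argument says.

```nasm
; Assumed build, as in the question:
;   nasm -f win32 program.asm -o program.o
segment .data
msg:    db      '%d', 10, 0

segment .text
        global  _asm_main
        extern  _printf

_asm_main:
        push    dword 5             ; loop count
        call    dump_stack
        add     esp, 4
        ret

dump_stack:
        push    ebp
        mov     ebp, esp
        push    ebx                 ; EBX is callee-saved: save the caller's value first
        xor     ebx, ebx            ; counter lives in EBX, which _printf must preserve
loop_start:
        cmp     ebx, [ebp+8]        ; compare with the first parameter
        jge     loop_end            ; runs for 0..n-1 (the original jnle also ran for n)
        push    ebx                 ; 2nd argument: the counter value
        push    dword msg           ; 1st argument: the format string
        call    _printf
        add     esp, 8              ; cdecl: the caller removes the arguments
        inc     ebx
        jmp     loop_start
loop_end:
        pop     ebx                 ; restore the caller's EBX
        pop     ebp
        ret
```

Keeping ecx and wrapping the call in `push ecx` / `pop ecx` would work as well; the callee-saved register just avoids the extra stack traffic on every iteration.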

x86-64 ELF initial stack layout when calling glibc

Basically, I read through parts of http://www.nasm.us/links/unix64abi and at page 29, it shows the initial process stack of a C program.
My question is: I'm trying to interface with glibc from x86-64 nasm and based on what the above shows, argc should be at rsp. So the following code should print argc:
[SECTION .data]
PrintStr: db "You just entered %d arguments.", 10, 0
[SECTION .bss]
[SECTION .text]
extern printf
global main
main:
mov rax, 0 ; Required for functions taking in variable no. of args
mov rdi, PrintStr
mov rsi, [rsp]
call printf
ret
But it doesn't. Can someone enlighten me if I have made any mistakes in my code or tell me what the actual stack structure is?
Thanks!
UPDATE: I just randomly tried some offsets and changing the "mov rsi, [rsp]" to "mov rsi, [rsp+28]" did the trick.
But this means that the stack structure shown is wrong. Does anyone know what the initial stack layout is for an x86-64 elf? An equivalent of http://asm.sourceforge.net/articles/startup.html would be really nice.
UPDATE 2:
I left out how I build this code. I do it by:
nasm -f elf64 -g <filename>
gcc <filename>.o -o <outputfile>
The initial stack layout contains argc at the stack pointer, followed by the array char *argv[], not a pointer to it like main receives. Therefore, to call main, you need to do something like:
pop %rdi
mov %rsp,%rsi
call main
In reality there is usually a wrapper function that calls main, rather than the startup code doing it directly.
If you want to simply print argv[0], you could do something like:
pop %rdi
pop %rdi
call puts
xor %edi,%edi
jmp exit
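Since the program in the question is linked with gcc, its main is called by the C startup code rather than entered directly from the kernel, so argc arrives in edi like any other first integer argument instead of sitting at [rsp]. A minimal NASM sketch of the question's program under that assumption follows; the `push rbp` for stack alignment, the RIP-relative `lea`, and the `wrt ..plt` call are additions that are not in the question.

```nasm
; Assumed build commands:
;   nasm -f elf64 argc.asm
;   gcc argc.o -o argc
        global  main
        extern  printf

        section .data
PrintStr: db    "You just entered %d arguments.", 10, 0

        section .text
main:
        push    rbp                     ; aligns RSP to 16 bytes for the call
        mov     esi, edi                ; argc arrives in EDI; pass it as the 2nd argument
        lea     rdi, [rel PrintStr]     ; 1st argument: the format string
        xor     eax, eax                ; AL = 0: no vector-register arguments
        call    printf wrt ..plt
        xor     eax, eax                ; return 0 from main
        pop     rbp
        ret
```

The initial stack layout described above only applies at the process entry point (normally _start), before the startup code has moved argc and argv into registers for main.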
