Segfault when trying to use external functions in assembly [duplicate] - c

So I am trying to write some code using x86 and I can't seem to get it to move contents of a register to a spot in memory.
The code is just this
global main
SECTION .DATA
var_i: DD 0
SECTION .TEXT
main:
push DWORD 4
pop EAX
mov [var_i], EAX
mov EAX, 0
ret
I am using nasm and gcc on the code.
The problem I am having is that whenever I try to write to that spot in memory, it segfaults.

What kind of system/object format are you using? I'm guessing you're using ELF on Linux or Unix, as that would explain your problem:
Section names in ELF are case sensitive. On most ELF-based OSes the special sections .text and .data are understood, but your sections .TEXT and .DATA have no special meaning. As a result, they just get appended to the executable after the other sections and inherit the same access permissions. If you're just linking the above code, that will be after the .fini section, so your section will be executable and read-only. So when you try to write to the variable, you get a segfault.
Change your code to use .data and .text as section names and it should work.

Related

Linking and calling printf from gas assembly

There are a few related questions to this which I've come across, such as Printf with gas assembly and Calling C printf from assembly but I'm hoping this is a bit different.
I have the following program:
.section .data
format:
.ascii "%d\n"
.section .text
.globl _start
_start:
// print "55"
mov $format, %rdi
mov $55, %rsi
mov $0, %eax
call printf # how to link?
// exit
mov $60, %eax
mov $0, %rdi
syscall
Two questions related to this:
Is it possible to use only as (gas) and ld to link this to the printf function, using _start as the entry point? If so, how could that be done?
If not, other than changing _start to main, what would be the gcc invocation to run things properly?
It is possible to use ld, but not recommended: if you use libc functions, you need to initialise the C runtime. That is done automatically if you let the C compiler provide _start and start your program at main. If you use libc but skip the C runtime initialisation code, it may seem to work, but it can also lead to strange, spurious failures.
If you start your program from main (your second case) instead, it's as simple as doing gcc -o program program.s where program.s is your source file. On some Linux distributions you may also need to supply -no-pie as your program is not written in PIC style (don't worry about this for now).
Note also that I recommend not mixing libc calls with raw system calls. Instead of doing a raw exit system call, call the C library function exit. This lets the C runtime deinitialise itself correctly, including flushing any IO streams.
Now if you assemble and link your program as I said in the first paragraph, you'll notice that it might crash. This is because the stack needs to be aligned to a multiple of 16 bytes on calls to functions. You can ensure this alignment by pushing a qword of data on the stack at the beginning of each of your functions (remember to pop it back off at the end).

Static executable segfaults if location counter is initialized as too small or too large in linker script

I'm trying to generate a static executable for this program (with musl):
main.S:
.section .text
.global main
main:
mov $msg, %rdi
mov $0, %rax
call printf
mov %rax, %rdi
mov $60, %rax
syscall
msg:
.ascii "hello world from printf\n\0"
Compilation command:
clang -g -c main.S -o main.o
Linking command (musl libc is placed in musl directory (version 1.2.1)):
ld main.o musl/crt1.o -o sm -Tstatic.ld -static -lc -lm -Lmusl
Linker script (static.ld):
ENTRY(_start)
SECTIONS
{
. = 0x100e8;
}
This config results in a working executable, but if I change the location counter offset to 0x10000 or 0x20000, the resulting executable crashes during startup with a segfault. On debugging I found that musl initialization code tries to read the program headers (location received in aux vector), and for some reason the memory address of program header as given by aux vector is unmapped in our address space.
What is the cause of this behavior? What exactly is the counter offset in a linker script? How does it affect the linker output other than altering the load address?
Note: The segfault occurs when the musl initialization code tries to access the program headers.
There are a few issues here.
Your main.S has a stack alignment bug: on x86_64, you must realign the stack to a 16-byte boundary before calling any other function (you can assume 8-byte alignment on entry).
Without this, I get a crash inside printf due to movaps %xmm0,0x40(%rsp) with misaligned $rsp.
Your link order is wrong: crt1.o should be linked before main.o.
When you don't leave SIZEOF_HEADERS == 0xe8 space before starting your .text section, you are leaving it up to the linker to put program headers elsewhere, and it does. The trouble is: musl (and a lot of other code) assumes that the file header and program headers are mapped in (but the ELF format doesn't require this). So they crash.
The right way to specify start address:
ENTRY(_start)
SECTIONS
{
. = 0x10000 + SIZEOF_HEADERS;
}
Update:
Why does the order matter?
Linkers (in general) will assemble initializers (constructors) left to right. When you call standard C library routines from main(), you expect the standard library to have initialized itself before main() was called. Code in crt1.o is responsible for performing such initialization.
If you link in the wrong order: crt1.o after main.o, construction may not happen correctly. Whether you'll be able to observe this depends on implementation details of the standard library, and exactly what parts of it you are using. So your binary may appear to work correctly. But it is still better to link objects in the correct order.
I'm leaving 0x10000 space, isn't it enough for headers?
You are interfering with the built-in default linker script, and instead giving it incomplete specification of how to lay out your program in memory. When you do so, you need to know how the linker will react. Different linkers will react differently.
The binutils ld reacts by not emitting a LOAD segment covering program headers. The ld.lld reacts differently -- it actually moves .text past program headers.
The resulting binaries still crash though, because the binary layout is not what the kernel expects, and the kernel-supplied AT_PHDR address in the aux vector is wrong.
It looks like the kernel expects the first LOAD segment to be the one which contains program headers. Arguably that is a bug in the kernel -- nothing in the ELF spec requires this. But all normal binaries do have program headers in the first LOAD segment, so you'll just have to do the same (or convince kernel developers to add code to handle your weird binary layout).

Using inline-assembly to read the content of .rodata section of ELF binary

So I have been studying ELF binaries and came across the question of whether it is possible to read an ELF data section using inline assembly (assuming you know where the section is located).
After searching for a bit, I found a few links that asked a similar question, but I am struggling a bit to put them together for my use.
Retrieving Offsets, Strings and Virtual Address in .rodata and .rodata1
x86 ASM Linux - Using the .bss Section
dword ptr usage confusion
The question I have is, let's say I have the content of a section (custom .rodata section I added using objcopy) as the following:
$ objdump -s -j .rodata_custom hello
hello: file format elf64-x86-64
Contents of section .rodata_custom:
4ab3ac 42796520 576f726c 64 Bye World
Using the inline assembly in the C program, I would like to read the content of this section (either ASCII code or string literal, whichever one is possible).
From my understanding of inline assembly, the solution I can think of is something like the following:
mov reg, DWORD PTR [address of section]
mov variable, reg
I statically compiled a binary, so I won't have to deal with relocation (although dynamically compiling won't be too much of an issue since this new data section will always be adjacent to the original .rodata section), and from disassembling the binary, I know the address of the section to read is 4ab3ac.
Here is my attempt at solving my problem:
int main() {
char *test;
uintptr_t addr = 0x4ab3ac;
asm volatile (
"mov %%rdx, dword ptr [%0]\n\t"
"mov %%rdx, %[test]\n\t"
: [test]"=a"(test)
: "r"(addr)
:
);
printf("%p\n", test);
return 0;
}
and unfortunately, it results in an error stating that Error: junk `[%rax]' after expression. I feel like I'm close, but missing something or misunderstanding somewhere...
I hope my question and intent make sense.
If full code (source code + Makefile) is necessary to understand the question, please let me know.
Kind regards,

Printing strings in Assembly, calling that function in C [duplicate]

This question already has an answer here:
What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?
I was trying to get into programming with NASM, and I also wanted to learn how to make those functions callable from C. I am fairly certain the code that I have so far is correct, in that I need to set up a stack frame, and then undo that stack frame before I return from the routine. I also know I need to return a zero to indicate that there were no errors. I am using Debian Linux as well, in case I need to adjust for my OS.
The code:
global hello
section .data
message: db "Hello, world!",0 ; A C string needs the null terminator.
section .text
hello:
push rbp ; C calling convention demands that a
mov rbp,rsp ; stack frame be set up like so.
;THIS IS WHERE THE MAGIC (DOESN'T) HAPPEN
pop rbp ; Restore the stack
mov rax,0 ; normal, no error, return value
ret ; return
I feel as if I should point out that I ask this because all of the programs I found made external calls to printf. I do not wish to do this, I would really like to learn how to print things in assembly. So I suppose my questions are: What are the calling conventions for C functions in NASM? How do I print a string in NASM 64bit assembly?
Also, to make sure I have this part right, is this the proper way to call the assembly function in C?
#include <stdio.h>
int main() {
hello();
return 0;
}
EDIT: Okay, I was able to work this out. Here's the assembly code. I assembled the .asm file along with the .c file using nasm -f elf64 -l hello.lst hello.asm && gcc -o hello hello.c hello.o
section .text
global hello
hello:
push rbp ; C calling convention demands that a
mov rbp,rsp ; stack frame be set up like so.
mov rdx,len ; Message length
mov rcx,message ; Message to be written
mov rax,4 ; System call number (sys_write)
int 0x80 ; Call kernel
pop rbp ; Restore the stack
mov rax,0 ; normal, no error, return value
ret
section .data
message: db "Hello, world!",0xa ; 0xa represents the newline char.
len: equ $ - message
The relevant C code (hello.c) looked like this:
int main(int argc, char **argv) {
hello();
return 0;
}
Some explanations include the lack of an #include, due to the I/O being done in the assembly file. Another thing that I needed to see to believe was that all the work was not done in assembly, as I did not have a _start identifier, or whatever that's called. Definitely need to learn more about system calls. Thank you so much to everyone who pointed me in the right direction.
As was cleared up in comments, any interaction between your code and the outside world is done through system calls. C stdio functions format text into an output buffer, then write it with write(2). For input, they read(2) into an input buffer, and scanf or the line-reading functions consume data from that.
Writing in asm doesn't mean you should avoid libc functions when they're useful, e.g. printf/scanf. Usually it only makes sense to write small parts of a program in asm for speed. e.g. write one function that has a hot loop in asm, and call it from C or whatever other language. Doing the I/O with all the necessary error-checking of system call return values would not be very fun in asm. If you're curious what happens under the hood, read the compiler output and/or single-step the asm. You'll sometimes learn nice tricks from the compiler, and sometimes you'll see it generate less efficient code than you could have written by hand.
This is a problem:
mov rax,4 ; System call number (sys_write)
int 0x80 ; Call kernel
Although 64-bit processes can use the i386 int 0x80 system call ABI, it is the 32-bit ABI, so it only sees the low 32 bits of each register, pointers included. You will have a problem as soon as you try to write(2) a char array that's on the stack, because amd64 Linux processes start with a stack pointer that has high bits set. It happens to work here because, in a non-PIE executable, .data and .rodata (and typically the brk heap) are mapped into the low 32 bits of address space.
The native amd64 ABI uses syscall, and the system call numbers aren't the same as the i386 ABI. I found this table of syscalls listing the number and which parameter goes in which register. sys/syscall.h eventually includes /usr/include/x86_64-linux-gnu/asm/unistd_64.h to get the actual #define __NR_write 1 macros, and so on. There are standard rules for mapping arguments in order to registers. (Given in the ABI doc, IIRC).

Unusual output from GCC/LD with custom __start

As an extension of this question: GCC compile and link raw output
I am trying to compile and link a piece of code with a custom __start. As a note, I do NOT require this to work on any known architecture, so compliance with any specification is not important, getting it to work consistently is.
I have a simple piece of assembly (which I got from a URL I can't find now).
.set noreorder /* so we can use delay slots explicitly */
.text
.globl main
.globl __start
.type __start,#function
.ent __start
__start:
jal main;
nop;
li $0,1;
.end __start
If I understand this correctly, all this does is call my main method, do a no-op in the branch-delay slot, then write the number 1 to register 0 (I know this violates the MIPS specification, it is intentional - it denotes completion of the code and is "caught" before it actually occurs).
However, when I use the mips ld to link this with an example piece of code using this command mips-linux-gnu-ld --section-start=.text=0 start.o main.o -o executable
I get some unusual output when viewed with objdump
00000000 <.pic.main>:
0: 3c190000 lui t9,0x0
4: 0800022b j 8ac <main>
8: 273908ac addiu t9,t9,2220
c: 00000000 nop
00000010 <__start>:
10: 0c000000 jal 0 <.pic.main>
14: 00000000 nop
18: 24000001 li zero,1
1c: 00000000 nop
.........
000008ac <main>
.........
No matter how trivial my test program, I always get the same .pic.main function. However, in some cases it appears above __start and in some cases below.
I would like to remove this "function" entirely, but failing that would like it to always appear AFTER the __start.
As a bonus, if anyone knows what this function is or why it occurs, I'd be intrigued.
It looks like a position-independent code (PIC) stub. The linker doesn't know where your code is going to be placed, so it emits a PIC entry point to cover all cases. A relative jump, or a jump through a register, could avoid the problem, although it wouldn't be a jump-and-link.
I would try using
-mrelax-pic-calls to turn PIC calls that are normally dispatched via register $t9 into direct calls. This is only possible if the linker can resolve the destination at link-time and if the destination is within range for a direct call.
-mbranch-cost=num to set the cost of branches to roughly num “simple” instructions. "This cost is only a heuristic and is not guaranteed to produce consistent results across releases."
-mno-shared to avoid generating code that is fully position-independent, and that can therefore be linked into shared libraries
-mno-embedded-pic
I'd put my money on one of the first two.
