segmentation fault with .text .data and main (main in .data section) - arrays

I'm just trying to load the value of myarray[0] to eax:
.text
.data
# define an array of 3 words
array_words: .word 1, 2, 3
.globl main
main:
# assign array_words[0] to eax
mov $0, %edi
lea array_words(,%edi,4), %eax
But when I run this, I keep getting seg fault.
Could someone please point out what I did wrong here?

It seems the label main is in the .data section.
It leads to a segmentation fault on systems that doesn't allow to execute code in the .data section. (Most modern systems map .data with read + write but not exec permission.)
Program code should be in the .text section. (Read + exec)
Surprisingly, on GNU/Linux systems, hand-written asm often results in an executable .data unless you're careful to avoid that, so this is often not the real problem: See Why data and stack segments are executable? But putting code in .text where it belongs can make some debugging tools work better.
Also you need to ret from main or call exit (or make an _exit system call) so execution doesn't fall off the end of main into whatever bytes come next. See What happens if there is no exit system call in an assembly program?

You need to properly terminate your program, e.g. on Linux x86_64 by calling the sys_exit system call:
...
main:
# assign array_words[0] to eax
mov $0, %edi
lea array_words(,%edi,4), %eax
mov $60, %rax # System-call "sys_exit"
mov $0, %rdi # exit code 0
syscall
Otherwise program execution continues with the memory contents following your last instruction, which are most likely in all cases invalid instructions (or even invalid memory locations).

Related

Calling the C-function _printf from NASM causes a Segmentation Fault

I've been trying to learn 64-bit assembly on both Mac-OS and Windows using NASM.
My code is
extern _printf
section .data
msg db "Hello World!", 10, 0
section .text
global _main
_main:
mov rax, 0
mov rdi, msg
call _printf
mov rax, 0x2000001
mov rdi, 0
syscall
and I compile it with
nasm -f macho64 -o main.o main.asm
gcc -o main main.o
While trying to call _printf, I got the error
Segmentation fault: 11
When I remove the call to _printf, my code runs fine.
Why does the call to _printf cause a segmentation fault?
I have found the ABI Calling Convention here, but had no luck successfully calling a C-function.
I would expect Hello World! to be printedd, however I get 'Segmentation Fault: 11' instead.
~~You need to setup a stack frame before calling _printf
TL;DR: The System V AMD64 ABI requires the stack-pointer to be 16-byte-aligned. By the time of calling _printf, the stack-pointer is misaligned by 8 bytes.
Debugging the binary with LLDB gives:
frame #0: 0x00007fff527d430a libdyld.dylib`stack_not_16_byte_aligned_error
MacOS uses the System V AMD64 ABI and therefore relies on 16-byte alignment for the stack-pointer (see this question), in short this means the stack-pointer (rsp) should always be divisible by 16 when calling a function.
By the time of calling _printf, the stack-pointer (rsp) is misaligned by 8 bytes. How did this come?
I found the answer on this page, calling the _main function pushes the return address (8 bytes) on the stack and therefore misaligns it.
My initial idea - the setup of the stack frame - pushed another address on the stack and therefore rsp was divisible by 16 again.
However an easier solution would be just sub rsp, 8 as suggested by Margaret Bloom
Change your code to:
extern _printf
section .data
msg: db "Hello World!", 10, 0
section .text
global _main
_main:
;; Fix the stack alignment
sub rsp, 8
mov rax, 0
mov rdi, msg
call _printf
mov rax, 0x2000001
mov rdi, 0
syscall
Tested on macOS 10.13.6

What parts of this HelloWorld assembly code are essential if I were to write the program in assembly?

I have this short hello world program:
#include <stdio.h>
static const char* msg = "Hello world";
int main(){
printf("%s\n", msg);
return 0;
}
I compiled it into the following assembly code with gcc:
.file "hello_world.c"
.section .rodata
.LC0:
.string "Hello world"
.data
.align 4
.type msg, #object
.size msg, 4
msg:
.long .LC0
.text
.globl main
.type main, #function
main:
.LFB0:
.cfi_startproc
pushl %ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl %esp, %ebp
.cfi_def_cfa_register 5
andl $-16, %esp
subl $16, %esp
movl msg, %eax
movl %eax, (%esp)
call puts
movl $0, %eax
leave
.cfi_restore 5
.cfi_def_cfa 4, 4
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4"
.section .note.GNU-stack,"",#progbits
My question is: are all parts of this code essential if I were to write this program in assembly (instead of writing it in C and then compiling to assembly)? I understand the assembly instructions but there are certain pieces I don't understand. For instance, I don't know what .cfi* is, and I'm wondering if I would need to include this to write this program in assembly.
The absolute bare minimum that will work on the platform that this appears to be, is
.globl main
main:
pushl $.LC0
call puts
addl $4, %esp
xorl %eax, %eax
ret
.LC0:
.string "Hello world"
But this breaks a number of ABI requirements. The minimum for an ABI-compliant program is
.globl main
.type main, #function
main:
subl $24, %esp
pushl $.LC0
call puts
xorl %eax, %eax
addl $28, %esp
ret
.size main, .-main
.section .rodata
.LC0:
.string "Hello world"
Everything else in your object file is either the compiler not optimizing the code down as tightly as possible, or optional annotations to be written to the object file.
The .cfi_* directives, in particular, are optional annotations. They are necessary if and only if the function might be on the call stack when a C++ exception is thrown, but they are useful in any program from which you might want to extract a stack trace. If you are going to write nontrivial code by hand in assembly language, it will probably be worth learning how to write them. Unfortunately, they are very poorly documented; I am not currently finding anything that I think is worth linking to.
The line
.section .note.GNU-stack,"",#progbits
is also important to know about if you are writing assembly language by hand; it is another optional annotation, but a valuable one, because what it means is "nothing in this object file requires the stack to be executable." If all the object files in a program have this annotation, the kernel won't make the stack executable, which improves security a little bit.
(To indicate that you do need the stack to be executable, you put "x" instead of "". GCC may do this if you use its "nested function" extension. (Don't do that.))
It is probably worth mentioning that in the "AT&T" assembly syntax used (by default) by GCC and GNU binutils, there are three kinds of lines: A line
with a single token on it, ending in a colon, is a label. (I don't remember the rules for what characters can appear in labels.) A line whose first token begins with a dot, and does not end in a colon, is some kind of directive to the assembler. Anything else is an assembly instruction.
related: How to remove "noise" from GCC/clang assembly output? The .cfi directives are not directly useful to you, and the program would work without them. (It's stack-unwind info needed for exception handling and backtraces, so -fomit-frame-pointer can be enabled by default. And yes, gcc emits this even for C.)
As far as the number of asm source lines needed to produce a value Hello World program, obviously we want to use libc functions to do more work for us.
#Zwol's answer has the shortest implementation of your original C code.
Here's what you could do by hand, if you don't care about the exit status of your program, just that it prints your string.
# Hand-optimized asm, not compiler output
.globl main # necessary for the linker to see this symbol
main:
# main gets two args: argv and argc, so we know we can modify 8 bytes above our return address.
movl $.LC0, 4(%esp) # replace our first arg with the string
jmp puts # tail-call puts.
# you would normally put the string in .rodata, not leave it in .text where the linker will mix it with other functions.
.section .rodata
.LC0:
.asciz "Hello world" # asciz zero-terminates
The equivalent C (you just asked for the shortest Hello World, not one that had identical semantics):
int main(int argc, char **argv) {
return puts("Hello world");
}
Its exit status is implementation-defined but it definitely prints. puts(3) returns "a non-negative number", which could be outside the 0..255 range, so we can't say anything about the program's exit status being 0 / non-zero in Linux (where the process's exit status is the low 8 bits of the integer passed to the exit_group() system call (in this case by the CRT startup code that called main()).
Using JMP to implement the tail-call is a standard practice, and commonly used when a function doesn't need to do anything after another function returns. puts() will eventually return to the function that called main(), just like if puts() had returned to main() and then main() had returned. main()'s caller still has to deal with the args it put on the stack for main(), because they're still there (but modified, and we're allowed to do that).
gcc and clang don't generate code that modifies arg-passing space on the stack. It is perfectly safe and ABI-compliant, though: functions "own" their args on the stack, even if they were const. If you call a function, you can't assume that the args you put on the stack are still there. To make another call with the same or similar args, you need to store them all again.
Also note that this calls puts() with the same stack alignment that we had on entry to main(), so again we're ABI-compliant in preserving the 16B alignment required by modern version of the x86-32 aka i386 System V ABI (used by Linux).
.string zero-terminates strings, same as .asciz, but I had to look it up to check. I'd recommend just using .ascii or .asciz to make sure you're clear on whether your data has a terminating byte or not. (You don't need one if you use it with explicit-length functions like write())
In the x86-64 System V ABI (and Windows), args are passed in registers. This makes tail-call optimization a lot easier, because you can rearrange args or pass more args (as long as you don't run out of registers). This makes compilers willing to do it in practice. (Because as I said, they currently don't like to generate code that modifies the incoming arg space on the stack, even though the ABI is clear that they're allowed to, and compiler generated functions do assume that callees clobber their stack args.)
clang or gcc -O3 will do this optimization for x86-64, as you can see on the Godbolt compiler explorer:
#include <stdio.h>
int main() { return puts("Hello World"); }
# clang -O3 output
main: # #main
movl $.L.str, %edi
jmp puts # TAILCALL
# Godbolt strips out comment-only lines and directives; there's actually a .section .rodata before this
.L.str:
.asciz "Hello World"
Static data addresses always fit in the low 31 bits of address-space, and executable don't need position-independent code, otherwise the mov would be lea .LC0(%rip), %rdi. (You'll get this from gcc if it was configured with --enable-default-pie to make position-independent executables.)
How to load address of function or label into register in GNU Assembler
Hello World using 32-bit x86 Linux int 0x80 system calls directly, no libc
See Hello, world in assembly language with Linux system calls? My answer there was originally written for SO Docs, then moved here as a place to put it when SO Docs closed down. It didn't really belong here so I moved it to another question.
related: A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux. The smallest binary file you can run that just makes an exit() system call. That is about minimizing the binary size, not the source size or even just the number of instructions that actually run.

assembly - mov unitialized variable?

I have a hard time understanding a piece of code.
I read the xv6 lecture at line 1054
Here is the code :
.globl entry
entry:
# Turn on page size extension for 4Mbyte pages
movl %cr4, %eax
orl $(CR4_PSE), %eax
movl %eax, %cr4
# Set page directory
movl $(V2P_WO(entrypgdir)), %eax
movl %eax, %cr3
# Turn on paging.
movl %cr0, %eax
orl $(CR0_PG|CR0_WP), %eax
movl %eax, %cr0
# Set up the stack pointer.
movl $(stack + KSTACKSIZE), %esp
# Jump to main(), and switch to executing at
# high addresses. The indirect call is needed because
# the assembler produces a PC-relative instruction
# for a direct jump.
mov $main, %eax
jmp *%eax
.comm stack, KSTACKSIZE
My question is:
How is it possible that we movl $(stack + KSTACKSIZE), %esp when stack is defined nowhere in the project, but at line 1063 as a .comm symbol and in a function that is called later and redefines the stack variable as a local one
static void
startothers(void)
{
char *stack; // THIS ONE IS A DIFFERENT BEAST, right ?
...
// Tell entryother.S what stack to use, where to enter, and what
// pgdir to use. We cannot use kpgdir yet, because the AP processor
// is running in low memory, so we use entrypgdir for the APs too.
stack = kalloc();
*(void**)(code-4) = stack + KSTACKSIZE;
*(void**)(code-8) = mpenter;
*(int**)(code-12) = (void *) v2p(entrypgdir);
?
I may miss a trick, but I don't get when its address is set.
At the linking stage so that stack is actually defined ?
Thanks
Yes .comm defines and allocates the stack with the given STACKSIZE in the .bss section. Upon first exeuction, the code runs as-is, and uses that stack. Judging from the function name of startothers I assume this is a multiprocessor bootup. Once the initial cpu has been brought up, it allocates a new stack for each other processor, and modifies the code itself so that it uses the newly allocated one.
In my opinion it would be a lot less confusing if the entry used variables for these things.

Assembler / GAS / Linux x86_64 - error while reading a file

I am writing a simple program in assembler on Linux x86_64 (GAS syntax). I have to read a number that coded in binary system and saved in a text file. So, I have my text file "data.txt" (it's in the same directory as my source file) and below is the most important fragment of my code:
SYS_WRITE = 4
EXIT_SUCCESS = 0
SYS_READ = 3
SYS_OPEN = 5
.data
BIN_LEN = 24
.comm BIN, BIN_LEN
BIN: .space BIN_LEN, 0
.text
PATH: .ascii "data.txt\0"
.global _start
_start:
mov $SYS_OPEN, %eax # open
mov $PATH, %ebx # path
mov $0, %ecx # read only
mov $0666, %edx # mode
int $0x80 # call (open file)
mov $SYS_READ, %eax # reading
mov $3, %ebx # descriptor
mov $BIN, %ecx # bufor
mov $BIN_LEN, %edx # bufor size
int $0x80 # call (read line from file)
After calling the second syscall, the %eax register should contain the number of read bytes.
In my file "data.txt" I have "10101", but when I debug my program with gdb, it shows that the is -11 in %eax, so there was some kind of an error. But I am sure that "10101" was loaded to the buffer (BIN), because when I want to display what the buffer has inside, there is properly written number from the file. I need the number of read bytes to the further algorithm. I have no idea why %eax contains error code instead of the number of bytes loaded to the buffer. I wonder if it may be connected with calling syscall with 32-bit registers, but in all other cases it works properly.
Please, help me.
I entered your code and compiled it on my x64 running fedora 20 using the as and ld 32 bit options to assemble and link it, and it ran perfectly, placing 0x18 into the %eax reg after syscall. If you solved the problem I would like to know what caused it and how you fixed it.
cheers

Can some one write assembly code for the c program above that converts into machine code that is less than 100 bytes?

I want to overflow the array buffer[100] and I will be passing python script on bash shell on FreeBSD. I need machine code to pass as a string to overflow that buffer buffer[100] and make the program print its hostname to stdout.
Here is the code in C that I tried and gives the host name on the console. :
#include <stdio.h>
int main()
{
char buff[256];
gethostname(buff, sizeof(buff));
printf(""%s", buff);
return 0;
}
Here is the code in assembly that I got using gcc but is longer than I need becuase when I look for the machine code of the text section of the c program it is longer than 100 bytes and I need a machine code for the c program above that is less than 100 bytes.
.type main, #function
main:
pushl %ebp; saving the base pointer
movl %esp, %ebp; Taking a snapshot of the stack pointer
subl $264, %esp;
addl $-8, %esp
pushl $256
leal -256(%ebp), %eax
pushl %eax
call gethostname
addl $16, %esp
addl $-8, %esp
leal -256(%ebp), %eax
pushl %eax
pushl $.LCO
call printf
addl $16, %esp
xorl %eax, %eax
jmp .L6
.p2align 2, 0x90
.L6:
leave
ret
.Lfe1:
.size main, .Lfe1-main
.ident "GCC: (GNU) c 2.95.4 20020320 [FreeBSD]"
A person has already done it on another computer and he has given me the ready made machine code which is 37 bytes and he is passing it in the format below to the buffer using perl script. I tried his code and it works but he doesn't tell me how to do it.
“\x41\xc1\x30\x58\x6e\x61\x6d\x65\x23\x23\xc3\xbc\xa3\x83\xf4\x69\x36\xw3\xde\x4f\x2f\x5f\x2f\x39\x33\x60\x24\x32\xb4\xab\x21\xc1\x80\x24\xe0\xdb\xd0”
I know that he did it on a differnt machine so I can not get the same code but since we both are using exactly the same c function so the size of the machine code should be almost the same if not exactly the same. His machine code is 37 bytes which he will pass on shell to overflow the gets() function in a binary file on FreeBSD 2.95 to print the hostname on stdout. I want to do the same thing and I have tried his machine code and it works but he will not tell me how did he get this machine code. So I am concerned actually about the procedure of getting that code.
OK I tried the methods suggested in the posts here but just for the function gethostname() I got a 130 character of machine code. It did not include the printf() machine code. As I need to print the hostname to console so that should also be included but that will make the machine code longer. I have to fit the code in an array of 100 bytes so the code should be less than 100 bytes.
Can some one write assembly code for the c program above that converts into machine code that is less than 100 bytes?
To get the machine code, you need to compile the program then disassemble. Using gcc for example do something like this:
gcc -o hello hello.c
objdump -D hello
The dump will show the machine code in bytes and the disassembly of that machine code.
A simple example, that is related, you have to understand the difference between an object file and an executable file but this should still demonstrate what I mean:
unsigned int myfun ( unsigned int x )
{
return(x+5);
}
gcc -O2 -c -o hello.o hello.c
objdump -D hello.o
Disassembly of section .text:
00000000 <myfun>:
0: e2800005 add r0, r0, #5
4: e12fff1e bx lr
FreeBSD is an operating system, not a compiler or assembler.
You want to assemble the assembly source into machine code, so you should use an assembler.
You can typically use GCC, since it's smart enough to know that for a filename ending in .s, it should run the assembler.
If you already have the code in an object file, you can use objdump to read out the code segment of the file.
The 37 bytes posted are completely junk.
If run under any version of Windows ( windows 2000 or later ), I believe, that
the "outsb" and "insd" instructions (in an userland program) will cause a fault,
because userland programs are not allowed directly doing port -level I/O.
Since machine code will not end in "vacuum", I added some \x90 -bytes (again NOP) after the posted code. That merely affects the argument of the last rcl -instruction (which in the given code ends prematurely; eg the code posted is not only rubbish, but also ends prematurely).
But, microprocessors do not have their own intelligence, so they will (try to) execute whatever junk code you feed them. And, the code starts with "inc ecx", a stupid move since we do not know what value the ecx had before. Also "shl dword ptr [eax],$58" is a "good"
way to randomly corrupt memory (since value if eax is also unknown).
And, one of them is NOT even valid byte (should be represented as two hexadecimal digits).
The invalid "byte" is \xw3.
I replaced that invalid byte as \x90 ( a NOP, if it is at start of instruction), and got:
00451B51 41 inc ecx
00451B52 C13058 shl dword ptr [eax],$58
00451B55 6E outsb
00451B56 61 popad
00451B57 6D insd
00451B58 652323 and esp,gs:[ebx]
00451B5B C3 ret
// code below is NEVER executed, since the line above does a RET.
00451B5C BCA383F469 mov esp,$69f483a3
00451B61 3690 nop // 36, w3 ????
00451B63 DE4F2F fimul word ptr [edi+$2f]
00451B66 5F pop edi
00451B67 2F das
00451B68 3933 cmp [ebx],esi
00451B6A 60 pushad
00451B6B 2432 and al,$32
00451B6D B4AB mov ah,$ab
00451B6F 21C1 and ecx,eax
00451B71 8024E0DB and byte ptr [eax],$db
00451B75 D09090909090 rcl [eax-$6f6f6f70],1
You get a nice hexdump of the text section of your object file with objdump -s -j .text.
Edited some more details:
You need to find out what the address of the function in your object code is. This is what objdump -t is for. In this case I am looking for the function main in a program "hello".
> objdump -t hello|grep main
> 0000000000400410 g F .text 000000000000002f main
Now I create a hexdump with objdump -s -j .text hello:
400410 4881ec08 010000be 00010000 31c04889 H...........1.H.
400420 e7e8daff ffff4889 e6bff405 400031c0 ......H.....#.1.
400430 e8abffff ff31c048 81c40801 0000c390 .....1.H........
400440 31ed4989 d15e4889 e24883e4 f0505449 1.I..^H..H...PTI
400450 c7c0e005 400048c7 c1500540 0048c7c7 ....#.H..P.#.H..
...
The first row are the addresses. It starts with 400410, the address of the main function, but this may not always be the case. The following 4 rows are 16 bytes of machinecode in hex, the last row are the same 16 bytes of machine code in ASCII. Because a lot of bytes have no representation in ASCII, there are a lot of dots. You need to use the 4 hexadecimal colums: \x48 \x81 \xec....
I have done this on a linux system, but for FreeBSD you can do exactly the same - only the resulting machindecode will be different.

Resources