I'm writing a simple program that converts brainfuck code into x86_64 assembly. Part of that involves creating a large zero-initialized array at the beginning of the program. Thus, each compiled program starts with the following assembly code:
.data
ARR:
.space 32430
.text
.globl _start
.type _start, #function
_start:
... #code as compiled from the brainfuck program
...
From there the compiled program is supposed to be able to access any part of that array, but it should segfault if it tries to access memory before or after it.
Because the array is followed directly by a .text section, which by my understanding is read only, and because it is the first section of the program, I expected that my desired behavior would follow naturally. Unfortunately, this is not the case: compiled programs are able to access non-zero initialized data to the left of (that is, at lower addresses than) the beginning of the array.
Why is this the case and is there anything I can include in the assembly code that would prevent it?
This is, of course, highly system-dependent, but since your observations suit a typical Linux/GNU system, I'll refer to such a system.
what I assume is that the linker isn't putting my segments where I think it is.
True, the linker puts the segments not in the order they appear in your code snippet, but rather .text first, .data second. We can see this e. g. with
> objdump -h ARR
ARR: file format elf32-i386
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000042 08048074 08048074 00000074 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .data 00007eae 080490b8 080490b8 000000b8 2**2
CONTENTS, ALLOC, LOAD, DATA
compiled programs are able to access non-zero initialized data to the left of (that is, at lower addresses than) the beginning of the array.
Why is this the case …
As we also see in the above example, the .data section is linked at memory address 080490b8. Although memory pages have the length PAGE_SIZE (here getconf PAGE_SIZE yields 4096, i. e. 100016) and start at multiples of that size, the data starts at an address offset equal to the file offset 000000b8 (where the data is stored in the disk file), because the file pages containing the .data section are mapped into memory as copy-on-write pages. The non-zero initialized data below the .data section is just what happens to be in the first file page at bytes 0 to b716, including .text.
… is there anything I can include in the assembly code that would prevent it?
I'd prefer a solution that places my segments such that a bad array access causes a segfault.
As Margaret Bloom and Ped7g hinted at, you could allocate additional data below ARR and create an inaccessible guard page. This can be achieved with minimal effort by aligning ARR to the next page address. The example program below implements this and allows to test it by accepting an index argument (optionally negative) with which the ARR data is accessed; if within bounds, it should exit with status 0, otherwise segfault. Note: This method works only if the .text section does not end at a page boundary, because if it does, the .align 4096 is without effect; but since the assembly code is created with a converter program, that program should be able to check this and add a few extra .text bytes if needed.
.data
.align 4096
ARR:
.space 30000 # we'll actually get 32768
.text
.globl _start
.type _start, #function
_start:
mov (%esp),%ebx # argc
cmp $1,%ebx
jbe 9f
mov $0,%ax
mov $1,%ebx # sign 1
mov 8(%esp),%esi # argv[1]
0: movb (%esi),%cl # convert argument string to integer
jcxz 1f
sub $'0',%cl
js 2f
mov $10,%dx
mul %dx
add %cx,%ax
jmp 3f
2: neg %ebx # change sign
3: add $1,%esi
jmp 0b
1: mul %ebx # multiply with sign 1 or -1
movzx ARR(%eax),%ebx# load ARR[atoi(argv[1])]
9: mov $1,%eax
int $128 # _exit(ebx);
Related
I'm just trying to load the value of myarray[0] to eax:
.text
.data
# define an array of 3 words
array_words: .word 1, 2, 3
.globl main
main:
# assign array_words[0] to eax
mov $0, %edi
lea array_words(,%edi,4), %eax
But when I run this, I keep getting seg fault.
Could someone please point out what I did wrong here?
It seems the label main is in the .data section.
It leads to a segmentation fault on systems that doesn't allow to execute code in the .data section. (Most modern systems map .data with read + write but not exec permission.)
Program code should be in the .text section. (Read + exec)
Surprisingly, on GNU/Linux systems, hand-written asm often results in an executable .data unless you're careful to avoid that, so this is often not the real problem: See Why data and stack segments are executable? But putting code in .text where it belongs can make some debugging tools work better.
Also you need to ret from main or call exit (or make an _exit system call) so execution doesn't fall off the end of main into whatever bytes come next. See What happens if there is no exit system call in an assembly program?
You need to properly terminate your program, e.g. on Linux x86_64 by calling the sys_exit system call:
...
main:
# assign array_words[0] to eax
mov $0, %edi
lea array_words(,%edi,4), %eax
mov $60, %rax # System-call "sys_exit"
mov $0, %rdi # exit code 0
syscall
Otherwise program execution continues with the memory contents following your last instruction, which are most likely in all cases invalid instructions (or even invalid memory locations).
How exactly do you make multiple global arrays of bytes and multiple dwords in x86? Also, how do you initialize them all to 0? Would this be done in _start: or in section .data? The assembler to be used is NASM.
Nasm syntax - 100 bytes / dwords:
section .bss
byte_array resb 100
dword_array resd 100
; or...
section .data
byte_array_2 times 100 db 0
dword_array_2 times 100 dd 0
global _start ; ask Nasm to tell linker about this
section .text
_start:
; do something intelligent...
section .bss is nominally "uninitialized", but is in fact initialized to zero on any reasonable platform. Fasm uses rb and rd instead of resb and resd. You didn't say "what assembler".
static int func_name (const uint8_t * address)
{
int result;
asm ("movl $1f, %0; movzbl %1, %0; 1:"
: "=&a" (result) : "m" (*address));
return result;
}
I have gone through inline assembly references over internet.
But i am unable to figure out what this code is doing, eg. what is $1f ?
And what does "m" means? Isn't the normal inline convention to use "=r" and "r" ?
The code is functionally identical to return *address but not absolutely equivalent to this wrt. to the generated binary / object file.
In ELF, the usage of the forward reference (i.e. the mov $1f, ... to retrieve the address of the assembly local label) results in the creation of what's called a relocation. A relocation is an instruction to the linker (either at executable creation or later to the dynamic linker at executable/library loading) to insert a value only known at link/load time. In the object code, this looks like:
Disassembly of section .text:
0000000000000000 :
0: b8 00 00 00 00 mov $0x0,%eax
5: 0f b6 07 movzbl (%rdi),%eax
8: c3 retq
Notice the value (at offset 1 into the .text section) is zero here even though that's actually not correct - it depends on where in the running code the function will end up. Only the (dynamic) linker can ultimately know this, and the information that this piece of memory needs to be updated as it is loaded is actually placed into the object file:
$ readelf -a xqf.o
ELF Header:
[ ... ]
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .text PROGBITS 0000000000000000 00000040
0000000000000009 0000000000000000 AX 0 0 16
[ 2] .rela.text RELA 0000000000000000 000004e0
0000000000000018 0000000000000018 10 1 8
[ ... ]
Relocation section '.rela.text' at offset 0x4e0 contains 1 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000000001 00020000000a R_X86_64_32 0000000000000000 .text + 8
[ ... ]
This ELF section entry says:
look at offset 1 into the .text section
there's a 32bit value that will be zero-extended to 64-bit (R_X86_64_32). This may have been intended for use in 32-bit code, but in a 64-bit non-PIE executable that's still the most efficient way to put an address into a register; smaller than lea 1f(%rip), %0 for a R_X86_64_PC32 RIP-relative relocation. And yes a RIP-relative LEA into a 32-bit register is legal, and saves a byte of machine code if you don't care about truncating the address.
the value you (as the linker) need to put there is that of .text + 8 (which will have to be computed at link / load time)
This entry is created thanks to the mov $1f, %0 instruction. If you leave that out (or just write return *address), it won't be there.
I've forced code generation for the above by removing the static qualifier; without doing so, a simple compile actually creates no code at all (static code gets eliminated if not used, and, a lot of the time, inlined if used).
Due to the fact that the function is static, as said, it'll normally be inlined at the call site by the compiler. The information where it's used therefore usually gets lost, as does the ability of a debugger to instrument it. But the trick shown here can recover this (indirectly), because there will be one relocation entry created per use of the function. In addition to that, methods like this can be used to establish instrumentation points within the binary; insert well-known/strictly-defined but functionally-meaningless small assembly statements at locations recoverable through the object file format, and then let e.g. the debugger / tracing utilities replace them with "more useful" things when needed.
$1f is the address of the 1 label. The f specifies to look for the first label named 1 in the forward direction. "m" is an input operand that is in memory. "=&a" is an output operand that uses the eax register. a specifies the register to use, = makes it an output operand, and & guarantees that other operands will not share the same register.
Here, %0 will be replaced with the first operand (the eax register) and %1 by the second operand (The address pointed to by address).
All these and more are explained in the GCC documentation on Inline assembly and asm contraints.
This piece of code (apart from being non-compilable due to two typos) is hardly useful.
This is what it turns into (use the -S switch):
_func_name:
movl 4(%esp), %edx ; edx = the "address" parameter
movl $1f, %eax ; eax = the address of the "1" label
movzbl (%edx), %eax; eax = byte from address in edx, IOW, "*address"
1:
ret
So the entire body of the function can be replaced with just
return *address;
This is a code snippet from the PintOS project.
The function here is used by the OS kernel to read a byte at address from the user address space. That is done by movzbl %1, %0 where 0% is result and 1% is address. But before that, the kernel has to move the address of $1f(which is the address of the instruction right after movzbl %1, %0) to the eax register. This move seems useless because some context information is missing. The kernel does that for the page fault interrupt handler to use it. Because address could be an invalid one offered by the user, and it might cause a page fault. When that happened, the interrupt handler would take over, set eip equal to eax(which is the memory address of $1f), and also set eax to -1 to indicate that the read failed. After that, the kernel was able to return from the handler to $1f and move on. Without saving the address of $1f, the handler would have no idea where it should return to, and could only go back to movzbl %1, %0 again and again.
I want to overflow the array buffer[100] and I will be passing python script on bash shell on FreeBSD. I need machine code to pass as a string to overflow that buffer buffer[100] and make the program print its hostname to stdout.
Here is the code in C that I tried and gives the host name on the console. :
#include <stdio.h>
int main()
{
char buff[256];
gethostname(buff, sizeof(buff));
printf(""%s", buff);
return 0;
}
Here is the code in assembly that I got using gcc but is longer than I need becuase when I look for the machine code of the text section of the c program it is longer than 100 bytes and I need a machine code for the c program above that is less than 100 bytes.
.type main, #function
main:
pushl %ebp; saving the base pointer
movl %esp, %ebp; Taking a snapshot of the stack pointer
subl $264, %esp;
addl $-8, %esp
pushl $256
leal -256(%ebp), %eax
pushl %eax
call gethostname
addl $16, %esp
addl $-8, %esp
leal -256(%ebp), %eax
pushl %eax
pushl $.LCO
call printf
addl $16, %esp
xorl %eax, %eax
jmp .L6
.p2align 2, 0x90
.L6:
leave
ret
.Lfe1:
.size main, .Lfe1-main
.ident "GCC: (GNU) c 2.95.4 20020320 [FreeBSD]"
A person has already done it on another computer and he has given me the ready made machine code which is 37 bytes and he is passing it in the format below to the buffer using perl script. I tried his code and it works but he doesn't tell me how to do it.
“\x41\xc1\x30\x58\x6e\x61\x6d\x65\x23\x23\xc3\xbc\xa3\x83\xf4\x69\x36\xw3\xde\x4f\x2f\x5f\x2f\x39\x33\x60\x24\x32\xb4\xab\x21\xc1\x80\x24\xe0\xdb\xd0”
I know that he did it on a differnt machine so I can not get the same code but since we both are using exactly the same c function so the size of the machine code should be almost the same if not exactly the same. His machine code is 37 bytes which he will pass on shell to overflow the gets() function in a binary file on FreeBSD 2.95 to print the hostname on stdout. I want to do the same thing and I have tried his machine code and it works but he will not tell me how did he get this machine code. So I am concerned actually about the procedure of getting that code.
OK I tried the methods suggested in the posts here but just for the function gethostname() I got a 130 character of machine code. It did not include the printf() machine code. As I need to print the hostname to console so that should also be included but that will make the machine code longer. I have to fit the code in an array of 100 bytes so the code should be less than 100 bytes.
Can some one write assembly code for the c program above that converts into machine code that is less than 100 bytes?
To get the machine code, you need to compile the program then disassemble. Using gcc for example do something like this:
gcc -o hello hello.c
objdump -D hello
The dump will show the machine code in bytes and the disassembly of that machine code.
A simple example, that is related, you have to understand the difference between an object file and an executable file but this should still demonstrate what I mean:
unsigned int myfun ( unsigned int x )
{
return(x+5);
}
gcc -O2 -c -o hello.o hello.c
objdump -D hello.o
Disassembly of section .text:
00000000 <myfun>:
0: e2800005 add r0, r0, #5
4: e12fff1e bx lr
FreeBSD is an operating system, not a compiler or assembler.
You want to assemble the assembly source into machine code, so you should use an assembler.
You can typically use GCC, since it's smart enough to know that for a filename ending in .s, it should run the assembler.
If you already have the code in an object file, you can use objdump to read out the code segment of the file.
The 37 bytes posted are completely junk.
If run under any version of Windows ( windows 2000 or later ), I believe, that
the "outsb" and "insd" instructions (in an userland program) will cause a fault,
because userland programs are not allowed directly doing port -level I/O.
Since machine code will not end in "vacuum", I added some \x90 -bytes (again NOP) after the posted code. That merely affects the argument of the last rcl -instruction (which in the given code ends prematurely; eg the code posted is not only rubbish, but also ends prematurely).
But, microprocessors do not have their own intelligence, so they will (try to) execute whatever junk code you feed them. And, the code starts with "inc ecx", a stupid move since we do not know what value the ecx had before. Also "shl dword ptr [eax],$58" is a "good"
way to randomly corrupt memory (since value if eax is also unknown).
And, one of them is NOT even valid byte (should be represented as two hexadecimal digits).
The invalid "byte" is \xw3.
I replaced that invalid byte as \x90 ( a NOP, if it is at start of instruction), and got:
00451B51 41 inc ecx
00451B52 C13058 shl dword ptr [eax],$58
00451B55 6E outsb
00451B56 61 popad
00451B57 6D insd
00451B58 652323 and esp,gs:[ebx]
00451B5B C3 ret
// code below is NEVER executed, since the line above does a RET.
00451B5C BCA383F469 mov esp,$69f483a3
00451B61 3690 nop // 36, w3 ????
00451B63 DE4F2F fimul word ptr [edi+$2f]
00451B66 5F pop edi
00451B67 2F das
00451B68 3933 cmp [ebx],esi
00451B6A 60 pushad
00451B6B 2432 and al,$32
00451B6D B4AB mov ah,$ab
00451B6F 21C1 and ecx,eax
00451B71 8024E0DB and byte ptr [eax],$db
00451B75 D09090909090 rcl [eax-$6f6f6f70],1
You get a nice hexdump of the text section of your object file with objdump -s -j .text.
Edited some more details:
You need to find out what the address of the function in your object code is. This is what objdump -t is for. In this case I am looking for the function main in a program "hello".
> objdump -t hello|grep main
> 0000000000400410 g F .text 000000000000002f main
Now I create a hexdump with objdump -s -j .text hello:
400410 4881ec08 010000be 00010000 31c04889 H...........1.H.
400420 e7e8daff ffff4889 e6bff405 400031c0 ......H.....#.1.
400430 e8abffff ff31c048 81c40801 0000c390 .....1.H........
400440 31ed4989 d15e4889 e24883e4 f0505449 1.I..^H..H...PTI
400450 c7c0e005 400048c7 c1500540 0048c7c7 ....#.H..P.#.H..
...
The first row are the addresses. It starts with 400410, the address of the main function, but this may not always be the case. The following 4 rows are 16 bytes of machinecode in hex, the last row are the same 16 bytes of machine code in ASCII. Because a lot of bytes have no representation in ASCII, there are a lot of dots. You need to use the 4 hexadecimal colums: \x48 \x81 \xec....
I have done this on a linux system, but for FreeBSD you can do exactly the same - only the resulting machindecode will be different.
I have written the following code, can you explain me what does the assembly tell here.
typedef struct
{
int abcd[5];
} hh;
void main()
{
printf("%d", ((hh*)0)+1);
}
Assembly:
.file "aa.c"
.section ".rodata"
.align 8
.LLC0:
.asciz "%d\n"
.section ".text"
.align 4
.global main
.type main, #function
.proc 020
main:
save %sp, -112, %sp
sethi %hi(.LLC0), %g1
or %g1, %lo(.LLC0), %o0
mov 20, %o1
call printf, 0
nop
return %i7+8
nop
.size main, .-main
.ident "GCC: (GNU) 4.2.1"
Oh wow, SPARC assembly language, I haven't seen that in years.
I guess we go line by line? I'm going to skip some of the uninteresting boilerplate.
.section ".rodata"
.align 8
.LLC0:
.asciz "%d\n"
This is the string constant you used in printf (so obvious, I know!) The important things to notice are that it's in the .rodata section (sections are divisions of the eventual executable image; this one is for "read-only data" and will in fact be immutable at runtime) and that it's been given the label .LLC0. Labels that begin with a dot are private to the object file. Later, the compiler will refer to that label when it wants to load the address of the string constant.
.section ".text"
.align 4
.global main
.type main, #function
.proc 020
main:
.text is the section for actual machine code. This is the boilerplate header for defining the global function named main, which at the assembly level is no different from any other function (in C -- not necessarily so in C++). I don't remember what .proc 020 does.
save %sp, -112, %sp
Save the previous register window and adjust the stack pointer downward. If you don't know what a register window is, you need to read the architecture manual: http://sparc.org/wp-content/uploads/2014/01/v8.pdf.gz. (V8 is the last 32-bit iteration of SPARC, V9 is the first 64-bit one. This appears to be 32-bit code.)
sethi %hi(.LLC0), %g1
or %g1, %lo(.LLC0), %o0
This two-instruction sequence has the net effect of loading the address .LLC0 (that's your string constant) into register %o0, which is the first outgoing argument register. (The arguments to this function are in the incoming argument registers.)
mov 20, %o1
Load the immediate constant 100 into %o1, the second outgoing argument register. This is the value computed by ((foo *)0)+1. It's 20 because your struct foo is 20 bytes long (five 4-byte ints) and you asked for the second one within the array starting at address zero.
Incidentally, computing an offset from a pointer is only well-defined in C when there is actually a sufficiently large array at the address of the base pointer; ((foo *)0) is a null pointer, so there isn't an array there, so the expression ((foo *)0)+1 technically has undefined behavior. GCC 4.2.1, targeting hosted SPARC, happens to have interpreted it as "pretend there is an arbitrarily large array of foos at address zero and compute the expected offset for array member 1", but other (especially newer) compilers may do something completely different.
call printf, 0
nop
Call printf. I don't remember what the zero is for. The call instruction has a delay slot (again, read the architecture manual) which is filled in with a do-nothing instruction, nop.
return %i7+8
nop
Jump to the address in register %i7 plus eight. This has the effect of returning from the current function.
return also has a delay slot, which is filled in with another nop. There is supposed to be a restore instruction in this delay slot, matching the save at the top of the function, so that main's caller gets its register window back. I don't know why it's not there. Discussion in the comments talks about main possibly not needing to pop the register window, and/or your having declared main as void main() (which is not guaranteed to work with any C implementation, unless its documentation specifically says so, and is always bad style) ... but pushing and not popping the register window is such a troublesome thing to do on a SPARC that I don't find either explanation convincing. I might even call it a compiler bug.
The assembly calls printf, passing your text buffer and the number 20 on the stack (which is what you asked for in a roundabout way).