Assembly works, but shellcode does not

I have an x64 processor and I'm looking into shellcode.
I have the following code:
section .text
global _start
_start:
push rax
mov rbx, 0x68732f6e69622f2f
shr rbx, 0x8
push rbx
mov rdi, rsp
;mov rdi, com
mov al, 59
syscall
When compiled with the following command:
nasm -g -f elf64 execve.asm
And linked with:
ld execve.o -o execve
It runs fine and opens a shell. If I extract the shellcode from it:
"\x50\x48\xbb\x2f\x2f\x62\x69\x2f\x73\x68\x48\xc1\xeb\x08\x53\x48\x89\xe7\xb0\x3b\x0f\x05"
And use this C program:
int main(void)
{
const char shellcode[] = "\x50\x48\xbb\x2f\x2f\x62\x69\x2f\x73\x68\x48\xc1\xeb\x08\x53\x48\x89\xe7\xb0\x3b\x0f\x05";
(*(void (*)()) shellcode)();
return 0;
}
Compile it:
gcc -fno-stack-protector -z execstack -o ex2 ex.c
When run, it crashes with a segmentation fault, but it should execute a shell. Why?
Help is appreciated!

Your standalone assembly depends on registers rsi and rdx being zeroed.
(As you can see from http://blog.rchapman.org/posts/Linux_System_Call_Table_for_x86_64/,
sys_execve takes three arguments in registers rdi, rsi, and rdx, and Linux accepts the call if rdi points to a valid filename and rsi and rdx are zero.)
This may be the case when the program starts, but it is not when you run the instructions from the
middle of a program.
Consequently, if rsi and rdx have garbage in them, the syscall fails and execution
continues into whatever bytes follow the syscall instruction, eventually leading to a crash. (This is indeed what's happening in your case, as you can see if you run the program through gdb.)
The simplest way to make the syscall succeed is to zero out the second and third arguments:
section .text
global _start
_start:
xor rax, rax
xor rsi, rsi ; zero 2nd argument
xor rdx, rdx ; zero 3rd argument
push rax
mov rbx, 0x68732f6e69622f2f
shr rbx, 0x8
push rbx
mov rdi, rsp
mov al, 59
syscall
Corresponding C code:
int main(void)
{
char shellcode[] =
"\x48\x31\xc0\x48\x31\xd2\x48\x31\xf6\x50\x48\xbb\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x48\xc1\xeb\x08\x53\x48\x89\xe7\xb0\x3b\x0f\x05"
;
((void (*)()) shellcode)();
return 0;
}
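To see why the listing loads 0x68732f6e69622f2f and then shifts right by 8, it can help to replay the arithmetic in C. This is only a sketch: store_le64 and rbx_after_shift are hypothetical helper names, and the byte order is modeled explicitly (least significant byte first, as on x86-64) instead of being taken from the host machine.

```c
#include <stddef.h>
#include <stdint.h>

/* Lay out a 64-bit value the way `push rbx` stores it on x86-64:
   least significant byte at the lowest address. out[8] is set to
   '\0' so the result can be read as a C string. */
void store_le64(uint64_t value, char out[9])
{
    for (size_t i = 0; i < 8; i++)
        out[i] = (char)((value >> (8 * i)) & 0xff);
    out[8] = '\0';
}

/* Value of rbx after `mov rbx, 0x68732f6e69622f2f` and `shr rbx, 0x8`. */
uint64_t rbx_after_shift(void)
{
    return UINT64_C(0x68732f6e69622f2f) >> 8;
}
```

Storing the unshifted constant gives "//bin/sh". After the shift the bytes read "/bin/sh" followed by a zero byte, so the string is terminated without a 0x00 appearing in the encoded instructions.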

You left out the \x6e.
This causes segv because the exec fails and there isn't a return after the syscall.
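The missing byte can also be spotted mechanically. The sketch below is a plain pattern match, not a disassembler: it assumes the two byte strings quoted in this thread and counts how many bytes sit between the `mov rbx, imm64` opcode (48 bb) and the following `shr rbx, 8` (48 c1 eb 08). The function names are made up for illustration.

```c
#include <stddef.h>

/* Byte strings copied from the question and from the corrected answer. */
static const unsigned char from_question[] =
    "\x50\x48\xbb\x2f\x2f\x62\x69\x2f\x73\x68\x48\xc1\xeb\x08\x53\x48\x89\xe7\xb0\x3b\x0f\x05";
static const unsigned char from_answer[] =
    "\x48\x31\xc0\x48\x31\xd2\x48\x31\xf6\x50\x48\xbb\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x48\xc1\xeb\x08\x53\x48\x89\xe7\xb0\x3b\x0f\x05";

/* Count the bytes between the first `48 bb` and the next `48 c1 eb 08`.
   A complete 64-bit immediate needs exactly 8. Returns -1 if either
   pattern is missing. */
long imm64_length(const unsigned char *code, size_t len)
{
    for (size_t i = 0; i + 1 < len; i++) {
        if (code[i] != 0x48 || code[i + 1] != 0xbb)
            continue;
        for (size_t j = i + 2; j + 3 < len; j++) {
            if (code[j] == 0x48 && code[j + 1] == 0xc1 &&
                code[j + 2] == 0xeb && code[j + 3] == 0x08)
                return (long)(j - (i + 2));
        }
        return -1;
    }
    return -1;
}

/* sizeof counts the terminating NUL of the literal, so subtract 1. */
long question_imm64_length(void)
{
    return imm64_length(from_question, sizeof from_question - 1);
}

long answer_imm64_length(void)
{
    return imm64_length(from_answer, sizeof from_answer - 1);
}
```

The question's string yields 7 and the answer's yields 8, so with the question's bytes the CPU takes the first byte of the shr instruction as the last byte of the immediate and decodes everything after it differently.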

Related

Compiling C to 32-bit assembly with GCC doesn't match a book

I have been trying to compile this C program to assembly, but I am not getting the output shown in
Dennis Yurichev's Reverse Engineering for Beginners. It's a simple hello world program, and I am trying to get the 32-bit output.
#include <stdio.h>
int main()
{
printf("hello, world\n");
return 0;
}
Here is what the book says the output should be
main proc near
var_10 = dword ptr -10h
push ebp
mov ebp, esp
and esp, 0FFFFFFF0h
sub esp, 10h
mov eax, offset aHelloWorld ; "hello, world\n"
mov [esp+10h+var_10], eax
call _printf
mov eax, 0
leave
retn
main endp
Here are the steps:
Compile the program as 32-bit (I am currently on a 64-bit PC):
gcc -m32 hello_world.c -o hello_world
Use gdb to disassemble
gdb file
set disassembly-flavor intel
set architecture i386:intel
disassemble main
And I get:
lea ecx,[esp+0x4]
and esp,0xfffffff0
push DWORD PTR [ecx-0x4]
push ebp
mov ebp,esp
push ebx
push ecx
call 0x565561d5 <__x86.get_pc_thunk.ax>
add eax,0x2e53
sub esp,0xc
lea edx,[eax-0x1ff8]
push edx
mov ebx,eax
call 0x56556030 <puts@plt>
add esp,0x10
mov eax,0x0
lea esp,[ebp-0x8]
pop ecx
pop ebx
pop ebp
lea esp,[ecx-0x4]
ret
I have also used
objdump -D -M i386,intel hello_world> hello_world.txt
ndisasm -b32 hello_world > hello_world.txt
But neither of those gives the book's output either. I just can't figure out what's wrong. I need some help. Looking at you, Peter Cordes ^^
The output from the book looks like MSVC, not GCC. GCC will definitely not ever emit main proc because that's MASM syntax, not valid GAS syntax. And it won't do stuff like var_10 = dword ptr -10h.
(And even if it did, you wouldn't see assemble-time constant definitions in disassembly, only in the compiler's asm output, which is what the book suggested you look at: gcc -S -masm=intel output. See "How to remove "noise" from GCC/clang assembly output?")
So there are lots of differences because you're using a different compiler. Even modern versions of MSVC (on the Godbolt compiler explorer) make somewhat different asm, for example not bothering to align ESP by 16, perhaps because more modern Windows versions, or CRT startup code, already does that?
Also, your GCC is making PIE executables by default, so use -fno-pie -no-pie. 32-bit PIE sucks for efficiency and for ease of understanding. See "How do i get rid of call __x86.get_pc_thunk.ax". (Also see "32-bit absolute addresses no longer allowed in x86-64 Linux?" for more about PIE executables, mostly focused on 64-bit code.)
The extra clunky stack-alignment in main's prologue is something that GCC8 optimized for functions that don't also need alloca. But it seems even current GCC10 emits the full un-optimized version when you don't enable optimization :(.
See "Why is gcc generating an extra return address?" and "Trying to understand gcc's complicated stack-alignment at the top of main that copies the return address".
Optimizing printf to puts: see "How to get the gcc compiler to not optimize a standard library function call like printf?" and "-O2 optimizes printf("%s\n", str) to puts(str)". gcc -fno-builtin-printf would be one way to make that not happen, or just get used to it. GCC does a few optimizations even at -O0 that other compilers only do at higher optimization levels.
MSVC 19.10 compiles your function like this (on the Godbolt compiler explorer) with optimization disabled (the default, no compiler options).
_main PROC
push ebp
mov ebp, esp
push OFFSET $SG4501
call _printf
add esp, 4
xor eax, eax
pop ebp
ret 0
_main ENDP
_DATA SEGMENT
$SG4501 DB 'hello, world', 0aH, 00H
GCC10.2 still uses an over-complicated stack alignment dance in the prologue.
.LC0:
.string "hello, world"
main:
lea ecx, [esp+4]
and esp, -16
push DWORD PTR [ecx-4]
push ebp
mov ebp, esp
push ecx
sub esp, 4
# end of function prologue, I think.
sub esp, 12 # make sure arg will be 16-byte aligned
push OFFSET FLAT:.LC0 # push a pointer
call puts
add esp, 16 # pop the arg-passing space
mov eax, 0 # return 0
mov ecx, DWORD PTR [ebp-4] # undo stack alignment.
leave
lea esp, [ecx-4]
ret
Yes, this is super inefficient. If you called your function anything other than main, it would already assume ESP was aligned by 16 on function entry:
# GCC10.2 -m32 -O0
.LC0:
.string "hello, world"
foo:
push ebp
mov ebp, esp
sub esp, 8 # reach a 16-byte boundary, assuming ESP%16 = 12 on entry
#
sub esp, 12
push OFFSET FLAT:.LC0
call puts
add esp, 16
mov eax, 0
leave
ret
So it still doesn't combine the two sub instructions, but you did tell it not to optimize so braindead code is expected. See Why does clang produce inefficient asm with -O0 (for this simple floating point sum)? for example.
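The alignment arithmetic in these listings can be checked in C. This is a sketch under stated assumptions: the function names are hypothetical, and the entry value of ESP used in any call is made up, chosen only so that ESP % 16 == 12 as the comment in the foo listing assumes.

```c
#include <stdint.h>

/* `and esp, -16` (the same mask as `and esp, 0FFFFFFF0h`):
   clear the low four bits, rounding down to a 16-byte boundary. */
uint32_t align_down_16(uint32_t esp)
{
    return esp & ~UINT32_C(15);
}

/* Follow ESP through foo from function entry up to `call puts`. */
uint32_t esp_at_call_in_foo(uint32_t esp_on_entry)
{
    uint32_t esp = esp_on_entry;
    esp -= 4;   /* push ebp */
    esp -= 8;   /* sub esp, 8 */
    esp -= 12;  /* sub esp, 12 */
    esp -= 4;   /* push OFFSET FLAT:.LC0 */
    return esp;
}
```

The four adjustments add up to 28 bytes, so an entry value with ESP % 16 == 12 reaches the call with ESP % 16 == 0, which is why foo needs no `and` instruction.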
My GCC very eagerly replaces a call to printf with puts! I did not manage to find command-line options that stop the compiler from doing this. That is, the program has the same external behaviour, but the machine code is that of
#include <stdio.h>
int main(void)
{
puts("hello, world");
}
Thus, you'll have a really hard time trying to get the exact same assembly as in the book, as the assembly from that book has a call to printf instead of puts!
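The substitution is legal because both calls produce the same bytes. A rough way to convince yourself, assuming it is acceptable to model puts as "the string followed by a newline" (here with a "%s\n" format into a buffer, so nothing is printed); the function names are hypothetical:

```c
#include <stdio.h>

/* The text printf("hello, world\n") would produce, rendered into buf. */
int render_like_printf(char *buf, size_t cap)
{
    return snprintf(buf, cap, "hello, world\n");
}

/* The text puts("hello, world") would produce: puts appends the
   newline itself, which is modeled here by the "%s\n" format. */
int render_like_puts(char *buf, size_t cap)
{
    return snprintf(buf, cap, "%s\n", "hello, world");
}
```

Both functions fill the buffer with the same 13 characters, which is the equivalence the compiler relies on when the format string has no conversions and ends in a newline.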
First of all, you are compiling, not decompiling.
You get a lot of noise because you compile without optimizations. If you compile with optimizations, you get much smaller code, almost identical to what you have (to prevent the change from printf to puts, you need to remove the '\n': https://godbolt.org/z/cs4qe9):
.LC0:
.string "hello, world"
main:
lea ecx, [esp+4]
and esp, -16
push DWORD PTR [ecx-4]
push ebp
mov ebp, esp
push ecx
sub esp, 16
push OFFSET FLAT:.LC0
call puts
mov ecx, DWORD PTR [ebp-4]
add esp, 16
xor eax, eax
leave
lea esp, [ecx-4]
ret
https://godbolt.org/z/xMqo33

NASM, 64-bit Linux: segmentation fault (core dumped)

This is foo.asm:
extern choose;
[section .data]
num1st dq 3
num2nd dq 4
[section .text]
global main
global myprint
main:
push qword [num2nd]
push qword [num1st]
call choose
add esp,8
mov ebx,0
mov eax,1
int 0x80
; pop qword [num1st]
; pop qword [num2nd]
myprint:
mov edx,[esp+8]
mov ecx,[esp+4]
mov ebx,1
mov eax,4
int 0x80
; pop qword [num1st]
; pop qword [num2nd]
ret
It is a mixed C and assembly program.
This is bar.c:
void myprint(char * msg ,int len);
int choose(int a,int b)
{
if (a>=b){
myprint("the 1st one\n",13);}
else {
myprint("the 2nd one\n",13);}
return 0;
}
nasm -f elf64 foo.asm
gcc -c bar.c
gcc -s -o foobar bar.o foo.o
Running ./foobar prints: segmentation fault (core dumped).
I tried to debug it with gdb, but it reports missing debuginfo (debuginfo-install), which I am trying to install.
Maybe the problem has something to do with the x86_64 architecture.
I looked at "Segmentation fault when pushing on stack (NASM)" and added some pop instructions, but that did not help.
Arguments are not passed on the stack in 64-bit mode, unless you have more than 6 of them. The first two arguments will be in RDI and RSI.
There's also a difference in how you should use system calls in 64-bit mode. The syscall number and arguments should be placed in the following registers:
syscall nr  rax
arg 1       rdi
arg 2       rsi
arg 3       rdx
arg 4       r10
arg 5       r8
arg 6       r9
And the sys_write syscall number in 64-bit mode is 1, not 4. Also, instead of int 0x80 you should use syscall. Performing syscalls with int 0x80 might work in 64-bit mode depending on how your kernel has been configured, but you still need to consider how function arguments are passed.
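From C, the same 64-bit system call can be made through the libc syscall() wrapper, which places the number and arguments in the registers listed above. A minimal sketch, assuming Linux with glibc; raw_write and raw_write_roundtrip are hypothetical names, and the pipe is there only so the result can be checked without touching the terminal.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* write(2) issued through the generic syscall() wrapper. On x86-64
   SYS_write is 1; the macro keeps the code correct elsewhere too. */
long raw_write(int fd, const void *buf, size_t len)
{
    return syscall(SYS_write, fd, buf, len);
}

/* Send a message through a pipe and read it back.
   Returns 1 if the bytes arrive unchanged, 0 otherwise. */
int raw_write_roundtrip(void)
{
    static const char msg[] = "the 1st one\n";
    const size_t len = sizeof msg - 1;   /* exclude the terminating NUL */
    char back[sizeof msg] = {0};
    int fds[2];
    int ok = 0;

    if (pipe(fds) != 0)
        return 0;
    if (raw_write(fds[1], msg, len) == (long)len) {
        ssize_t got = read(fds[0], back, len);
        ok = got == (ssize_t)len && memcmp(msg, back, len) == 0;
    }
    close(fds[0]);
    close(fds[1]);
    return ok;
}
```

Calling raw_write(1, msg, len) would print to standard output, which is what myprint in the question is trying to do with int 0x80.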

Ret illegal instruction

I'm working on a project that implements a function in assembly that is called from main.c. The function's declaration in C is void strrev(char *str);. The ret instruction is giving me an illegal instruction error. Why? This is my first time doing this.
Trying to only post the relevant code:
SECTION .text
global strrev
strrev:
push ebp
mov ebp, esp
push esi
push edi
push ebx
; doing things with al, bl, ecx, edi, and esi registers here
; restore registers and return
mov esp, ebp
pop ebx
pop edi
pop esi
pop ebp
ret
Error:
(gdb)
Program received signal SIGILL, Illegal instruction.
0xbffff49a in ?? ()
Compiling and linking this way:
nasm -f elf -g strrepl.asm
nasm -f elf -g strrev.asm
gcc -Wall -g -c main7.c
gcc -Wall -g strrepl.o strrev.o main7.o
mov esp, ebp changes esp to point to where it was when mov ebp, esp was executed. That was before you pushed esi, edi, and ebx onto the stack, so you can no longer pop them. Since you do, the stack is wrong, and the ret does not work as desired.
You can likely delete the mov esp, ebp instruction. Restoring the stack pointer like that is needed only if you have variable changes to the stack pointer in the routine (e.g., to move the stack to a desired alignment or to make space for a variable-length array). If your stack is handled simply, then you merely pop in reverse order of what you push. If you do have variable changes to the stack, then you need to restore the pointer to a different location, not the ebp you have saved, so that you can pop ebx, edi, and esi.
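The effect of the stray mov esp, ebp can be replayed with a toy stack in C. This is a simulation under invented values, not a trace of the real program: the addresses, the caller's two locals, and the names Stack and strrev_return_target are all made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define RETURN_ADDRESS UINT32_C(0x08048123) /* made-up address in the caller */

enum { STACK_WORDS = 16 };

typedef struct {
    uint32_t mem[STACK_WORDS]; /* one slot per 32-bit word */
    size_t sp;                 /* index of the top slot; grows toward 0 */
} Stack;

static void push(Stack *s, uint32_t v) { s->mem[--s->sp] = v; }
static uint32_t pop(Stack *s) { return s->mem[s->sp++]; }

/* Returns the value that `ret` pops, i.e. where execution continues.
   Pass 1 to keep the `mov esp, ebp` from the question, 0 to drop it. */
uint32_t strrev_return_target(int keep_mov_esp_ebp)
{
    Stack s = { {0}, STACK_WORDS };
    size_t ebp;

    /* Caller: two locals, the argument, then `call strrev`. */
    push(&s, UINT32_C(0x11111111));
    push(&s, UINT32_C(0x22222222));
    push(&s, UINT32_C(0x33333333));   /* char *str */
    push(&s, RETURN_ADDRESS);

    /* Prologue. */
    push(&s, UINT32_C(0x44444444));   /* push ebp (caller's ebp) */
    ebp = s.sp;                       /* mov ebp, esp */
    push(&s, UINT32_C(0x55555555));   /* push esi */
    push(&s, UINT32_C(0x66666666));   /* push edi */
    push(&s, UINT32_C(0x77777777));   /* push ebx */

    /* Epilogue. */
    if (keep_mov_esp_ebp)
        s.sp = ebp;                   /* mov esp, ebp */
    (void)pop(&s);                    /* pop ebx */
    (void)pop(&s);                    /* pop edi */
    (void)pop(&s);                    /* pop esi */
    (void)pop(&s);                    /* pop ebp */
    return pop(&s);                   /* ret */
}
```

Without the mov, ret pops RETURN_ADDRESS. With it, the four pops consume the saved ebp, the return address, and two words of the caller's frame, so ret jumps to whatever caller data comes next (0x11111111 in this model).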

Segfault while calling C function (printf) from Assembly

I am using NASM on Linux to write a basic assembly program that calls a function from the C library (printf). Unfortunately, I get a segmentation fault when doing so. Commenting out the call to printf allows the program to run without error.
; Build using these commands:
; nasm -f elf64 -g -F stabs <filename>.asm
; gcc <filename>.o -o <filename>
;
SECTION .bss ; Section containing uninitialized data
SECTION .data ; Section containing initialized data
text db "hello world",10 ;
SECTION .text ; Section containing code
global main
extern printf
;-------------
;MAIN PROGRAM BEGINS HERE
;-------------
main:
push rbp
mov rbp,rsp
push rbx
push rsi
push rdi ;preserve registers
****************
;code i wish to execute
push text ;pushing address of text on to the stack
;x86-64 uses registers for first 6 args, thus should have been:
;mov rdi,text (place address of text in rdi)
;mov rax,0 (place a terminating byte at end of rdi)
call printf ;calling printf from c-libraries
add rsp,8 ;reseting the stack to pre "push text"
**************
pop rdi ;restore registers
pop rsi
pop rbx
mov rsp,rbp
pop rbp
ret
x86_64 does not use the stack for the first 6 args. You need to load them in the proper registers. Those are:
rdi, rsi, rdx, rcx, r8, r9
The trick I use to remember the first two is to imagine the function is memcpy implemented as rep movsb: the destination goes in rdi and the source in rsi.
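You're calling a varargs function: printf expects a variable number of arguments, and you have to account for that when you set up the call. In the x86-64 System V convention this means setting al to the number of vector registers used for arguments (0 here). See here: http://www.csee.umbc.edu/portal/help/nasm/sample.shtml#printf1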

C calling conventions in assembly (64 bit) - scanf

I have some assembly code that uses scanf and printf and I'm running into some problems. When both of these functions are used in the same code, the values in the registers seem to be lost. The program basically loads a number and prints it out. We run it using
nasm -f elf64 file.asm && gcc -o file file.o && ./file
on linux
Here's our code:
extern printf
extern scanf
section .data
a db "set: ", 0
b db "not set: ", 0
reading db "Please enter a number: ", 0
message db "\n", 0
printsent db "%s", 10, 0
printint db "%d", 10, 0
printchar db "%c", 10, 0
readInt db "%d", 0
input db "%d", 0
section .text
global main
main:
hatta:
push rbp,
mov rbp, rsp,
push rbx,
xor rax, rax,
mov rdi, printsent,
mov rsi, reading
call printf,
pop rbx,
xor rax, rax,
mov rdi, readInt,
call scanf,
mov rbx, rdi
push rbx,
xor rax, rax,
mov rdi, printint,
mov rsi, rbx,
call printf,
pop rbx,
pop rbp,
ret
The odd thing is that if the line mov rdi, printint, is removed, we obtain the correct values. However, if we do the same thing with printsent, we get a segmentation fault. Could anyone tell us the reason for this?
Thanks!
There are two errors in your scanf usage, possibly based on one false assumption: you seem to assume that scanf returns the loaded number in rdi and that no further argument is needed with the format "%d". In truth the number (if scanned successfully) is stored in the memory pointed to by the second argument. Thus, the following changes make the code work (original on the left, fixed version on the right).
pop rbx,            |  ;delete
                    =
xor rax, rax,       =  xor rax, rax,
mov rdi, readInt,   =  mov rdi, readInt,
                    >  mov rsi, rsp
call scanf,         =  call scanf,
mov rbx, rdi        |  pop rbx,
Concerning "if the line mov rdi, printint, is removed, we obtain the correct values": I doubt that.
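The same contract is easy to see from C. In this sketch sscanf stands in for scanf so that no terminal input is needed, and read_int is a hypothetical wrapper name: the parsed number is written through the pointer passed as the argument after the format, and the function's own return value is only the count of successful conversions.

```c
#include <stdio.h>

/* Parse a decimal integer from text. The number is stored through
   `out`; the return value is the number of conversions that
   succeeded (1 on success, 0 if the text is not a number). */
int read_int(const char *text, int *out)
{
    return sscanf(text, "%d", out);
}
```

In the fixed assembly, mov rsi, rsp plays the role of `out`: it hands scanf the address of a stack slot, and the following pop rbx loads the stored number from that slot.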
I don't understand why you have the C flag here, there is no C code involved, but to your question:
As far as I remember the calling convention in Linux glibc x64 for printf(format, argument) is format in rdi, argument in rsi.
If you remove mov rdi, printsent, then you are calling printf(undefined,"Please enter a number: "). You did not provide a format argument in rdi, but printf
doesn't know that and uses whatever is in rdi at that moment. Most likely an invalid memory address which thus invokes a SIGSEGV.
The calling convention does not require a function to preserve its argument registers (rdi and rsi are caller-saved in the x86-64 System V ABI), although a library function may happen to leave them intact or restore them before returning.
As such, if scanf(readInt, ...) happens to return with rdi still pointing to readInt, which has the same contents as printint, then removing the line mov rdi, printint, appears to have no effect because rdi contains a valid pointer to the format you need. That is not something to rely on.
