Return-To-Libc Function Address Probing - c

I'm trying to implement a return-to-libc buffer overflow attack by finding the address of system() with gdb and returning to said address with /bin/sh passed as an argument to system() on the stack. The only problem is, I can't find the address in memory where system() lives, as running print system in gdb returns "No symbol table is loaded. Use the "file" command.", which isn't especially helpful, as loading libc.so into gdb doesn't do me any good. Is there a way I can find the addresses of functions in libc which I have not included via headers?
For reference, the code I'm testing this with is below, DEP is enabled, and ASLR is disabled.
#include <stdio.h>
#include <string.h>
void foo(char *arg) {
char buf[100];
strcpy(buf, arg);
}
int main(int argc, char **argv) {
foo(argv[1]);
return 0;
}

Related

execve() has a weird behavior depending on the state of an unused variable

I've just recently been exploring the execve() system function. This code might not make much sense but that's not the main focus of this question. (I've managed to make it work correctly since, using this thread).
I've come across a really weird behavior and wanted either an explanation or a confirmation that something like this should not happen.
The "bugged" code is this:
#include <unistd.h>
int main(int argc, char **argv, char **env)
{
if (argc != 2)
return (ERROR_CODE);
char *test[] = { argv[1] };
char *a[] = { NULL };
execve(argv[1], test, env);
return (SUCCESS_CODE);
}
Compiling and executing it with an argument will correctly execute that program, in my case:
$> gcc main.c
$> ./a.out "/bin/ls"
This behaves just like running ls directly.
Now remove/comment this line:
char *a[] = { NULL };
This variable is clearly not used and completely useless.
Do the same steps once again and for some reason, it doesn't output anything, this one random variable breaks the code for me. (I'm running Ubuntu 20.04 with Gnome 3.36.8 and gcc 9.3.0).
If you need any more information about my OS or anything, feel free to ask.
PS: I think I understand the way the code is trying to work this out but It makes no sense to me.
$> man execve
main(int argc, char *argv[])
char *newargv[] = { NULL, "hello", "world", NULL };
...
execve(argv[1], newargv, newenviron);
The manual example null-terminates "newargv", my idea is that somehow, somewhere, the compiler decided to fuse together my variables "test" and "a", to null-terminate "test"?
Yep, you're accidentally seeing that "fusing" since you're not correctly terminating argv with a NULL and the memory layout happens to be in your favor. If you were less lucky, you'd get garbage in there, or a segfault.
Quoth the manpage (Linux, Darwin), emphasis mine,
The argument argv is a pointer to a null-terminated array of character pointers to null-terminated character strings.
#include <unistd.h>
int main(int argc, char **argv, char **env)
{
if (argc != 2)
return (ERROR_CODE);
char *test[] = { argv[1], NULL };
execve(argv[1], test, env);
return (SUCCESS_CODE);
}
would be the correct invocation.
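To illustrate why the terminator matters, here is a minimal sketch of how any consumer of an argv-style array finds its end. The helper name argv_len is hypothetical and not part of any library; it only mirrors the kind of scan that has to happen when the array carries no separate length.

```c
#include <stddef.h>

/* Count the entries of an argv-style array by scanning for the
 * terminating NULL. Without that NULL the loop would run past the end
 * of the array, which is undefined behavior. */
size_t argv_len(char *const argv[])
{
    size_t n = 0;
    while (argv[n] != NULL)
        n++;
    return n;
}
```

With { argv[1] } and no NULL, a scan like this keeps reading whatever happens to follow the array on the stack, which is why an unrelated neighbouring variable can change the outcome.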

my execv() function not working in linux ubuntu

I wrote the following code, but I always get the output "ERROR!" (execv is not supposed to return when it succeeds).
What am I doing wrong?
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <time.h>
#include <math.h>
#include <string.h>
#include <malloc.h>
#include "LineParser.h"
#define LOCATION_LEN 200
char* getL(void);
int main(int argc,char *argv[])
{
char *loc = getL();
char *args[] = {loc,"ls",NULL};
int i;
execv(args[0],args);
printf("ERROR!");
free(loc);
}
char* getL(void)
{
char *buff = (char**)malloc(sizeof(char)*LOCATION_LEN);
getcwd(buff,LOCATION_LEN);
return buff;
}
Read documentation of execv(3) and of execve(2) and of perror(3). At the very least, you should code
int main(int argc, char *argv[]) {
char *loc = getL();
char *args[] = { loc, "ls", NULL };
int i;
execv(args[0], args);
perror("execv");
free(loc);
}
You should compile with gcc -Wall -g then use the gdb debugger.
Your usage of execv is obviously wrong (you need a full path, e.g. "/bin/ls", and the order of arguments is wrong). You probably want execvp(3) and you should in fact code at least:
char *args[] = { "ls", loc, NULL };
execvp("ls", args);
perror("execvp");
If you insist on using specifically execv(3) you could try
char *args[] = { "ls", loc, NULL };
execv("/bin/ls", args);
perror("execv");
I don't understand what your code is supposed to do. You might be interested in glob(7) and glob(3).
You probably should read Advanced Linux Programming. It seems that there are several concepts that you don't understand well enough. I guess that strace(1) could be useful to you (at least by running strace ls *.c to understand what is happening).
Maybe your getL is exactly what the GNU function get_current_dir_name(3) is doing, but then the (char**) cast inside it is grossly wrong. You should also clear the buffer buff using memset(3) before calling getcwd(2), and you should test for failure of malloc and of getcwd.
Perhaps you want opendir(3), readdir(3), asprintf(3), stat(2); with all these, you could even avoid running ls
If you are coding some shell, you should strace some existing shell, and after having read all the references I am giving here, study the source code of free software shells like sash and GNU bash
You are not passing the correct arguments to execv. The first argument must be a path to the executable you wish to run but you are passing the path to the current working directory.
Update getL to return the full path to ls.
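Putting the advice from both answers together, a minimal sketch could look like the following. The helper name run_and_wait and the exit code 127 for a failed exec are assumptions made for this example (127 follows the usual shell convention); the calls themselves are plain POSIX.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] (a full path) with the NULL-terminated argv and wait for it.
 * Returns the child's exit status, 127 if execv failed in the child,
 * or -1 if fork/waitpid failed or the child did not exit normally. */
int run_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        execv(argv[0], argv);   /* only returns on failure */
        perror("execv");        /* report why, e.g. ENOENT or EACCES */
        _exit(127);
    }
    int status;
    if (waitpid(pid, &status, 0) == -1) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with { "/bin/ls", loc, NULL }, this lists the directory returned by getL. Passing the directory itself as argv[0] reproduces the original failure, and perror then reports the reason.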

Pass the arguments received in C down to bash script

I have the following piece of C code that is being called with arguments:
int main(int argc, char *argv[])
{
system( "/home/user/script.sh" );
return 0;
}
how do i pass all arguments received down to script.sh?
You could synthesize some string (escaping naughty characters like quote or space when needed, like Shell related utility functions of Glib do) for system(3).
But (on Linux and Posix) you really want to call execv(3) without using system(3)
You may want to read (in addition to the man page I linked above): Advanced Linux Programming
I think that you are looking for the execv function. It lets you execute a specific file, passing it optional arguments.
Try something like this:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main() {
system("cat /etc/passwd");
extern char **environ; /* environ is a pointer variable, not an array */
char * const command[] = {"mylsname", "-lR", "/", NULL};
execve("/bin/ls", command, environ);
perror("execve");
exit(EXIT_FAILURE);
}
You can use snprintf() function to frame a string. For example, snprintf(filename, sizeof(char) * 64, "/home/user/script.sh %s", argv[1]); and use system(filename);
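For forwarding all arguments rather than just argv[1], a rough sketch of the execv route suggested above might build a new argument vector first. The helper name forward_args is hypothetical, and the sketch assumes the script is executable and starts with a #! line so that execv can run it directly.

```c
#include <stdlib.h>

/* Build { script, argv[1], ..., argv[argc-1], NULL } for execv.
 * The strings are shared with argv; only the array is allocated.
 * Returns NULL if malloc fails; the caller frees the array. */
char **forward_args(const char *script, int argc, char *argv[])
{
    int extra = argc > 1 ? argc - 1 : 0;   /* arguments after argv[0] */
    char **out = malloc(((size_t)extra + 2) * sizeof *out);
    if (out == NULL)
        return NULL;
    out[0] = (char *)script;               /* execv does not modify it */
    for (int i = 0; i < extra; i++)
        out[i + 1] = argv[i + 1];
    out[extra + 1] = NULL;                 /* required terminator */
    return out;
}
```

In main, the result would be passed as execv(args[0], args) followed by perror("execv") for the failure case. Because no shell parses the arguments, spaces and quotes in them need no escaping.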

How to change environment variable in shell executing a C program from that C program?

I want to change the value of PATH variable inside the C program and then see the changed value in the shell using which I run this program.
Doing something like this,
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
int main () {
char *path = getenv ("PATH");
printf ("%s\n\n", path);
setenv ("PATH", strcat (path, ":~/myNewPath/"), 1);
printf ("%s\n\n", path);
int pid = fork ();
if (pid == -1)
abort ();
if (pid == 0) {
} else {
// use execlp? how? source? any hints?
}
return 0;
}
If I use the source command in an exec* call, what syntax would propagate the updated PATH back to the shell?
This is impossible. There is no way for a subprocess to change its parent's environment variables.
To understand why it is impossible, look at the signature of execve
int execve(const char *program, char *const *argv, char *const *envp);
which is paired with the true signature of main on Unix systems
int main(int argc, char **argv, char **envp);
and perhaps you begin to understand that as far as the kernel is concerned, the environment variables are a second set of command line arguments. That they appear to be independently accessible via getenv and setenv etc, and appear to inherit from parent to child, is an illusion maintained by the C library.
For more detail on how this works, study the x86-64 ELF ABI specification, section 3.4.1 "Initial Stack and Register State" paying particular attention to figure 3.9 which shows the layout of the data copied by execve onto the newly created stack. (The document linked is specific to one CPU architecture, but the way this works is generally consistent across modern Unixes; fine details will of course vary from CPU to CPU and OS to OS.)
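The one-way nature of the environment can be checked with a small experiment. This is only a sketch: the function name and the variable name DEMO_VAR are made up for the example.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a change made by a child process is visible in the parent,
 * 0 if it is not, and -1 on error. DEMO_VAR is a hypothetical name. */
int child_change_visible(void)
{
    if (setenv("DEMO_VAR", "parent", 1) == -1)
        return -1;
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        setenv("DEMO_VAR", "child", 1);   /* changes only the child's copy */
        _exit(0);
    }
    if (waitpid(pid, NULL, 0) == -1)
        return -1;
    const char *now = getenv("DEMO_VAR");
    return now != NULL && strcmp(now, "parent") != 0;
}
```

The parent still sees "parent" after the child has exited. The shell that launched a C program is in the same position as the parent here.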

Find program's code address at runtime?

When I use gdb to debug a program written in C, the disassemble command shows the instructions and their addresses in the code segment. Is it possible to get those memory addresses at runtime? I am using Ubuntu. Thank you.
[edit] To be more specific, I will demonstrate it with following example.
#include <stdio.h>
int main(int argc,char *argv[]){
myfunction();
exit(0);
}
Now I would like to get the address of myfunction() in the code segment when I run my program.
Above answer is vastly overcomplicated. If the function reference is static, as it is above, the address is simply the value of the symbol name in pointer context:
void* myfunction_address = myfunction;
If you are grabbing the function dynamically out of a shared library, then the value returned from dlsym() (POSIX) or GetProcAddress() (windows) is likewise the address of the function.
Note that the above code is likely to generate a warning with some compilers, as ISO C technically forbids assignment between code and data pointers (some architectures put them in physically distinct address spaces).
And some pedants will point out that the address returned isn't really guaranteed to be the memory address of the function, it's just a unique value that can be compared for equality with other function pointers and acts, when called, to transfer control to the function whose pointer it holds. Obviously all known compilers implement this with a branch target address.
And finally, note that the "address" of a function is a little ambiguous. If the function was loaded dynamically or is an extern reference to an exported symbol, what you really get is generally a pointer to some fixup code in the "PLT" (a Unix/ELF term, though the PE/COFF mechanism on windows is similar) that then jumps to the function.
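For the dynamic case, a minimal sketch of the dlsym() route could look like this. The wrapper name lookup_symbol is hypothetical. The sketch assumes a dynamically linked program, and on older glibc versions it may need to be linked with -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Look up name among the symbols of the running program and the shared
 * libraries it has loaded. Returns NULL if the symbol is not found. */
void *lookup_symbol(const char *name)
{
    void *handle = dlopen(NULL, RTLD_NOW);  /* NULL: the main program */
    if (handle == NULL)
        return NULL;
    void *addr = dlsym(handle, name);
    dlclose(handle);
    return addr;
}
```

For example, lookup_symbol("system") yields the address of system() in the loaded libc even if the program never calls it. Symbols defined in the executable itself are typically only found this way when it was linked with -rdynamic.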
If you know the function name before program runs, simply use
void * addr = myfunction;
If the function name is given at run-time, I once wrote a function to find out the symbol address dynamically using bfd library. Here is the x86_64 code, you can get the address via find_symbol("a.out", "myfunction") in the example.
#include <bfd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* Returns the value of symname in filename, or -1 on failure.
 * For a position-independent executable the value is an offset in the
 * file, not the final runtime address. */
long find_symbol(char *filename, char *symname)
{
    bfd *ibfd;
    asymbol **symtab;
    long nsize, nsyms, i, addr = -1;
    symbol_info syminfo;
    char **matching;
    bfd_init();
    ibfd = bfd_openr(filename, NULL);
    if (ibfd == NULL) {
        printf("bfd_openr error\n");
        return -1;
    }
    if (!bfd_check_format_matches(ibfd, bfd_object, &matching)) {
        printf("format_matches\n");
        bfd_close(ibfd);
        return -1;
    }
    nsize = bfd_get_symtab_upper_bound(ibfd);
    if (nsize <= 0 || (symtab = malloc(nsize)) == NULL) {
        printf("cannot read symbol table\n");
        bfd_close(ibfd);
        return -1;
    }
    nsyms = bfd_canonicalize_symtab(ibfd, symtab);
    for (i = 0; i < nsyms; i++) {
        if (strcmp(symtab[i]->name, symname) == 0) {
            bfd_symbol_info(symtab[i], &syminfo);
            addr = (long) syminfo.value;
            break;
        }
    }
    if (addr == -1)
        printf("cannot find symbol\n");
    free(symtab);
    bfd_close(ibfd);
    return addr;
}
To get a backtrace, use execinfo.h as documented in the GNU libc manual.
For example:
#include <execinfo.h>
#include <stdio.h>
#include <unistd.h>
void trace_pom()
{
const int sz = 15;
void *buf[sz];
// get at most sz entries
int n = backtrace(buf, sz);
// output them right to stderr
backtrace_symbols_fd(buf, n, fileno(stderr));
// but if you want to output the strings yourself
// you may use char ** backtrace_symbols (void *const *buffer, int size)
write(fileno(stderr), "\n", 1);
}
void TransferFunds(int n);
void DepositMoney(int n)
{
if (n <= 0)
trace_pom();
else TransferFunds(n-1);
}
void TransferFunds(int n)
{
DepositMoney(n);
}
int main()
{
DepositMoney(3);
return 0;
}
Compiled with:
gcc a.c -o a -g -Wall -Werror -rdynamic
According to the mentioned website:
Currently, the function name and offset can only be obtained on systems that use the ELF
binary format for programs and libraries. On other systems, only the hexadecimal return
address will be present. Also, you may need to pass additional flags to the linker to
make the function names available to the program. (For example, on systems using GNU
ld, you must pass -rdynamic.)
Output
./a(trace_pom+0xc9)[0x80487fd]
./a(DepositMoney+0x11)[0x8048862]
./a(TransferFunds+0x11)[0x8048885]
./a(DepositMoney+0x21)[0x8048872]
./a(TransferFunds+0x11)[0x8048885]
./a(DepositMoney+0x21)[0x8048872]
./a(TransferFunds+0x11)[0x8048885]
./a(DepositMoney+0x21)[0x8048872]
./a(main+0x1d)[0x80488a4]
/lib/i686/cmov/libc.so.6(__libc_start_main+0xe5)[0xb7e16775]
./a[0x80486a1]
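If you would rather handle the strings yourself, as the comment in trace_pom hints, a sketch using backtrace_symbols might look like this. The function name print_trace and the limit of 32 frames are arbitrary choices for the example.

```c
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

/* Write the current call stack to out, one frame per line, and return
 * the number of frames written (0 if the symbol lookup failed). */
int print_trace(FILE *out)
{
    void *frames[32];
    int n = backtrace(frames, 32);
    char **names = backtrace_symbols(frames, n);
    if (names == NULL)
        return 0;
    for (int i = 0; i < n; i++)
        fprintf(out, "%s\n", names[i]);
    free(names);   /* one free for the whole array, not per string */
    return n;
}
```

Unlike backtrace_symbols_fd, this variant allocates memory, so it is less suitable for use inside a signal handler or after heap corruption.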
About a comment in an answer (getting the address of an instruction), you can use this very ugly trick
#include <setjmp.h>
void function() {
printf("in function\n");
printf("%d\n",__LINE__);
printf("exiting function\n");
}
int main() {
jmp_buf env;
int i;
printf("in main\n");
printf("%d\n",__LINE__);
printf("calling function\n");
setjmp(env);
for (i=0; i < 18; ++i) {
printf("%p\n",env[i]);
}
function();
printf("in main again\n");
printf("%d\n",__LINE__);
}
It should be env[12] (the eip), but be careful as it looks machine dependent, so triple check my word. This is the output
in main
13
calling function
0xbfff037f
0x0
0x1f80
0x1dcb
0x4
0x8fe2f50c
0x0
0x0
0xbffff2a8
0xbffff240
0x1f
0x292
0x1e09
0x17
0x8fe0001f
0x1f
0x0
0x37
in function
4
exiting function
in main again
37
have fun!
