gnu c on_exit - segmentation fault - c

Out of curiosity I am trying to get the libc on_exit function to work, but I have run into a problem with a segmentation fault. The difficulty I am having is finding an explanation of the proper use of this function. The function is defined in glibc as:
Function: int on_exit (void (*function)(int status, void *arg), void *arg)
This function is a somewhat more powerful variant of atexit. It accepts two arguments, a function and an arbitrary pointer arg. At normal program termination, the function is called with two arguments: the status value passed to exit, and the arg.
I created a small test, and I cannot find where the segmentation fault is generated:
#include <stdio.h>
#include <stdlib.h>
void *
exitfn (int stat, void *arg) {
printf ("exitfn has been run with status %d and *arg %s\n", stat, (char *)arg);
return NULL;
}
int
main (void)
{
static char *somearg="exit_argument";
int exit_status = 1;
on_exit (exitfn (exit_status, somearg), somearg);
exit (EXIT_SUCCESS);
}
Compiled with: gcc -Wall -o fn_on_exit fnc-on_exit.c
The result is:
$ ./fn_on_exit
exitfn has been run with status 1 and *arg exit_argument
Segmentation fault
Admittedly, this is probably readily apparent for seasoned coders, but I am not seeing it. What is the proper setup for use of the on_exit function and why in this case is a segmentation fault generated?

The line of code
on_exit (exitfn (exit_status, somearg), somearg);
Should be
on_exit (exitfn, somearg);
As you do not want to call the exitfn at this stage (that returns NULL!)

Related

Library interpositioning

I have been trying to intercept calls to malloc and free, following our textbook (CSAPP book).
I have followed their exact code, and nearly the same code that I found online and I keep getting a segmentation fault. I heard our professor saying something about printf that mallocs and frees memory so I think that this happens because I am intercepting a malloc and since I am using a printf function inside the intercepting function, it will call itself recursively.
However I can't seem to find a solution to solving this problem? Our professor demonstrated that intercepting worked ( he didn't show us the code) and prints our information every time a malloc occurs, so I do know that it's possible.
Can anyone suggest a working method??
Here is the code that I used and get nothing:
mymalloc.c
#ifdef RUNTIME
// Run-time interposition of malloc and free based on // dynamic linker's (ld-linux.so) LD_PRELOAD mechanism #define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h> #include <dlfcn.h>
void *malloc(size_t size) {
static void *(*mallocp)(size_t size) = NULL; char *error;
void *ptr;
// get address of libc malloc
if (!mallocp) {
mallocp = dlsym(RTLD_NEXT, "malloc"); if ((error = dlerror()) != NULL) {
fputs(error, stderr);
exit(EXIT_FAILURE);
}
}
ptr = mallocp(size);
printf("malloc(%d) = %p\n", (int)size, ptr); return ptr;
}
#endif
test.c
#include <stdio.h>
#include <stdlib.h>
int main(){
printf("main\n");
int* a = malloc(sizeof(int)*5);
a[0] = 1;
printf("end\n");
}
The result i'm getting:
$ gcc -o test test.c
$ gcc -DRUNTIME -shared -fPIC mymalloc.c -o mymalloc.so
$ LD_PRELOAD=./mymalloc.so ./test
Segmentation Fault
This is the code that I tried and got segmentation fault (from https://gist.github.com/iamben/4124829):
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>
void* malloc(size_t size)
{
static void* (*rmalloc)(size_t) = NULL;
void* p = NULL;
// resolve next malloc
if(!rmalloc) rmalloc = dlsym(RTLD_NEXT, "malloc");
// do actual malloc
p = rmalloc(size);
// show statistic
fprintf(stderr, "[MEM | malloc] Allocated: %lu bytes\n", size);
return p;
}
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define STR_LEN 128
int main(int argc, const char *argv[])
{
char *c;
char *str1 = "Hello ";
char *str2 = "World";
//allocate an empty string
c = malloc(STR_LEN * sizeof(char));
c[0] = 0x0;
//and concatenate str{1,2}
strcat(c, str1);
strcat(c, str2);
printf("New str: %s\n", c);
return 0;
}
The makefile from the git repo didn't work so I manually compiled the files and got:
$ gcc -shared -fPIC libint.c -o libint.so
$ gcc -o str str.c
$ LD_PRELOAD=./libint.so ./str
Segmentation fault
I have been doing this for hours and I still get the same incorrect result, despite the fact that I copied textbook code. I would really appreciate any help!!
One way to deal with this is to turn off the printf when your return is called recursively:
static char ACallIsInProgress = 0;
if (!ACallIsInProgress)
{
ACallIsInProgress = 1;
printf("malloc(%d) = %p\n", (int)size, ptr);
ACallIsInProgress = 0;
}
return ptr;
With this, if printf calls malloc, your routine will merely call the actual malloc (via mallocp) and return without causing another printf. You will miss printing information about a call to malloc that the printf does, but that is generally tolerable when interposing is being used to study the general program, not the C library.
If you need to support multithreading, some additional work might be needed.
The printf implementation might allocate a buffer only once, the first time it is used. In that case, you can initialize a flag that turns off the printf similar to the above, call printf once in the main routine (maybe be sure it includes a nice formatting task that causes printf to allocate a buffer, not a plain string), and then set the flag to turn on the printf call and leave it set for the rest of the program.
Another option is for your malloc routine not to use printf at all but to cache data in a buffer to be written later by some other routine or to write raw data to a file using write, with that data interpreted and formatted by a separate program later. Or the raw data could be written by a pipe to a program that formats and prints it and that is not using your interposed malloc.

How do calls to the execvp system call work?

I was doing a research into the contents of another StackOverflow question and I thought it was a good time to brush up my knowledge of unix system calls.
While experimenting with execvp (WITHOUT fork on purpose) I ran into something that confuses me
I wrote 4 test programs
Program 1
#include <stdio.h>
int main() {
//printf("Doge\n");
execvp("ls");
printf("Foo\n");
return 0;
}
The program works as expected, the contents of the directory are printed and the Foo print statement is not
Program 2
However when I uncomment the first print statement and have the program be this
#include <stdio.h>
int main() {
printf("Doge\n");
execvp("ls");
printf("Foo\n");
return 0;
}
execvp returns a -1 and both print statements are issued. why?
Program 3
I vaguely remember having to use unistd.h when experimenting with unix system calls from college.
So I included it, but not execvp has a different signature and it needed some more args than just the name of the program. So I did this
#include <stdio.h>
#include <unistd.h>
int main() {
printf("Doge\n");
char *const parmList[] = {"ls", NULL};
execvp("ls", parmList);
printf("Foo\n");
return 0;
}
And this works. This has confused me. Why did exec work in the first program?
I also used This as a reference to the system calls.
Finally I wrote
Program 4
#include <stdio.h>
//#include <unistd.h>
int main() {
printf("Doge\n");
char *const parmList[] = {"ls", NULL};
execvp("ls", parmList);
printf("Foo\n");
return 0;
}
Which also works as expected.
Can someone explain what's going on?
With this snippet
#include <stdio.h>
int main() {
execvp("ls");
printf("Foo\n");
return 0;
}
you're invoking undefined behaviour. You're not providing the prototype for execvp which requires an argument list (null terminated) as a second parameter.
Using gcc without any warning option silently uses execvp as implicitly declared, and doesn't check parameters. It just calls the function. The function then looks for a second parameter and encounters... whatever is left of the call stack (or registers, depending on call conventions), that's why a previous printf call can change the behaviour.
Using gcc -Wall gives the following warning:
test.c:5:9: warning: implicit declaration of function 'execvp' [-Wimplicit-function-declaration]
execvp("ls");
Including the proper include (#include <unistd.h>) leads to:
test.c:6:9: error: too few arguments to function 'execvp'
execvp("ls");
^~~~~~
That's why you've got strange behaviour. Don't look further. Use execvp with 2 arguments, period. In your case "Program 3" is the way to go, and always set warning level to the maximum, if possible (gcc and clang: -Wall -Wextra -pedantic -Werror)

Print "Hello world" before main() function in C

The following program copied from Quora, that print "Hello world" before main() function.
#include <stdio.h>
#include <unistd.h>
int main(void)
{
return 0;
}
void _start(void)
{
printf ("hello, world\n");
int ret = main();
_exit (ret);
}
Then, I compiled above program on Ubuntu-14.04 GCC compiler using following command
gcc -nostartfiles hello.c
And ran a.out executable file, But I got Segmentation fault (core dumped)? So, Why Segmentation fault?
_start is the real entrypoint of the executable, that is normally taken by the C runtime to initialize its stuff - including stdio -, call functions marked with the constructor attribute and then call your main entrypoint. If you take it and try to use stuff from the standard library (such as printf) you are living dangerously, because you are using stuff that hasn't been initialized yet.
What you can do, however, is to bypass the C runtime completely, and print using a straight syscall, such as write.

segmentation fault by Special functions _init and _fini

For an observation purpose, I wrote a program using _start(), _init(), _fini(), goal is to not to use startfiles.
the code is as follows
#include <stdio.h>
void test()
{
printf("\n%s: \n",__func__);
printf("library test routine invoked\n");
int a=3,b=2;
int sum=a+b;
printf("sum=%d\n",sum);
getchar();
_fini();
}
int _start()
{
printf("\n%s: \n",__func__);
printf("in library start routine\n");
test();
return 0;
}
int _init()
{
printf("\n%s: \n",__func__);
printf("in library init routine\n");
return 0;
}
int _fini()
{
printf("\n%s: \n",__func__);
printf("in library fini routine\n");
return 0;
}
complied with
gcc -nostartfiles test.c -o test
and the output is
_start:
in library start routine
test:
library test routine invoked
sum=5
l
_fini:
in library fini routine
Segmentation fault (core dumped)
Here I want to know why the executable gave segmentation fault?? Do I need to specify as it is end of the program?? If so, how??
What can be done to overcome the segmentation fault??
Another question is that these _start(),_init(),_fini() are only used when dealing with libraries???
Please
The _start routine cannot return. Normally, it calls __libc_start_main which calls main. Then when main returns, __libc_start_main calls exit with the return value of main.
Since you're defining _start yourself and not calling __libc_start_main, you need to explicitly call exit. You're getting a sigfault because that function is not expected to return.
See this question for more detail.

Find program's code address at runtime?

When I use gdb to debug a program written in C, the command disassemble shows the codes and their addresses in the code memory segmentation. Is it possible to know those memory addresses at runtime? I am using Ubuntu OS. Thank you.
[edit] To be more specific, I will demonstrate it with following example.
#include <stdio.h>
int main(int argc,char *argv[]){
myfunction();
exit(0);
}
Now I would like to have the address of myfunction() in the code memory segmentation when I run my program.
Above answer is vastly overcomplicated. If the function reference is static, as it is above, the address is simply the value of the symbol name in pointer context:
void* myfunction_address = myfunction;
If you are grabbing the function dynamically out of a shared library, then the value returned from dlsym() (POSIX) or GetProcAddress() (windows) is likewise the address of the function.
Note that the above code is likely to generate a warning with some compilers, as ISO C technically forbids assignment between code and data pointers (some architectures put them in physically distinct address spaces).
And some pedants will point out that the address returned isn't really guaranteed to be the memory address of the function, it's just a unique value that can be compared for equality with other function pointers and acts, when called, to transfer control to the function whose pointer it holds. Obviously all known compilers implement this with a branch target address.
And finally, note that the "address" of a function is a little ambiguous. If the function was loaded dynamically or is an extern reference to an exported symbol, what you really get is generally a pointer to some fixup code in the "PLT" (a Unix/ELF term, though the PE/COFF mechanism on windows is similar) that then jumps to the function.
If you know the function name before program runs, simply use
void * addr = myfunction;
If the function name is given at run-time, I once wrote a function to find out the symbol address dynamically using bfd library. Here is the x86_64 code, you can get the address via find_symbol("a.out", "myfunction") in the example.
#include <bfd.h>
#include <stdio.h>
#include <stdlib.h>
#include <type.h>
#include <string.h>
long find_symbol(char *filename, char *symname)
{
bfd *ibfd;
asymbol **symtab;
long nsize, nsyms, i;
symbol_info syminfo;
char **matching;
bfd_init();
ibfd = bfd_openr(filename, NULL);
if (ibfd == NULL) {
printf("bfd_openr error\n");
}
if (!bfd_check_format_matches(ibfd, bfd_object, &matching)) {
printf("format_matches\n");
}
nsize = bfd_get_symtab_upper_bound (ibfd);
symtab = malloc(nsize);
nsyms = bfd_canonicalize_symtab(ibfd, symtab);
for (i = 0; i < nsyms; i++) {
if (strcmp(symtab[i]->name, symname) == 0) {
bfd_symbol_info(symtab[i], &syminfo);
return (long) syminfo.value;
}
}
bfd_close(ibfd);
printf("cannot find symbol\n");
}
To get a backtrace, use execinfo.h as documented in the GNU libc manual.
For example:
#include <execinfo.h>
#include <stdio.h>
#include <unistd.h>
void trace_pom()
{
const int sz = 15;
void *buf[sz];
// get at most sz entries
int n = backtrace(buf, sz);
// output them right to stderr
backtrace_symbols_fd(buf, n, fileno(stderr));
// but if you want to output the strings yourself
// you may use char ** backtrace_symbols (void *const *buffer, int size)
write(fileno(stderr), "\n", 1);
}
void TransferFunds(int n);
void DepositMoney(int n)
{
if (n <= 0)
trace_pom();
else TransferFunds(n-1);
}
void TransferFunds(int n)
{
DepositMoney(n);
}
int main()
{
DepositMoney(3);
return 0;
}
compiled
gcc a.c -o a -g -Wall -Werror -rdynamic
According to the mentioned website:
Currently, the function name and offset only be obtained on systems that use the ELF
binary format for programs and libraries. On other systems, only the hexadecimal return
address will be present. Also, you may need to pass additional flags to the linker to
make the function names available to the program. (For example, on systems using GNU
ld, you must pass (-rdynamic.)
Output
./a(trace_pom+0xc9)[0x80487fd]
./a(DepositMoney+0x11)[0x8048862]
./a(TransferFunds+0x11)[0x8048885]
./a(DepositMoney+0x21)[0x8048872]
./a(TransferFunds+0x11)[0x8048885]
./a(DepositMoney+0x21)[0x8048872]
./a(TransferFunds+0x11)[0x8048885]
./a(DepositMoney+0x21)[0x8048872]
./a(main+0x1d)[0x80488a4]
/lib/i686/cmov/libc.so.6(__libc_start_main+0xe5)[0xb7e16775]
./a[0x80486a1]
About a comment in an answer (getting the address of an instruction), you can use this very ugly trick
#include <setjmp.h>
void function() {
printf("in function\n");
printf("%d\n",__LINE__);
printf("exiting function\n");
}
int main() {
jmp_buf env;
int i;
printf("in main\n");
printf("%d\n",__LINE__);
printf("calling function\n");
setjmp(env);
for (i=0; i < 18; ++i) {
printf("%p\n",env[i]);
}
function();
printf("in main again\n");
printf("%d\n",__LINE__);
}
It should be env[12] (the eip), but be careful as it looks machine dependent, so triple check my word. This is the output
in main
13
calling function
0xbfff037f
0x0
0x1f80
0x1dcb
0x4
0x8fe2f50c
0x0
0x0
0xbffff2a8
0xbffff240
0x1f
0x292
0x1e09
0x17
0x8fe0001f
0x1f
0x0
0x37
in function
4
exiting function
in main again
37
have fun!

Resources