Reason for Segmentation Fault - c

I have written a program using clone() system call having CLONE_VM and CLONE_FILES set.
I am not able to understand why the output is showing Segmentation Fault. Can somebody please correct my code and tell me the reason for the same.
#include<stdio.h>
#include<unistd.h>
#include<fcntl.h>
#include<sys/types.h>
#include<sys/stat.h>
#include<sched.h>
#include<stdlib.h>
int variable, fd;
int do_something() {
// sleep(100);
variable = 42;
close(fd);
_exit(0);
}
int main(int argc, char *argv[]) {
void **child_stack;
char tempch;
variable = 9;
fd = open("test.file", O_RDONLY);
child_stack = (void **) malloc(16384);
printf("The variable was %d\n", variable);
clone(do_something, child_stack, CLONE_VM|CLONE_FILES, NULL);
// sleep(100);
printf("The variable is now %d\n", variable);
if (read(fd, &tempch, 1) < 1) {
perror("File Read Error");
exit(1);
}
printf("We could read from the file\n");
return 0;
}

You need to know which direction the stack grows on your processor, and you need to know which end of the stack you must pass to clone().
From man clone:
Stacks grow downwards on all processors that run Linux (except the
HP PA processors), so child_stack usually points to the topmost
address of the memory space set up for the child stack.
You are not passing the topmost address, you are passing the bottommost address, and you are not (I am guessing) on HP-PA.
Fix:
child_stack = (void **) malloc(16384) + 16384 / sizeof(*child_stack);
P.S. I am astonished by the number of obviously wrong non-answers here.
No, close on invalid file descriptor does not crash on any UNIX and Linux system in existence.
No, void* vs. void** has nothing at all to do with the problem.
No, you don't need to take an address of do_something, the compiler will do that automatically for you.
And finally, yes: calling close, _exit, or any other libc routine in the clone()d thread is potentially unsafe, although it does not cause the problem here.
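The top-of-stack rule can be sketched as a small helper. This is a minimal sketch, assuming Linux with glibc on a processor whose stack grows downwards; run_clone_demo, child_fn, shared_value and CHILD_STACK_SIZE are names made up for illustration. It also adds SIGCHLD to the flags (not in the question's code) so that the parent can wait for the child with waitpid().

```c
#define _GNU_SOURCE          /* needed for clone() in <sched.h> */
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)   /* illustrative size */

static volatile int shared_value = 9;  /* visible to the child because of CLONE_VM */

/* Runs in the child; returning ends the child with this exit status. */
static int child_fn(void *arg)
{
    shared_value = *(int *)arg;
    return 0;
}

/* Returns the value the parent sees after the child ran, or -1 on error. */
int run_clone_demo(int new_value)
{
    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows downwards, so clone() gets the HIGHEST address. */
    char *stack_top = stack + CHILD_STACK_SIZE;
    assert(stack_top > stack);

    pid_t pid = clone(child_fn, stack_top, CLONE_VM | SIGCHLD, &new_value);
    if (pid == -1) {
        free(stack);
        return -1;
    }
    int result = (waitpid(pid, NULL, 0) == pid) ? shared_value : -1;
    free(stack);
    return result;
}
```

Calling run_clone_demo(42) should return 42 if the child ran on the new stack and wrote through the shared address space. The child function only stores to a variable and returns, which sidesteps the libc-safety concern mentioned above.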

The way to fix is to have the child stack actually on the stack .. i.e.
char child_stack [16384];
I suspect that the stack pointer can't point to the data segment or something like that...
And even then.. it works with -g .. but crashes with -O !!!

Related

How to understand the type of storage of a pointer

I have as an homework this task:
Given a void** ptr_addr write a function that return 0 if the type of storage of *ptr_addr is static or automatic and return 1 if the type of storage of *ptr_addr is dynamic.
The language of the code must be C.
The problem is that theoretically I know what the task is about but I don't know how to check the previous condition in code.
Thanks for the help!
Normally I don't do homework, but in cases like this I may make an exception.
Bear in mind that what I'm about to present is horrible code. Also it doesn't meet your requirements as stated — you'll have to adapt it for that. Also it may not meet your instructor's expectations: for an instructor demented enough to be assigning this task, I can't begin to guess his (her? its?) expectations. You may get dinged for using the technique I've presented, or for presenting someone else's work. Also I'm going to get dinged for presenting this code here on Stack Overflow, because no, it's nothing like portable or guaranteed to do anything, let alone to work. I have no idea whether it'll work on your system.
Nevertheless, and may God help me, I tested it, and it does "work" on a modern Debian Linux system.
#include <unistd.h>
extern etext, edata, end;
char *
mcat(void *p)
{
int dummy;
if(p < &etext)
return "text";
else if(p < &edata)
return "data";
else if(p < &end)
return "bss";
else if(p < sbrk(0))
return "heap";
else if(p > &dummy)
return "stack";
else return "?";
}
You'll get a good number of warnings if you compile this, which could theoretically be silenced using some explicit casts, but I think the warnings are actually pretty appropriate, given the nefariousness of this code.
How it works: on at least some Unix-like systems, etext, edata, and end are magic symbols corresponding to the ends of the program's text, initialized data, and uninitialized data segments, respectively. sbrk(0) gives you a pointer to the top of the heap that a traditional implementation of malloc is using. And &dummy is a good approximation of the bottom of the stack.
Test program:
#include <stdio.h>
#include <stdlib.h>
int g = 2;
int g2;
int main()
{
int l;
static int s = 3;
static int s2;
int *p = malloc(sizeof(int));
printf("g: %s\n", mcat(&g));
printf("g2: %s\n", mcat(&g2));
printf("main: %s\n", mcat(main));
printf("l: %s\n", mcat(&l));
printf("s: %s\n", mcat(&s));
printf("s2: %s\n", mcat(&s2));
printf("p: %s\n", mcat(p));
}
On my test system this prints
g: data
g2: bss
main: text
l: stack
s: data
s2: bss
p: heap
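To get closer to the interface the assignment asks for, the same trick can be narrowed to a 0/1 answer. This is only a sketch under the same non-portable assumptions (Linux, GNU linker symbol end, a malloc that uses the brk heap); is_dynamic is a made-up name. Blocks that malloc obtains through mmap (large requests on glibc) lie outside the brk heap and would be reported as 0.

```c
#define _DEFAULT_SOURCE      /* for sbrk() */
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

extern char end;             /* linker-provided: first address past the bss */

/* Returns 1 if *ptr_addr lies inside the brk heap, 0 otherwise. */
int is_dynamic(void **ptr_addr)
{
    assert(ptr_addr != NULL);
    uintptr_t p = (uintptr_t)*ptr_addr;
    uintptr_t heap_start = (uintptr_t)&end;      /* heap begins somewhere after the bss */
    uintptr_t heap_end = (uintptr_t)sbrk(0);     /* current program break */
    return (p >= heap_start && p < heap_end) ? 1 : 0;
}
```

Comparing the addresses as uintptr_t values avoids relational comparisons between pointers to unrelated objects, which the mcat version relies on.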
I'd like to post a different approach to solve the problem:
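// this function returns 1 if ptr has been allocated by malloc/calloc/realloc, otherwise 0
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
int is_pointer_heap(void* ptr) {
    pid_t p = fork();
    if (p == 0) {
        // child: realloc() on a non-heap pointer is expected to crash this process
        (void) realloc(ptr, 1);
        _exit(0); // _exit() so the child does not flush the parent's stdio buffers
    }
    int status;
    (void) waitpid(p, &status, 0);
    // status is 0 only if the child exited normally with code 0
    return (status == 0) ? 1 : 0;
}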
I wrote this (bad) code very quickly (and there's lot of room for improvements), but I tested it and it seems to work.
EXPLANATION: passing realloc() a pointer that was not returned by malloc/calloc/realloc is undefined behavior; in my tests it crashed the process, but that is not guaranteed. Here we create a new child process and let the child call realloc(); if the child process crashes, we return 0, otherwise we return 1.

Accessing the variable inside another code

Is there a way to access a variable initialized in one code from another code. For eg. my code1.c is as follows,
#include <stdio.h>
#include <unistd.h>
int main()
{
int a=4;
sleep(99);
printf("%d\n", a);
return 0;
}
Now, is there any way that I can access the value of a from inside another C code (code2.c)? I am assuming, I have all the knowledge of the variable which I want to access, but I don't have any information about its address in the RAM. So, is there any way?
I know about the extern, what I am asking for here is a sort of backdoor. Like, kind of searching for the variable in the RAM based on some properties.
Your example has one caveat, setting aside possible optimizations that would make the variable disappear: variable a only exists while the function is being executed and has not yet finished.
Well, given that the function is main() it shouldn't be a problem, at least, for standard C programs, so if you have a program like this:
# include <stdio.h>
int main()
{
int a=4;
printf("%d\n", a);
return 0;
}
Chances are that this code will call some functions. If one of them needs to access a to read and write to it, just pass a pointer to a as an argument to the function.
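#include <stdio.h>
void somefunction (int *n); /* declared before its first use in main() */
int main()
{
    int a=4;
    somefunction(&a);
    printf("%d\n", a);
    return 0;
}
void somefunction (int *n)
{
    /* Whatever you do with *n you are actually
       doing it with a */
    (*n)++; /* actually increments a; *n++ would advance the pointer instead */
}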
But if the function that needs to access a is deep in the function call stack, all the parent functions need to pass the pointer to a even if they don't use it, adding clutter and lowering the readability of code.
The usual solution is to declare a as global, making it accessible to every function in your code. If that scenario is to be avoided, you can make a visible only to the functions that need to access it. To do that, put all the functions that need to use a in a single source file and declare a there as a static global variable. Then only the functions written in that source file will know about a, and no pointer will be needed. It doesn't matter how deeply nested the functions are in the call stack: intermediate functions won't need to pass any additional information for a nested function to know about a.
So, you would have code1.c with main() and all the functions that need to access a
/* code1.c */
# include <stdio.h>
static int a;
void somefunction (void);
int main()
{
a=4;
somefunction();
printf("%d\n", a);
return 0;
}
void somefunction (void)
{
a++;
}
/* end of code1.c */
About trying to figure out where in RAM is a specific variable stored:
Kind of. You can travel across function stack frames from yours to the main() stack frame, and inside those stack frames lie the local variables of each function, but there is no supplementary information in RAM about what variable is located at what position, and the compiler may choose to put it wherever it likes within the stack frame (or even in a register, so there would be no trace of it in RAM, except for pushes and pops from/to general registers, which would be even harder to follow).
So unless that variable has a non trivial value, it's the only local variable in its stack frame, compiler optimizations have been disabled, your code is aware of the architecture and calling conventions being used, and the variable is declared as volatile to stop being stored in a CPU register, I think there is no safe and/or portable way to find it out.
OTOH, if your program has been compiled with -g flag, you might be able to read debugging information from within your program and find out where in the stack frame the variable is, and crawl through it to find it.
code1.c:
#include <stdio.h>
void doSomething(); // so that we can use the function from code2.c
int a = 4; // global variable accessible in all functions defined after this point
int main()
{
printf("main says %d\n", a);
doSomething();
printf("main says %d\n", a);
return 0;
}
code2.c
#include <stdio.h>
extern int a; // gain access to variable from code1.c
void doSomething()
{
a = 3;
printf("doSomething says %d\n", a);
}
output:
main says 4
doSomething says 3
main says 3
You can use extern int a; in every file in which you must use a (code2.c in this case), except for the file in which it is defined, i.e. declared without extern (code1.c in this case). For this approach to work you must define your a variable globally (not inside a function).
One approach is to have the separate executable have the same stack layout as the program in question (since the variable is placed on the stack, and we need the relative address of the variable), therefore compile it with the same or similar compiler version and options, as much as possible.
On Linux, we can read the running code's data with ptrace(PTRACE_PEEKDATA, pid, …). Since on current Linux systems the start address of the stack varies, we have to account for that; fortunately, this address can be obtained from the 28th field of /proc/…/stat.
The following program (compiled with cc Debian 4.4.5-8 and no code generator option on Linux 2.6.32) works; the pid of the running program has to be specified as the program argument.
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
void *startstack(char *pid)
{ // The address of the start (i. e. bottom) of the stack.
char str[FILENAME_MAX];
FILE *fp = fopen(strcat(strcat(strcpy(str, "/proc/"), pid), "/stat"), "r");
if (!fp) perror(str), exit(1);
if (!fgets(str, sizeof str, fp)) exit(1);
fclose(fp);
unsigned long address;
int i = 28; char *s = str; while (--i) s += strcspn(s, " ") + 1;
sscanf(s, "%lu", &address);
return (void *)address;
}
static int access(void *a, char *pidstr)
{
if (!pidstr) return 1;
int pid = atoi(pidstr);
if (ptrace(PTRACE_ATTACH, pid, 0, 0) < 0) return perror("PTRACE_ATTACH"), 1;
int status;
// wait for program being signaled as stopped
if (wait(&status) < 0) return perror("wait"), 1;
// relocate variable address to stack of program in question
a = a-startstack("self")+startstack(pidstr);
int val;
if (errno = 0, val = ptrace(PTRACE_PEEKDATA, pid, a, 0), errno)
return perror("PTRACE_PEEKDATA"), 1;
printf("%d\n", val);
return 0;
}
int main(int argc, char *argv[])
{
int a;
return access(&a, argv[1]);
}
Another, more demanding approach would be, as mcleod_ideafix indicated at the end of his answer, to implement the bulk of a debugger and use the debug information (if present) to locate the variable.
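One detail of the /proc stat parsing is worth a sketch: field 2 (the command name) is wrapped in parentheses and may itself contain spaces, so counting spaces from the start of the line can land on the wrong field. A hedged variant, assuming the stat layout described in proc(5), skips past the last ')' first; parse_startstack is a made-up helper name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns field 28 (startstack) of one /proc/<pid>/stat line, or 0 on failure. */
unsigned long parse_startstack(const char *stat_line)
{
    assert(stat_line != NULL);

    /* Field 2 ends at the LAST ')' on the line. */
    const char *p = strrchr(stat_line, ')');
    if (p == NULL)
        return 0;
    p++;

    /* Skip fields 3..27, which are separated by single spaces. */
    for (int field = 3; field < 28; field++) {
        p += strspn(p, " ");
        size_t len = strcspn(p, " ");
        if (len == 0)
            return 0;          /* line ended too early */
        p += len;
    }

    unsigned long value = 0;
    if (sscanf(p, "%lu", &value) != 1)
        return 0;
    return value;
}
```

The line passed in would come from fgets() on the stat file, as in startstack() above.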

change a pointer of address of another application

I need somebody to edit the title, I can't find better title.
Assume I have this simple program called source.exe:
#include <stdio.h>
int main()
{
int a = 5;
printf("%p", &a);
return 0;
}
I want to write another application, change.exe, that changes a in the above.
I tried something like this:
int main()
{
int * p = (int*) xxx; // xxx is what have printed above
*p = 1;
printf("%d", *p);
return 0;
}
It doesn't work. assuming I have Administrator rights, is there a way to do what I've tried above? thanks.
In the first place, when you run the second program, the a in the first will be long gone (or loaded in a different position). In the second place, many OSes protect programs by loading them in separate address spaces.
What you really seem to be looking for is Inter-Process Communication (IPC) mechanisms, specifically shared memory or memory-mapped files.
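As a rough illustration of shared memory, here is a minimal sketch for a POSIX system (not Windows, so it does not apply directly to source.exe and change.exe). It assumes the two processes are parent and child, so an anonymous shared mapping is enough; unrelated processes would need a named object instead. share_int_with_child is a made-up name.

```c
#define _DEFAULT_SOURCE      /* for MAP_ANONYMOUS */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the value the parent reads after the child wrote child_value,
 * or -1 on error. */
int share_int_with_child(int initial, int child_value)
{
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = initial;

    pid_t pid = fork();
    if (pid == 0) {            /* child: write through the shared mapping */
        *shared = child_value;
        _exit(0);
    }

    int result = -1;
    if (pid > 0 && waitpid(pid, NULL, 0) == pid)
        result = *shared;      /* parent sees the child's write */
    munmap(shared, sizeof *shared);
    return result;
}
```

Because the mapping is MAP_SHARED, the child's store lands in memory that both processes see, which is exactly what a plain pointer copied between two programs cannot do.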
On most traditional computers that people deal with, the operating system makes use of virtual memory. This means that two processes can both use address 0x12340000 and it can refer to two different pieces of memory.
This is helpful for a number of reasons, including memory fragmentation, and allowing multiple applications to start and stop at random times.
On some systems, like TI DSPs for example, there is no MMU, and thus no virtual memory. On these systems, something like your demo application could work.
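The effect of virtual memory can be seen with a small experiment. This is a sketch assuming a POSIX system with fork(); value_after_child_write is a made-up name. After fork() both processes have a variable at the same virtual address, yet a write in the child does not reach the parent.

```c
#include <assert.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the parent's value of a local variable after a child overwrote
 * its own copy at the same virtual address, or -1 on error. */
int value_after_child_write(void)
{
    volatile int a = 5;

    pid_t pid = fork();
    if (pid == 0) {
        a = 1;                 /* changes only the child's private copy */
        _exit(a == 1 ? 0 : 1);
    }
    if (pid < 0)
        return -1;

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    assert(WIFEXITED(status) && WEXITSTATUS(status) == 0);

    return a;                  /* still the parent's own value */
}
```

The parent still reads 5: the address is the same number in both processes, but each process has its own memory behind it.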
I was feeling a bit adventurous, so I thought about writing something like this under Windows, using the WinAPI, of course. Like Linux's ptrace, the calls used by this code should only be used by debuggers and aren't normally seen in any normal application code.
Furthermore, opening another process' memory for writing requires you to open the process handle with the PROCESS_VM_WRITE and PROCESS_VM_OPERATION access rights. This, however, is only possible if the application opening the process has the SeDebugPrivilege privilege enabled. I ran the application in elevated mode with administrator privileges, however I don't really know if that has any effect on the SeDebugPrivilege.
Anyhow, here's the code that I used for this. It was compiled with VS2008.
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
char cmd[2048];
int a = 5;
printf("%p %d\n", &a, a);
sprintf(cmd, "MemChange.exe %lu %p", GetCurrentProcessId(), (void *)&a); /* %p prints the full pointer width */
system(cmd);
printf("%p %d\n", &a, a);
return 0;
}
And here's the code for MemChange.exe that this code calls.
#include <windows.h>
#include <stdio.h>
int main(int argc, char **argv)
{
DWORD pId;
LPVOID pAddr;
HANDLE pHandle;
SIZE_T bytesWritten;
int newValue = 666;
sscanf(argv[1], "%lu", &pId);
sscanf(argv[2], "%p", &pAddr); /* %p so the address is read at full pointer width */
pHandle = OpenProcess(PROCESS_ALL_ACCESS, FALSE, pId);
WriteProcessMemory(pHandle, pAddr, &newValue, sizeof(newValue), &bytesWritten);
CloseHandle(pHandle);
fprintf(stderr, "Written %lu bytes to process %lu.\n", (unsigned long)bytesWritten, pId);
return 0;
}
But please don't use this code. It is horrible, has no error checks and probably leaks like holy hell. It was created only to illustrate what can be done with WriteProcessMemory. Hope it helps.
Why do you think that this is possible - debuggers can only read?
If it was possible then all sorts of mayhem could happen!
Shared memory springs to mind.

When should errno be assigned to ENOMEM?

The following program is killed by the kernel when memory runs out. I would like to know when the global variable errno should be set to "ENOMEM".
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MEGABYTE (1024*1024) /* parenthesized so the macro expands safely */
#define TRUE 1
int main(int argc, char *argv[]){
    void *myblock = NULL;
    int count = 0;
    while(TRUE)
    {
        myblock = (void *) malloc(MEGABYTE);
        if (!myblock) break;
        memset(myblock,1, MEGABYTE);
        printf("Currently allocating %d MB\n",++count);
    }
    exit(0);
}
First, fix your kernel not to overcommit:
echo "2" > /proc/sys/vm/overcommit_memory
Now malloc should behave properly.
As "R" hinted, the problem is the default behaviour of Linux memory management, which is "overcommitting". This means that the kernel claims to allocate you memory successfully, but doesn't actually allocate the memory until later when you try to access it. If the kernel finds out that it's allocated too much memory, it kills a process with "the OOM (Out Of Memory) killer" to free up some memory. The way it picks the process to kill is complicated, but if you have just allocated most of the memory in the system, it's probably going to be your process that gets the bullet.
If you think this sounds crazy, some people would agree with you.
To get it to behave as you expect, as R said:
echo "2" > /proc/sys/vm/overcommit_memory
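Before changing the setting it can help to see which mode is active. A minimal sketch, assuming a Linux system where /proc is mounted; read_overcommit_mode is a made-up name. The file holds 0 (heuristic overcommit, the default), 1 (always overcommit) or 2 (strict accounting).

```c
#include <assert.h>
#include <stdio.h>

/* Returns the current overcommit mode (0, 1 or 2), or -1 if it cannot be read. */
int read_overcommit_mode(void)
{
    FILE *fp = fopen("/proc/sys/vm/overcommit_memory", "r");
    if (fp == NULL)
        return -1;

    int mode = -1;
    if (fscanf(fp, "%d", &mode) != 1)
        mode = -1;             /* unexpected file contents */
    fclose(fp);
    return mode;
}
```

Only in mode 2 can the loop in the question be expected to end with malloc returning NULL rather than with the process being killed.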
It happens when you try to allocate too much memory at once.
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
int main(int argc, char *argv[])
{
void *p;
p = malloc(1024L * 1024 * 1024 * 1024);
if(p == NULL)
{
printf("%d\n", errno);
perror("malloc");
}
}
In your case the OOM killer is getting to the process first.
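For a failure that does not depend on how much memory the machine has, a request of SIZE_MAX bytes can never be satisfied. A small sketch, assuming glibc or another POSIX libc (ISO C alone does not promise that malloc sets errno); try_alloc is a made-up name.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 0 if malloc(size) succeeded, otherwise the errno value it set. */
int try_alloc(size_t size)
{
    errno = 0;
    /* volatile keeps the compiler from optimising the allocation away */
    void *volatile block = malloc(size);
    if (block == NULL)
        return errno;
    free(block);
    return 0;
}
```

With this helper, try_alloc(SIZE_MAX) is expected to report ENOMEM at once, with no OOM killer involved, because the failure happens inside malloc before any memory is touched.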
I think errno will be set to ENOMEM, a macro defined in errno.h:
#define ENOMEM 12 /* Out of Memory */
This happens after you call malloc in this statement:
myblock = (void *) malloc(MEGABYTE);
and the function returns NULL because the system is out of memory.
I found this SO question very interesting.
Hope it helps!

Best way to invoke gdb from inside program to print its stacktrace?

Using a function like this:
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>
void print_trace() {
char pid_buf[30];
sprintf(pid_buf, "--pid=%d", getpid());
char name_buf[512];
name_buf[readlink("/proc/self/exe", name_buf, 511)]=0;
int child_pid = fork();
if (!child_pid) {
dup2(2,1); // redirect output to stderr
fprintf(stdout,"stack trace for %s pid=%s\n",name_buf,pid_buf);
execlp("gdb", "gdb", "--batch", "-n", "-ex", "thread", "-ex", "bt", name_buf, pid_buf, NULL);
abort(); /* If gdb failed to start */
} else {
waitpid(child_pid,NULL,0);
}
}
I see the details of print_trace in the output.
What are other ways to do it?
You mentioned on my other answer (now deleted) that you also want to see line numbers. I'm not sure how to do that when invoking gdb from inside your application.
But I'm going to share with you a couple of ways to print a simple stacktrace with function names and their respective line numbers without using gdb. Most of them came from a very nice article from Linux Journal:
Method #1:
The first method is to disseminate it with print and log messages in order to pinpoint the execution path. In a complex program, this option can become cumbersome and tedious even if, with the help of some GCC-specific macros, it can be simplified a bit. Consider, for example, a debug macro such as:
#define TRACE_MSG fprintf(stderr, __FUNCTION__ \
"() [%s:%d] here I am\n", \
__FILE__, __LINE__)
You can propagate this macro quickly throughout your program by cutting and pasting it. When you do not need it anymore, switch it off simply by defining it to no-op.
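One caveat about the quoted macro: in current GCC releases __FUNCTION__ is not a string literal, so it cannot be concatenated with the format string as written. A hedged variant that passes the function name as an argument could look like this; TRACE_DISABLED and average are made-up names for illustration.

```c
#include <assert.h>
#include <stdio.h>

#ifndef TRACE_DISABLED
#define TRACE_MSG() \
    fprintf(stderr, "%s() [%s:%d] here I am\n", __func__, __FILE__, __LINE__)
#else
#define TRACE_MSG() ((void)0)   /* no-op when tracing is switched off */
#endif

/* Hypothetical function used only to show the macro in action. */
int average(const int *values, int count)
{
    assert(values != NULL && count > 0);
    TRACE_MSG();
    long sum = 0;
    for (int i = 0; i < count; i++)
        sum += values[i];
    return (int)(sum / count);
}
```

Compiling with -DTRACE_DISABLED turns every TRACE_MSG() into a no-op, which is the "switch it off" step the article describes.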
Method #2: (It doesn't say anything about line numbers, but I do on method 4)
A nicer way to get a stack backtrace, however, is to use some of the specific support functions provided by glibc. The key one is backtrace(), which navigates the stack frames from the calling point to the beginning of the program and provides an array of return addresses. You then can map each address to the body of a particular function in your code by having a look at the object file with the nm command. Or, you can do it a simpler way--use backtrace_symbols(). This function transforms a list of return addresses, as returned by backtrace(), into a list of strings, each containing the function name, offset within the function and the return address. The list of strings is allocated from your heap space (as if you called malloc()), so you should free() it as soon as you are done with it.
I encourage you to read it since the page has source code examples. In order to convert an address to a function name you must compile your application with the -rdynamic option.
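For reference, the two glibc calls described in the quote fit together roughly like this. It is a minimal sketch assuming glibc's <execinfo.h>; print_backtrace and MAX_FRAMES are made-up names.

```c
#include <assert.h>
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_FRAMES 32          /* illustrative upper bound on the stack depth */

/* Prints the current call stack to stderr; returns the number of frames printed. */
int print_backtrace(void)
{
    void *frames[MAX_FRAMES];
    int count = backtrace(frames, MAX_FRAMES);
    assert(count >= 0 && count <= MAX_FRAMES);

    char **symbols = backtrace_symbols(frames, count);
    if (symbols == NULL)
        return 0;

    for (int i = 0; i < count; i++)
        fprintf(stderr, "[bt] #%d %s\n", i, symbols[i]);

    free(symbols);             /* one free() releases the whole array */
    return count;
}
```

Without -rdynamic the strings for functions in the executable itself usually contain only addresses, not names.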
Method #3: (A better way of doing method 2)
An even more useful application for this technique is putting a stack backtrace inside a signal handler and having the latter catch all the "bad" signals your program can receive (SIGSEGV, SIGBUS, SIGILL, SIGFPE and the like). This way, if your program unfortunately crashes and you were not running it with a debugger, you can get a stack trace and know where the fault happened. This technique also can be used to understand where your program is looping in case it stops responding.
An implementation of this technique is available here.
Method #4:
A small improvement I've done on method #3 to print line numbers. This could be copied to work on method #2 also.
Basically, I followed a tip that uses addr2line to convert addresses into file names and line numbers.
The source code below prints line numbers for all local functions. If a function from another library is called, you might see a couple of ??:0 instead of file names.
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <execinfo.h>
void bt_sighandler(int sig, struct sigcontext ctx) {
void *trace[16];
char **messages = (char **)NULL;
int i, trace_size = 0;
if (sig == SIGSEGV)
printf("Got signal %d, faulty address is %p, "
"from %p\n", sig, ctx.cr2, ctx.eip);
else
printf("Got signal %d\n", sig);
trace_size = backtrace(trace, 16);
/* overwrite sigaction with caller's address */
trace[1] = (void *)ctx.eip;
messages = backtrace_symbols(trace, trace_size);
/* skip first stack frame (points here) */
printf("[bt] Execution path:\n");
for (i=1; i<trace_size; ++i)
{
printf("[bt] #%d %s\n", i, messages[i]);
/* find first occurrence of '(' or ' ' in message[i] and assume
* everything before that is the file name. (Don't go beyond 0 though
* (string terminator)*/
size_t p = 0;
while(messages[i][p] != '(' && messages[i][p] != ' '
&& messages[i][p] != 0)
++p;
char syscom[256];
snprintf(syscom, sizeof syscom, "addr2line %p -e %.*s", trace[i], (int)p, messages[i]); /* %.*s takes an int precision */
//last parameter is the file name of the symbol
system(syscom);
}
exit(0);
}
int func_a(int a, char b) {
char *p = (char *)0xdeadbeef;
a = a + b;
*p = 10; /* CRASH here!! */
return 2*a;
}
int func_b() {
int res, a = 5;
res = 5 + func_a(a, 't');
return res;
}
int main() {
/* Install our signal handler */
struct sigaction sa;
sa.sa_handler = (void *)bt_sighandler;
sigemptyset(&sa.sa_mask);
sa.sa_flags = SA_RESTART;
sigaction(SIGSEGV, &sa, NULL);
sigaction(SIGUSR1, &sa, NULL);
/* ... add any other signal here */
/* Do something */
printf("%d\n", func_b());
}
This code should be compiled as: gcc sighandler.c -o sighandler -rdynamic
The program outputs:
Got signal 11, faulty address is 0xdeadbeef, from 0x8048975
[bt] Execution path:
[bt] #1 ./sighandler(func_a+0x1d) [0x8048975]
/home/karl/workspace/stacktrace/sighandler.c:44
[bt] #2 ./sighandler(func_b+0x20) [0x804899f]
/home/karl/workspace/stacktrace/sighandler.c:54
[bt] #3 ./sighandler(main+0x6c) [0x8048a16]
/home/karl/workspace/stacktrace/sighandler.c:74
[bt] #4 /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe6) [0x3fdbd6]
??:0
[bt] #5 ./sighandler() [0x8048781]
??:0
Update 2012/04/28: for recent Linux kernel versions, the above sigaction signature is obsolete. Also I improved it a bit by grabbing the executable name from this answer. Here is an up-to-date version:
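char* exe = 0;
int initialiseExecutableName()
{
    char link[1024];
    exe = malloc(1024); /* was C++ "new char[1024]"; needs <stdlib.h> */
    if (exe == NULL)
        exit(1);
    snprintf(link,sizeof link,"/proc/%d/exe",getpid());
    ssize_t len = readlink(link,exe,1023);
    if(len==-1) {
        fprintf(stderr,"ERRORRRRR\n");
        exit(1);
    }
    exe[len] = '\0'; /* readlink() does not null-terminate */
    printf("Executable name initialised: %s\n",exe);
    return 0;
}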
const char* getExecutableName()
{
if (exe == 0)
initialiseExecutableName();
return exe;
}
/* get REG_EIP from ucontext.h */
#define __USE_GNU
#include <ucontext.h>
void bt_sighandler(int sig, siginfo_t *info,
void *secret) {
void *trace[16];
char **messages = (char **)NULL;
int i, trace_size = 0;
ucontext_t *uc = (ucontext_t *)secret;
/* Do something useful with siginfo_t */
if (sig == SIGSEGV)
printf("Got signal %d, faulty address is %p, "
"from %p\n", sig, info->si_addr,
uc->uc_mcontext.gregs[REG_EIP]);
else
printf("Got signal %d\n", sig);
trace_size = backtrace(trace, 16);
/* overwrite sigaction with caller's address */
trace[1] = (void *) uc->uc_mcontext.gregs[REG_EIP];
messages = backtrace_symbols(trace, trace_size);
/* skip first stack frame (points here) */
printf("[bt] Execution path:\n");
for (i=1; i<trace_size; ++i)
{
printf("[bt] %s\n", messages[i]);
/* find first occurrence of '(' or ' ' in message[i] and assume
* everything before that is the file name. (Don't go beyond 0 though
* (string terminator)*/
size_t p = 0;
while(messages[i][p] != '(' && messages[i][p] != ' '
&& messages[i][p] != 0)
++p;
char syscom[256];
snprintf(syscom, sizeof syscom, "addr2line %p -e %.*s", trace[i], (int)p, messages[i]); /* %.*s takes an int precision */
//last parameter is the filename of the symbol
system(syscom);
}
exit(0);
}
and initialise like this:
int main() {
/* Install our signal handler */
struct sigaction sa;
sa.sa_sigaction = (void *)bt_sighandler;
sigemptyset (&sa.sa_mask);
sa.sa_flags = SA_RESTART | SA_SIGINFO;
sigaction(SIGSEGV, &sa, NULL);
sigaction(SIGUSR1, &sa, NULL);
/* ... add any other signal here */
/* Do something */
printf("%d\n", func_b());
}
If you're using Linux, the standard C library includes a function called backtrace, which populates an array with frames' return addresses, and another function called backtrace_symbols, which will take the addresses from backtrace and look up the corresponding function names. These are documented in the GNU C Library manual.
Those won't show argument values, source lines, and the like, and they only apply to the calling thread. However, they should be a lot faster (and perhaps less flaky) than running GDB that way, so they have their place.
nobar posted a fantastic answer. In short:
So you want a stand-alone function that prints a stack trace with all of the features that gdb stack traces have and that doesn't terminate your application. The answer is to automate the launch of gdb in a non-interactive mode to perform just the tasks that you want.
This is done by executing gdb in a child process, using fork(), and scripting it to display a stack-trace while your application waits for it to complete. This can be performed without the use of a core-dump and without aborting the application.
I believe that this is what you are looking for, @Vi
Isn't abort() simpler?
That way if it happens in the field the customer can send you the core file (I don't know many users who are involved enough in my application to want me to force them to debug it).
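One practical note on this route: abort() only leaves a core file behind if the process's core size limit allows it. A hedged sketch of raising the soft limit at startup, assuming a POSIX system; enable_core_dumps is a made-up name, and where the core file ends up still depends on system configuration.

```c
#include <assert.h>
#include <sys/resource.h>

/* Raises the soft core-file size limit to the hard limit; returns 0 on success. */
int enable_core_dumps(void)
{
    struct rlimit limit;
    if (getrlimit(RLIMIT_CORE, &limit) != 0)
        return -1;

    limit.rlim_cur = limit.rlim_max;   /* the soft limit may be raised up to the hard limit */
    return setrlimit(RLIMIT_CORE, &limit) == 0 ? 0 : -1;
}
```

If the hard limit itself is 0, the call still succeeds but no core file will be written.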
