Get function name that made a (illegal) memory access - c

For debugging reasons I am trying to print the name of the function that made an illegal memory access (out of range for example).
I have written a SIGSEGV signal handler to print the Instruction Pointer (IP) and the faulted memory address, but I was not able to create a method such that I can get the function name and not the IP.
Is there a way to modify the signature of the signal handler to pass the __FUNCTION__ variable as an argument, or is there another method to do this?
Note: The program is written in C and I am trying to do this in a Linux environment.

You can use backtrace and backtrace_symbols_fd to print the function names with call stack.
Check the man pages for backtrace and backtrace_symbols or backtrace_symbols_fd.
An example is here:
http://man7.org/linux/man-pages/man3/backtrace.3.html
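As a rough illustration of the approach above, here is a minimal sketch of a SIGSEGV handler that prints the call stack with backtrace and backtrace_symbols_fd. It assumes glibc on Linux; the names dump_backtrace, install_segv_handler and MAX_FRAMES are made up for this example. Function names usually only show up if the program is linked with -rdynamic, and static functions may still appear as bare addresses.

```c
#define _GNU_SOURCE
#include <execinfo.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

#define MAX_FRAMES 32   /* hypothetical limit on stack depth to print */

/* Write the current call stack to fd; returns the number of frames captured. */
int dump_backtrace(int fd)
{
    void *frames[MAX_FRAMES];
    int n = backtrace(frames, MAX_FRAMES);
    /* backtrace_symbols_fd writes directly to fd and does not call malloc */
    backtrace_symbols_fd(frames, n, fd);
    return n;
}

/* SIGSEGV handler: print a short banner and the call stack, then exit. */
static void segv_handler(int sig, siginfo_t *info, void *ctx)
{
    static const char msg[] = "SIGSEGV caught, backtrace:\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)ignored;
    (void)sig;
    (void)info;   /* info->si_addr holds the faulting address */
    (void)ctx;
    dump_backtrace(STDERR_FILENO);
    _exit(1);     /* returning would re-run the faulting instruction */
}

/* Install the handler; returns 0 on success, -1 on failure. */
int install_segv_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = segv_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

Calling install_segv_handler() once near the start of main should be enough; the frame of the faulting function then appears a few entries below the handler's own frames in the output.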

Related

Reading variables while debugging with GDB (C)

I am a beginner with GDB debugging. I need to read variables in GDB. I used the command info variables and got this information:
0x000007c4 variable1.0
0x000007c8 variable2.1
I set a breakpoint inside the function that contains these variables, which are defined as type long *. How can I read the values they hold correctly? I tried commands such as show, display, print $variable1, p/x variable and so on.
Sorry for my grammar, I am not a native speaker.
To view the contents of memory use gdb's x/FMT ADDRESS command e.g. x/d 0x000007c4 (to display an integer sized object from address 0x000007c4 and format it in decimal).
The info variables command in gdb will list all global and static variables and their program addresses. You don't describe the language or implementation you're using, but in C the variable name "variable1.0" is not valid. Therefore it must have been created by some link editor or the compiler in a post-process. Therefore the symbol may not exist in debug information and is only accessible by directly viewing the contents of memory, which is why the gdb p command doesn't work (there is no valid expression to show you that variable because it's not a variable, but just a symbol at an address).
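One plausible origin for names of that shape, offered as a guess rather than a diagnosis: GCC typically emits function-scope static variables as local symbols with a numeric suffix (name.N). The sketch below uses the hypothetical names storage1, storage2 and read_variables to show such a declaration.

```c
#include <stddef.h>

static long storage1 = 11;   /* hypothetical data the pointers refer to */
static long storage2 = 22;

/* variable1 and variable2 are function-scope statics. They live in the data
 * segment, not on the stack, and GCC usually gives their symbols a numeric
 * suffix, which is why a symbol listing can show names C itself would reject. */
long read_variables(void)
{
    static long *variable1 = &storage1;
    static long *variable2 = &storage2;
    return *variable1 + *variable2;
}
```

If the program is built with -g and execution is stopped inside read_variables, print *variable1 should work by source name; without debug information, x/FMT on the address from info variables remains the fallback.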

What does the getcontext system call (ucontext.h) really do?

I took operating systems last year, during which I used user contexts (defined in the header ucontext.h) to implement a thread scheduler (in which each thread simulated a process) for a project. I'm taking part in a lecture and will talk about user contexts, and it just occurred to me that, despite having done this project last year, I don't really understand what exactly the getcontext system call actually does.
The man page for getcontext states that it
initializes the structure pointed at by ucp to the currently active context.
It also states, for the argument to setcontext, that if the ucp argument
was obtained by a call of getcontext(), program execution continues as if this call just returned.
Okay, so I understand that.
So here's what I'm confused about. Typically, for the way I learned it, to perform a context switch, one would initialize the ucontext_t struct and swap/set it as such:
ucontext_t ucp;
ucontext_t oucp;
getcontext(&ucp);
// Initialize the stack_t struct in the ucontext_t struct
ucp.uc_stack.ss_sp = malloc(STACK_SIZE);
ucp.uc_stack.ss_size = STACK_SIZE;
ucp.uc_stack.ss_flags = 0;
ucp.uc_link = /* some other context, or just NULL */;
// Don't block any signals in this context
sigemptyset(&ucp.uc_sigmask);
// Assume that fn is a function that takes 0 arguments and returns void
makecontext(&ucp, fn, 0);
// Perform the context switch. Function 'fn' will be active now
swapcontext(&oucp, &ucp);
// alternatively: setcontext(&ucp);
If I omit getcontext in smaller programs, nothing interesting happens. In somewhat larger programs in which there is more context switching via user contexts, I get a segmentation fault that is only resolved by adding getcontext back in.
What exactly does getcontext do? Why can't I just allocate a ucontext_t struct, initialize it by initializing the uc_stack and uc_sigmask fields, and calling makecontext without the getcontext? Is there some necessary initialization that getcontext performs that makecontext does not perform?
I looked at the GNU libc implementation for ucontext on x86/linux architectures, so, there might be different implementations for which the following does not hold.
The GNU libc manual states that:
The ucp parameter passed to the makecontext shall be initialized by a call to getcontext.
If you look at mcontext_t in glibc/sysdeps/unix/sysv/linux/x86/sys/ucontext.h, there is a pointer to the floating point state (fpregset_t fpregs) that is initialized in getcontext() and dereferenced again in setcontext(). However, it is not initialized by makecontext(). I did a quick test with GDB and got a segfault in setcontext() when it tried to dereference the pointer to the floating point context in a ucontext_t struct that had not been initialized by getcontext():
=> 0x00007ffff784308c <+44>: fldenv (%rcx)
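For reference, here is a small self-contained sketch of the full sequence with getcontext in place, so that every field makecontext leaves alone is still valid when the context is switched to. The names run_once, steps, main_ctx and co_ctx are invented for this example, and the 64 KiB stack size is an arbitrary assumption.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)   /* assumed size, large enough for fn */

static ucontext_t main_ctx, co_ctx;
static int steps;

/* Runs on the malloc'ed stack; returning resumes uc_link (main_ctx). */
static void fn(void)
{
    steps++;
}

/* Switch to fn once and come back; returns how many times fn ran, or -1. */
int run_once(void)
{
    void *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;
    steps = 0;

    /* getcontext fills in everything makecontext does not touch */
    if (getcontext(&co_ctx) == -1) {
        free(stack);
        return -1;
    }
    co_ctx.uc_stack.ss_sp = stack;
    co_ctx.uc_stack.ss_size = STACK_SIZE;
    co_ctx.uc_stack.ss_flags = 0;
    co_ctx.uc_link = &main_ctx;   /* where to continue when fn returns */
    makecontext(&co_ctx, fn, 0);

    /* Save the current context in main_ctx and activate co_ctx */
    if (swapcontext(&main_ctx, &co_ctx) == -1) {
        free(stack);
        return -1;
    }
    free(stack);
    return steps;
}
```

Removing the getcontext call from this sketch leaves co_ctx zero-initialized only because it is a static object; with a stack-allocated ucontext_t the untouched fields would hold garbage, which matches the intermittent crashes described in the question.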

What happens with the return value of main()? [duplicate]

#include <stdio.h>
int main()
{
    return 0;
}
In the code snippet given above, where does the 0 returned by main go? Or, in other words, which function called main in the first place?
main is called by some startup function in the C runtime library. The C language standard says that returning from main is equivalent to calling the exit function, so most C runtimes look something like this:
void _start(void) /* Exact function signature may vary */
{
    /* Platform-specific startup (e.g. fetch argc and argv from the OS) */
    ...
    int status = main(argc, argv);
    exit(status);
    /* Never reached */
}
The exit status gets passed back to the operating system, and then what happens from there is OS-dependent.
When you compile and link your program, the executable file format (e.g. PE or ELF) contains a start address, which is the virtual address at which execution begins. That function is typically part of the C runtime library (like the example _start above). That function has to end by calling a system call such as exit, since if it just returned, it would have nowhere to go to: it would just pop an address off the stack and jump to that location, which would be garbage.
Depending on how the OS loader initializes processes, the program arguments argc, argv, and other data (such as the environment) might either come in as function parameters (either through registers or the stack), or they might require system calls (e.g. GetCommandLine on Windows) to retrieve them. But dealing with all of that is the job of the C runtime, and unless you're explicitly going out of your way to avoid using the C runtime, you needn't worry about those details.
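The hand-off described above can be observed from a parent process. The following is a minimal sketch for POSIX systems, with child_main standing in for a program's main and observed_exit_status being a name invented here: the child passes the return value to exit, as the startup code would, and the parent reads it back with waitpid.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for a program's main(): its return value becomes the exit status. */
static int child_main(void)
{
    return 42;
}

/* Fork a child that "returns from main" and report the status the OS saw. */
int observed_exit_status(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* This is what the startup code does with main's return value */
        exit(child_main());
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    /* WEXITSTATUS extracts the low 8 bits passed to exit() */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A shell does essentially the same thing when it makes the status available as $? after a command finishes.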
Your compiler targets a certain platform, which includes operating-system specific mechanisms for starting processes. Part of that platform-specific code handles the return value of main. When you link your program into an executable, there is an OS-specific piece of binary code that your linker adds which takes care of calling main and reporting the return value back to the operating system.
The return value goes to the host environment. Typically, the operating system calls main and gets the exit status of the program.
where does the return 0 returned by main go? Or in other words which function called the main function in the beginning.
It is called by the C startup library, a stub function that is entered (almost) directly by the kernel. For example, on Linux and OS X, it's a function named _start. Unlike main(), it does not return to a caller: it passes main()'s return value to exit(), and the operating system itself uses that exit status.

Segmentation fault while accessing a function-static structure via returned pointer

I have the following structure:
struct sys_config_s
{
char server_addr[256];
char listen_port[100];
char server_port[100];
char logfile[PATH_MAX];
char pidfile[PATH_MAX];
char libfile[PATH_MAX];
int debug_flag;
unsigned long connect_delay;
};
typedef struct sys_config_s sys_config_t;
I also have a function defined in a static library (let's call it A.lib):
sys_config_t* sys_get_config(void)
{
static sys_config_t config;
return &config;
}
I then have a program (let's call it B) and a dynamic library (let's call it C). Both B and C link with A.lib. At runtime B opens C via dlopen() and then gets an address to C's function func() via a call to dlsym().
void func(void)
{
sys_get_config()->connect_delay = 1000;
}
The above code is the body of C's func() function and it produces a segmentation fault when reached. The segfault only occurs while running outside of gdb.
Why does that happen?
EDIT: Making sys_config_t config a global variable doesn't help.
The solution is trivial. Somehow, by a header mismatch, the PATH_MAX constant was defined differently in B's and C's compilation units. I need to be more careful in the future. (facepalms)
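To see why such a mismatch is enough to crash, here is a small sketch that builds the same structure with two different array lengths and compares where connect_delay ends up. The values 1024 and 4096 are hypothetical, as are the macro and function names; the original post does not say which values were involved.

```c
#include <stddef.h>

/* Declare the structure from the question with a given PATH_MAX value. */
#define DECLARE_SYS_CONFIG(name, path_max) \
    struct name {                          \
        char server_addr[256];             \
        char listen_port[100];             \
        char server_port[100];             \
        char logfile[path_max];            \
        char pidfile[path_max];            \
        char libfile[path_max];            \
        int debug_flag;                    \
        unsigned long connect_delay;       \
    }

DECLARE_SYS_CONFIG(config_small, 1024);   /* unit that allocates the object */
DECLARE_SYS_CONFIG(config_large, 4096);   /* unit that writes the field */

/* Size of the object as allocated by the unit using the small value. */
size_t size_small(void)
{
    return sizeof(struct config_small);
}

/* Offset of connect_delay as each unit computes it. */
size_t delay_offset_small(void)
{
    return offsetof(struct config_small, connect_delay);
}

size_t delay_offset_large(void)
{
    return offsetof(struct config_large, connect_delay);
}
```

With these assumed values the writer's offset for connect_delay lies beyond the end of the object that was actually allocated, so the store lands in unrelated memory and may or may not fault depending on what happens to be mapped there, which would also fit the behavior differing under gdb.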
It makes no difference whether the variable is a function-local static or a file-scope (global) static. A static variable is not allocated on demand in the stack frame of the current function call; it lives in one of the preexisting memory segments described in the executable's binary headers.
That much I'm 100% sure of. Exactly which segment it is placed in, and whether it is properly shared between modules, is another matter. I've seen similar problems with sharing global/static variables between modules, but usually the core of the problem was very specific to the exact setup.
Please bear in mind that the code sample is small and that I last worked on that platform a long time ago. What I've written above may be badly worded or even plain wrong at some points!
I think the important thing is that you are getting the segfault in C when touching that line. Setting an integer field to a constant cannot fail, provided that the target address is valid and not write-protected. That leaves two options:
- either your function sys_get_config() has crashed
- or it has returned an invalid pointer.
Since you say that the segfault is raised here, not in sys_get_config, only the latter remains: a broken pointer.
Add a trivial printf to sys_get_config that dumps the address about to be returned, then do the same in the calling function "func". Check that it is not null, and also check that the value inside sys_get_config is the same as the one seen after the return, just to be sure that the calling conventions match, etc. A good double/triple check is to also add a copy of sys_get_config (under a different name, of course) inside module "A", and to check whether the addresses returned from sys_get_config and its copy are the same. If they are not, something went very wrong during linking.
There is also a very small chance that the module loading has been deferred and you are referencing memory of a module that is not fully initialized yet. I worked on Linux a very long time ago, but I remember that dlopen has various loading options. But you wrote that you got the address via dlsym, so I suppose the module has been loaded, since you've got the symbol's final address.

How to include C backtrace in a kernel module code?

So I am trying to find out which kernel processes are calling some functions in a block driver. I thought including backtrace() from the C library would make it easy, but I am having trouble getting the backtrace to load.
I copied this example function to show the backtrace:
http://www.linuxjournal.com/files/linuxjournal.com/linuxjournal/articles/063/6391/6391l1.html
All attempts to compile have error in one place or another that a file cannot be found or that the functions are not defined.
Here is what comes closest.
In the Makefile I put the compiler directives:
-rdynamic -I/usr/include
If I leave out the second one, -I/usr/include, then the compiler reports it cannot find the required header execinfo.h.
Next, in the code where I want to do the backtrace I have copied the function from the example:
//trying to include the c backtrace capability
#include <execinfo.h>
void show_stackframe() {
void *trace[16];
char **messages = (char **)NULL;
int i, trace_size = 0;
trace_size = backtrace(trace, 16);
messages = backtrace_symbols(trace, trace_size);
printk(KERN_ERR "[bt] Execution path:\n");
for (i=0; i<trace_size; ++i)
printk(KERN_ERR "[bt] %s\n", messages[i]);
}
//backtrace function
I have put the call to this function later on, in a block driver function where the first sign of the error happens. Simply:
show_stackframe();
When I compile it, I get the following errors:
user#slinux:~/2.6-32$ make -s
Invoking make againt the kernel at /lib/modules/2.6.32-5-686/build
In file included from /usr/include/features.h:346,
from /usr/include/execinfo.h:22,
from /home/linux/2.6-32/block/block26.c:49:
/usr/include/sys/cdefs.h:287:1: warning: "__always_inline" redefined
In file included from /usr/src/linux-headers-2.6.32-5-common/include/linux/compiler-gcc.h:86,
from /usr/src/linux-headers-2.6.32-5-common/include/linux/compiler.h:40,
from /usr/src/linux-headers-2.6.32-5-common/include/linux/stddef.h:4,
from /usr/src/linux-headers-2.6.32-5-common/include/linux/list.h:4,
from /usr/src/linux-headers-2.6.32-5-common/include/linux/module.h:9,
from /home/linux/2.6-32/inc/linux_ver.h:40,
from /home/linux/2.6-32/block/block26.c:32:
/usr/src/linux-headers-2.6.32-5-common/include/linux/compiler-gcc4.h:15:1: warning: this is the location of the previous definition
/home/linux/2.6-32/block/block26.c:50: warning: function declaration isn’t a prototype
WARNING: "backtrace" [/home/linux/2.6-32/ndas_block.ko] undefined!
WARNING: "backtrace_symbols" [/home/linux/2.6-32/ndas_block.ko] undefined!
Note: block26.c is the file I am hoping to get the backtrace from.
Is there an obvious reason why the backtrace and backtrace_symbols remain undefined when it is compiled into the .ko modules?
I am guessing it is because the execinfo.h I include is the one residing on the computer, and it is not being loaded into the module.
It is my uneducated guess to say the least.
Can anyone offer a help to get the backtrace functions loading up in the module?
Thanks for looking at this inquiry.
I am working on debian. When I take out the function and such, the module compiles fine and almost works perfectly.
From ndasusers
To print the stack contents and a backtrace to the kernel log, use the dump_stack() function in your kernel module. It's declared in linux/kernel.h in the include folder in the kernel source directory.
If you need to save the stack trace and process its elements somehow, save_stack_trace() or dump_trace() might be also an option. These functions are declared in <linux/stacktrace.h> and <asm/stacktrace.h>, respectively.
It is not as easy to use these as dump_stack() but if you need more flexibility, they may be helpful.
Here is how save_stack_trace() can be used (replace HOW_MANY_ENTRIES_TO_STORE with the value that suits your needs, 16-32 is usually more than enough):
unsigned long stack_entries[HOW_MANY_ENTRIES_TO_STORE];
struct stack_trace trace = {
    .nr_entries = 0,
    .entries = &stack_entries[0],
    .max_entries = HOW_MANY_ENTRIES_TO_STORE,
    /* How many "lower entries" to skip. */
    .skip = 0
};
save_stack_trace(&trace);
Now the stack_entries array contains the appropriate call addresses. The number of elements filled in is trace.nr_entries.
One more thing to point out: if you do not want to output the stack entries that belong to the implementation of save_stack_trace(), dump_trace() or dump_stack() themselves (the number of such entries may vary between systems), the following trick can be applied when using save_stack_trace(). Use __builtin_return_address(0) as an "anchor" entry and process only the entries "not lower" than that.
I know this question is about Linux, but since it's the first result for "backtrace kernel", here's a few more solutions:
DragonFly BSD
It's print_backtrace(int count) from /sys/sys/systm.h. It's implemented in /sys/kern/kern_debug.c and/or /sys/platform/pc64/x86_64/db_trace.c. It can be found by searching for panic, which is implemented in /sys/kern/kern_shutdown.c, and calls print_backtrace(6) if DDB is defined and trace_on_panic is set, which are both defaults.
FreeBSD
It's kdb_backtrace(void) from /sys/sys/kdb.h. Likewise, it's easy to find by looking into what the panic implementation calls when trace_on_panic is true.
OpenBSD
Going the panic route, it appears to be db_stack_dump(), implemented in /sys/ddb/db_output.c. The only header mention is /sys/ddb/db_output.h.
dump_stack() is a function that prints the current stack and can thus be used to backtrace. Be careful not to put it in a frequently executed path such as a loop or a packet-receive function: it can fill your dmesg buffer and can cause a crash on an embedded device (with little memory and CPU).
This function is declared in linux/kernel.h.

Resources