segmentation fault after main returns - c

I have a long C program running on Linux that gives me a segmentation fault after main returns.
It's a long program, so I can't post it. Can you help me understand what could cause such an error?
Thank you.
Wow, those answers came really fast. Thank you all.
I think I worked it out: I forgot to malloc a string and used it as a buffer.
Now that I've malloced it, I no longer get a segmentation fault.
Once again, thank you all.

Guess: you might be accidentally corrupting the stack in main so it's lost the return address. Do you have a string buffer there that you could be overrunning?
If not, you should try:
running the program under valgrind
debugging the program with gdb to catch the crash and see where you are at that point; you can also debug the core file dumped
It might help to install glibc-debug packages if your distro has them since you'll be in glibc code at that point.
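A minimal sketch of the two habits that avoid this kind of stack or heap corruption: give the string real storage before using it as a buffer, and bound every copy by the destination size. The helper names copy_to_heap and copy_bounded are hypothetical, made up for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of src,
 * or NULL if allocation fails. The caller must free() the result. */
char *copy_to_heap(const char *src)
{
    size_t size = strlen(src) + 1;      /* +1 for the terminating '\0' */
    char *buf = malloc(size);
    if (buf == NULL)
        return NULL;
    memcpy(buf, src, size);
    return buf;
}

/* Hypothetical helper: copies src into dst without ever writing past
 * dst_size bytes. Returns 0 if the whole string fit, -1 if it was truncated. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    int needed = snprintf(dst, dst_size, "%s", src);
    return (needed >= 0 && (size_t)needed < dst_size) ? 0 : -1;
}
```

With a 4-byte destination, copy_bounded stores at most three characters plus the terminator, so a too-long input is truncated instead of overwriting whatever lies next to the buffer.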

Use GDB and print a stack trace when the SIGSEGV signal is raised. Then at least post that here so we can be a little more helpful.
Provided you compiled with:
$ gcc -g prog.c -o prog
Then run it under GDB:
$ gdb ./prog
(gdb) run
When you get the SIGSEGV signal (segmentation fault), do this:
(gdb) bt
Then look at the stack trace to see what is causing the segmentation fault.

If the segmentation fault arises after main() returns, it usually means that something defined at global scope went wrong. It is hard to help you with so little info. Send us more info!
my2c

If it's after main() returns, then according to the Standard all destructors have been run (although I wouldn't put it past an implementation to fudge this some), unless the function atexit() has been used. That function registers a function that will be called after main() returns, effectively (if I'm reading 3.6.3 aright). You might check to see if there is an atexit in your program somewhere, if only for completeness.
Depending on what you mean by "after main returns", you may be running destructors for static objects when the program crashes. Check those. (Also, post what you observed that made you think it was after main() returned. You could be wrong there.)
If not, then you've invoked undefined behavior somewhere, very likely in corrupting the stack somehow. See Rup's answer for suggestions there.
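For reference, this is roughly what an atexit registration looks like in C, so you know what to search for. The names exit_hook and register_exit_hook are hypothetical. If a hook like this touches memory that is no longer valid (for example a pointer to a local variable of main), the crash shows up only after main() has returned.

```c
#include <stdio.h>
#include <stdlib.h>

/* Runs during normal program termination, i.e. after main() returns
 * or when exit() is called. */
static void exit_hook(void)
{
    fputs("exit_hook: running after main() returned\n", stderr);
}

/* Hypothetical wrapper: registers the hook.
 * atexit() returns 0 on success and nonzero on failure. */
int register_exit_hook(void)
{
    return atexit(exit_hook);
}
```

Setting a breakpoint on each registered hook in gdb is a quick way to check whether one of them is where the program dies.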

Related

Easiest way to locate a Segmentation Fault

I encountered my first segmentation fault today (newbie programmer). After reading up on what a segmentation fault is (thanks for all of the helpful info on this site, as well as Wikipedia's lengthy explanation), I'm trying to determine the easiest way to find where my fault is occurring. The program is written in C and the error occurs on a *NIX-based system (I'm not sure which one, to be honest; 99% sure it's Linux). I can't post my code, as I'm compiling numerous files that are all quite lengthy. I was just hoping for some best practices you have all observed. Thanks for your help.
P.S. I'm thinking the error is coming from dereferencing a NULL pointer or using an uninitialized pointer. However, I could definitely be wrong.
Use a debugger such as gdb or, if that is not applicable, a tool such as strace to get better insight into where the segfault occurs.
If you use gcc, make sure you compile with the -g switch to include debugging information. gdb will then show you the exact location in the source code where the segfault happens.
For example, if we have this obvious segfaulty program:
new.c
#include <stdio.h>

int main(void)
{
    int *i = (int *)0x478734;   /* arbitrary address, not valid to read */
    printf("%d", *i);
    return 0;
}
We compile it with gcc -g new.c -o new and then start a gdb session with gdb new.
We issue the run command in the interactive session and the rest is clear:
(gdb) run
Starting program: /home/Tibor/so/new
[New Thread 9596.0x16a0]
[New Thread 9596.0x1de4]
Program received signal SIGSEGV, Segmentation fault.
0x0040118a in main () at new.c:6
6 printf("%d", *i);
(gdb)
As DasMoeh and netcoder have pointed out, once the segfault has occurred you can use the backtrace command in the interactive session to print the call stack. This can help pinpoint the location of the segfault further.
The easiest way is to use valgrind. It will pinpoint the location where the invalid access occurs (as well as other problems that didn't cause a crash but were still invalid). Of course, the real problem could be somewhere else in the code (e.g. an invalid pointer), so the next step is to check the source and, if still confused, use a debugger.
+1 for Tibor's answer.
On larger programs, or if you use additional libraries, it may also be useful to look at the backtrace with gdb: ftp://ftp.gnu.org/pub/old-gnu/Manuals/gdb/html_node/gdb_42.html
I'm reviving this post for people passing by, since I've just fixed a segfault of my own while using gcc.
You should consider using the -fsanitize=address flag, which can sometimes pinpoint your segfault with high precision.

Under what conditions can lua_close raise a Segmentation Fault error?

I'm talking about the Lua-C API. A call to lua_close(lua_State *) results in a segmentation fault, even if the state is valid. How do I know the state is valid? Because I've used it correctly up to that point.
I'd post the source but it's too long and I'm not sure it would be helpful. It simply throws a segmentation fault error at me and I have no clue why. The Lua stack is empty before the call. Can somebody help me?
The Lua C API should never result in a segmentation fault. If the segfault happens when calling lua_close, the most probable reason is that some userdata with custom __gc metamethods are failing. From the documentation of lua_close:
Destroys all objects in the given Lua state (calling the corresponding garbage-collection metamethods, if any)
The best way to determine the reason for these segfaults is to run the program under gdb and get a backtrace when the crash happens. If you compile your library with debug symbols, you should be taken exactly to the place that causes the error.
You say that the Lua stack is empty before the call. But is the function to be called on the stack? It should be, even if you call lua_call(L,0,0). Try also rebuilding Lua with API assertions on. It may give you a better error message.

Debugging a crash within C preprocessor macro with gdb

I have a C program with a multi-line macro and the program crashes within the macro, how can I pinpoint the location within the macro where the crash happens?
Here is a simplified version of my program. In reality CRASHES is multiple lines long and not easily expandable manually.
#include <stdio.h>
#include <stdarg.h>
#define CRASHES(ptr) \
    (*(ptr) == 123)
int main(void)
{
    char *foo = NULL;
    if (CRASHES(foo))
        printf("This will never happen.");
}
When compiling and running this with gdb a.out I get the expected EXC_BAD_ACCESS (I am on Mac OS X with gdb 6.3), however the crash points to line 8 and not line 4 where the crash is actually caused.
I already tried compiling the program with additional debugging flags -gdwarf-2 and -g3 as suggested by the docs and inserted several assert()s within the macro itself. Unfortunately that did not provide more information.
lots of valuable information here about macro debugging.
...another approach is to use the preprocessor, i.e. compile it using -E and copy-paste the expanded macro into your src-code and see if you can debug from there.
Of course this crashes, since you are dereferencing a NULL pointer (though that was not the question, right?). With this particular example it is easy: compile with gcc -g2, and gdb says
Program received signal SIGSEGV, Segmentation fault.
0x080483d9 in main () at crash.c:10
10 if (CRASHES(foo))
which is rather clear: you expand the macro yourself and see why (*foo == 123 accesses memory you can't read, since foo is NULL). In more complex cases, gcc -E helps, or avoid using macros.
You don't say anything about how it crashes. If it's a segfault, be aware that it might occur a bit later than when you actually dereferenced the bad pointer value.
Is there any way you could convert it to an actual function? This is one of the great evils of macros.
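A minimal sketch of that conversion for the simplified CRASHES macro from the question. The function name crashes is hypothetical, and the real multi-line macro may need more parameters. Because the body now has source lines of its own, a debugger can report and step through the exact line that faults instead of attributing everything to the call site.

```c
/* Function version of the CRASHES(ptr) macro from the question.
 * Unlike the macro, this body has its own line numbers in the debug info. */
static inline int crashes(const char *ptr)
{
    return *ptr == 123;   /* a bad ptr faults here, on this line */
}
```

The call site becomes if (crashes(foo)). Building without optimization (for example -O0 -g) keeps the function from being folded into its caller while you debug.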

On debug of c program

I have a C program which segfaults. However, when I use gdb to find out where the error is thrown, I get the following stack info. I don't understand why #1 points to ?? (). What is the possible reason for this? Thanks.
#0 __longjmp () at ../sysdeps/i386/__longjmp.S:68
#1 0x43746a57 in ?? ()
In order to debug your program, you need to compile it with debugging symbols included, which you can do by using the -g3 flag if compiling using GCC. When you run the debug version of your program in GDB and execute bt (for "backtrace") you should get a more sensible piece of output.
gdb doesn't know the name of the function so it puts ??.
Have you tried compiling with debug symbols?
If longjmp() goes astray, as it seems to here, then the problem is likely that you're abusing it: either by passing a jmp_buf that was never initialized by a setjmp() call, or by passing a jmp_buf that was set in a routine that has since returned.
For how to find out more with debugging information, see the other answers and compiling with the -g option.
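For comparison, here is a small sketch of a valid setjmp()/longjmp() pairing. The names recovery_point, fail_deep_inside and run_with_recovery are hypothetical. The key property is that the function that called setjmp() is still on the stack when longjmp() fires.

```c
#include <setjmp.h>

static jmp_buf recovery_point;

/* Jumps back to the setjmp() call in run_with_recovery(). */
static void fail_deep_inside(void)
{
    longjmp(recovery_point, 1);
}

/* Returns 1 if control came back via longjmp(), 0 otherwise. */
int run_with_recovery(void)
{
    if (setjmp(recovery_point) != 0) {
        /* Reached through longjmp(): this frame is still active, so the
         * saved context is valid. */
        return 1;
    }
    fail_deep_inside();   /* never returns normally */
    return 0;
}
```

Calling longjmp(recovery_point, 1) after run_with_recovery() has returned would be the second kind of misuse described above, and a garbage frame like #1 in the question's backtrace is a plausible result.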

How do I know which illegal address the program access when a segmentation fault happens

Also, the program runs on an ARM device running Linux, and I can print stack info and register values in the SIGSEGV handler I install.
The problem is that I can't add the -g option when compiling, since the bug may not reproduce because of the performance downgrade.
Compiling with the -g option to gcc does not cause a "performance downgrade". All it does is cause debugging symbols to be included; it does not affect the optimisation or code generation.
If you install your SIGSEGV handler using the sa_sigaction member of the sigaction struct passed to sigaction(), then the si_addr member of the siginfo_t structure passed to your handler contains the faulting address.
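A minimal sketch of such a handler, assuming a POSIX system. The names format_hex, on_segv and install_segv_handler are hypothetical. The handler sticks to write() and _exit() because printf() is not async-signal-safe.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Writes value as lowercase hex into out (no "0x" prefix) and returns the
 * number of digits. Uses no library calls, so it is safe inside a handler. */
size_t format_hex(uintptr_t value, char *out)
{
    static const char digits[] = "0123456789abcdef";
    char tmp[2 * sizeof value];
    size_t n = 0;
    do {
        tmp[n++] = digits[value & 0xf];
        value >>= 4;
    } while (value != 0);
    for (size_t i = 0; i < n; i++)
        out[i] = tmp[n - 1 - i];    /* digits were produced in reverse */
    out[n] = '\0';
    return n;
}

static void write_stderr(const char *s, size_t len)
{
    ssize_t ignored = write(STDERR_FILENO, s, len);
    (void)ignored;                  /* nothing useful to do on failure */
}

static void on_segv(int signum, siginfo_t *info, void *context)
{
    static const char prefix[] = "SIGSEGV at address 0x";
    char addr[2 * sizeof(uintptr_t) + 1];
    (void)signum;
    (void)context;
    size_t len = format_hex((uintptr_t)info->si_addr, addr);
    write_stderr(prefix, sizeof prefix - 1);
    write_stderr(addr, len);
    write_stderr("\n", 1);
    _exit(1);                       /* do not return into the faulting code */
}

/* Installs the handler; returns 0 on success, -1 on failure. */
int install_segv_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;      /* three-argument handler form */
    sa.sa_flags = SA_SIGINFO;       /* required for si_addr to be passed */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

Call install_segv_handler() once near the start of main(). The address it prints is the one the program tried to access, not the address of the faulting instruction.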
I tend to use valgrind which indicates leaks and memory access faults.
This seems to work
http://tlug.up.ac.za/wiki/index.php/Obtaining_a_stack_trace_in_C_upon_SIGSEGV
static void signal_segv(int signum, siginfo_t *info, void *ptr) {
    // info->si_addr is the illegal address
}
If you are worried about using -g on the binary that you load on the device, you may be able to use gdbserver on the ARM device with a stripped version of the executable and run arm-gdb on your development machine with the unstripped version of the executable. The stripped version and the unstripped version need to match up to do this, so do this:
# You may add your own optimization flags
arm-gcc -g program.c -o program.debug
arm-strip --strip-debug program.debug -o program
# or
arm-strip --strip-unneeded program.debug -o program
You'll need to read the gdb and gdbserver documentation to figure out how to use them. It's not that difficult, but it isn't as polished as it could be. Mainly it's very easy to accidentally tell gdb to do something that it ends up thinking you meant to do locally, so it will switch out of remote debugging mode.
You may also want to use the backtrace() function, if available, which provides the call stack at the time of the crash. It can be used to dump the stack when a C program gets a segmentation fault, bus error, or other memory violation, much as a high-level programming language would.
backtrace() is available on both Linux and Mac OS X.
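A minimal sketch of dumping the current call stack with backtrace(), assuming a libc that provides <execinfo.h> (glibc does). The name dump_stack and the limit MAX_FRAMES are hypothetical. backtrace_symbols_fd() is used rather than backtrace_symbols() because it writes straight to a file descriptor without calling malloc().

```c
#include <execinfo.h>
#include <unistd.h>

#define MAX_FRAMES 64

/* Writes the current call stack to fd, one frame per line,
 * and returns the number of frames captured. */
int dump_stack(int fd)
{
    void *frames[MAX_FRAMES];
    int count = backtrace(frames, MAX_FRAMES);
    backtrace_symbols_fd(frames, count, fd);
    return count;
}
```

Calling dump_stack(STDERR_FILENO) from a SIGSEGV handler prints the stack at the point of the crash. On Linux with gcc, linking with -rdynamic usually lets function names appear instead of bare addresses.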
If the -g option makes the error disappear, then knowing where it crashes is unlikely to be useful anyway. It's probably writing to an uninitialized pointer in function A, and then function B tries to legitimately use that memory, and dies. Memory errors are a pain.

Resources