Disable GDB breaking on x86 exceptions - c

I'm using gdb with bochs-gdb to debug a virtual memory implementation I am writing. Every time an exception 14 (page fault) is thrown gdb breaks on the handler for the exception. Is there any way I can disable this behavior so that gdb doesn't break on x86 exceptions?

You can:
handle SIGSEGV nostop
GDB will not stop for page fault but will still print a message. You can also add noprint.
Source:
"If you don't want GDB to stop for page faults, then issue the command
handle SIGSEGV nostop. GDB will still print a message for every page
fault, but it will not come back to a command prompt." link

Related

C malloc "can't allocate region" error, but can't repro with GDB?

How can I debug a C application that does not crash when attached with gdb and run inside of gdb?
It crashes consistently when run standalone - even the same debug build!
A few of us are getting this error with a C program written for BSD/Linux, and we are compiling on macOS with OpenSSL.
app(37457,0x7000017c7000) malloc: *** mach_vm_map(size=13835058055282167808) failed (error code=3)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
ERROR: malloc(buf->length + 1) failed!
I know, not helpful.
Recompiling the application with -g -rdynamic gives the same error. Ok, so now we know it isn't because of a release build as it continues to fail.
It works when running within a gdb debugging session though!!
$ sudo gdb app
(gdb) b malloc_error_break
Function "malloc_error_break" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (malloc_error_break) pending.
(gdb) run -threads 8
Starting program: ~/code/app/app -threads 8
[New Thread 0x1903 of process 45436]
warning: unhandled dyld version (15)
And it runs for hours. CTRL-C, and run ./app -threads 8 and it crashes after a second or two (a few million iterations).
Obviously there's an issue within one of the threads. But those workers for the threads are pretty big (a few hundred lines of code). Nothing stands out.
Note that the threads iterate over loops of about 20 million per second.
macOS 10.12.3
Homebrew w/GNU gcc and openssl (linking to crypto)
Ps, not familiar with C too much - especially any type of debugging. Be kind and expressive/verbose in answers. :)
One debugging technique that is sometimes overlooked is to include debug prints in the code, of course it has it's disadvantages, but also it has advantages. A thing you must keep in mind though in the face of abnormal termination is to make sure the printouts actually get printed. Often it's enough to print to stderr (but if that doesn't make the trick one may need to fflush the stream explicitly).
Another trick is to stop the program before the error occurs. This requires you to know when the program is about to crash, preferably as close as possible. You do this by using raise:
raise(SIGSTOP);
This does not terminate the program, it just suspends execution. Now you can attach with gdb using the command gdb <program-name> <pid> (use ps to find the pid of the process). Now in gdb you have to tell it to ignore SIGSTOP:
> handle SIGSTOP ignore
Then you can set break-points. You can also step out of the raise function using the finish command (may have to be issued multiple times to return to your code).
This technique makes the program have normal behaviour up to the time you decide to stop it, hopefully the final part when running under gdb would not alter the behavior enuogh.
A third option is to use valgrind. Normally when you see these kind of errors there's errors involved that valgrind will pick up. These are accesses out of range and uninitialized variables.
Many memory managers initialise memory to a known bad value to expose problems like this (e.g. Microsoft's CRT will use a range of values (0xCD means uninitialised, 0xDD means already free etc).
After each use of malloc, try memset'ing the memory to 0xCD (or some other constant value). This will allow you to identify uninitialised memory more easily with the debugger. don't use 0x00 as this is a 'normal' value and will be harder to spot if it's wrong (it will also probably 'fix' your problem).
Something like:
void *memory = malloc(sizeof(my_object));
memset(memory, 0xCD, sizeof(my_object));
If you know the size of the blocks, you could do something similar before free (this is sometimes harder unless you know the size of your objects, or track it in some way):
memset(memory, 0xDD, sizeof(my_object));
free(memory);

SIGSEGV Segmentation fault, different message

I am trying to run the program to test buffer overflow, but when program crashes it shows me SIGSEGV error as follows:
Program received signal SIGSEGV, Segmentation fault.
0x00000000004006c0 in main (argc=2, argv=0x7fffffffde78)
But the tutorial which I am following is getting the below message:
Program received signal SIGSEGV, Segmentation fault. 0x41414141 in ??
()
Due to this I am not able to get the exact memory location of buffer overflow.
I have already used -fno-stack-protector while compiling my program. because before this I was getting SIGABRT error.
Does anyone have any clue so that i can get in sync with the tutorial.
I was able to figure out the difference in both.
Actually I was trying the same code on Ubuntu 64-bit on virtual box.
But then I tried installing Ubuntu 32-bit on virtual box, so now I am also getting the same message as what was coming in the tutorial.
Also another difference which I noticed in 64 bit and 32-bit OS is that when using 32 bit we can examine the stack using $esp but in 64-bit machine we have to use $rsp
SIGSEGV is the signal raised when your program attempts to access a memory location where it is not supposed to do. Two typical scenarios are:
Deference a non-initialized pointer.
Access an array out-of-bound.
Note, however, even in these two cases, there is no guarantee that SIGSEGV always happen. So don't expect that SIGSEGV message is always the same even with the same code.

how to return to main function in gdb

I'm using gdb for debugging
I get a segmentation fault, and then I want to set another break point in the main function and run the program from the beginning
however, although I have finished the current run
and it shows "THe program is not being run"
when I input 'list'
it shows a code snippet of a libarary file
it means currently I'm not in the main function
If I re-run the program, even if I set the break point at the beginning of the main()
it still get segmentation fault, it means the program is running within the library file
so how to return to the main() function?
thanks!
tips: I'm using libpcap.h and I have a '-lpcap' option when compiling
BTW, when I use break 9
to set a breakpoint at 9, gdb runs the program to the 11-th line? what is wrong with this inaccuracy? thanks!
Simply re-issue the run command. You will lose program state, but not breakpoints which seems to match what you need.
"BTW, when I use break 9 to set a breakpoint at 9, gdb runs the program to the 11-th line" - from this, and other information you've provided, it sounds like perhaps the source code is out of sync with gdb's mapping of addresses to source lines. Have you by any chance been editing the program? Have you recompiled it and restarted gdb? Have you seen any warnings similar to "executable is more recent than source"?
If I re-run the program, even if I set the break point at the
beginning of the main() it still get segmentation fault, it means the
program is running within the library file
Actually it means that you either failed to set breakpoint on main function or program execution not reaches main and gets segmentation fault. Try the following steps:
Rebuild program from scratch with debug info (-g gcc option). Reset breakpoint and watch for any warnings from gdb.
If program still crashes with breakpoint set on main look at stack trace (bt command in gdb). It is probably happening before main and you will not see main in stack trace.

Getting better debug when Linux crashes in a C programme

We have an embedded version of Linux kernel running on a MIPs core. The Programme we have written runs a particular test suite. During one of the stress tests (runs for about 12hrs) we get a seg fault. This in turn generates a core dump.
Unfortunately the core dump is not very useful. The crash is in some system library that is dynamically linked (probably pthread or glibc). The backtrace in the core dump is not helpful because it only shows the crash point and no other callers (our user space app is built with -g -O0, but still no back trace info):
Cannot access memory at address 0x2aab1004
(gdb) bt
#0 0x2ab05d18 in ?? ()
warning: GDB can't find the start of the function at 0x2ab05d18.
GDB is unable to find the start of the function at 0x2ab05d18
and thus can't determine the size of that function's stack frame.
This means that GDB may be unable to access that stack frame, or
the frames below it.
This problem is most likely caused by an invalid program counter or
stack pointer.
However, if you think GDB should simply search farther back
from 0x2ab05d18 for code which looks like the beginning of a
function, you can increase the range of the search using the `set
heuristic-fence-post' command.
Another unfortunate-ness is that we cannot run gdb/gdbserver. gdb/gdbserver keeps breaking on __nptl_create_event. Seeing that the test creates threads, timers and destroys then every 5s it is almost impossible to sit for a long time hitting continue on them.
EDIT:
Another note, backtrace and backtrace_symbols is not supported on our toolchain.
Hence:
Is there a way of trapping seg fault and generate more backtrace data, stack pointers, call stack, etc.?
Is there a way of getting more data from a core dump that crashed in a .so file?
Thanks.
GDB can't find the start of the function at 0x2ab05d18
What is at that address at the time of the crash?
Do info shared, and find out if there is a library that contains that address.
The most likely cause of your troubles: did you run strip libpthread.so.0 before uploading it to your target? Don't do that: GDB requires libpthread.so.0 to not be stripped. If your toolchain contains libpthread.so.0 with debug symbols (and thus too large for the target), run strip -g on it, not a full strip.
Update:
info shared produced Cannot access memory at address 0x2ab05d18
This means that GDB can not access the shared library list (which would then explain the missing stack trace). The most usual cause: the binary that actually produced the core does not match the binary you gave to GDB. A less common cause: your core dump was truncated (perhaps due to ulimit -c being set too low).
If all else fails run the command using the debugger!
Just put "gdb" in form of your normal start command and enter "c"ontinue to get the process running. When the task segfaults it will return to the interactive gdb prompt rather than core dump. You should then be able to get more meaningful stack traces etc.
Another option is to use "truss" if it is available. This will tell you which system calls were being used at the time of the abend.

fatal error disappeared when running with gdb

I have a program which produces a fatal error with a testcase, and I can locate the problem by reading the log and the stack trace of the fatal - it turns out that there is a read operation upon a null pointer.
But when I try to attach gdb to it and set a breakpoint around the suspicious code, the null pointer just cannot be observed! The program works smoothly without any error.
This is a single-process, single-thread program, I didn't experience this kind of thing before. Can anyone give me some comments? Thanks.
Appended: I also tried to call pause() syscall before the fatal-trigger code, and expected to make the program sleep before fatal point and then attach the gdb on it on-the-fly, sadly, no fatal occurred.
It's only guesswork without looking at the code, but debuggers sometimes do this:
They initialize certain stuff for you
The timing of the operations is changed
I don't have a quote on GDB, but I do have one on valgrind (granted the two do wildly different things..)
My program crashes normally, but doesn't under Valgrind, or vice versa. What's happening?
When a program runs under Valgrind,
its environment is slightly different
to when it runs natively. For example,
the memory layout is different, and
the way that threads are scheduled is
different.
Same would go for GDB.
Most of the time this doesn't make any
difference, but it can, particularly
if your program is buggy.
So the true problem is likely in your program.
There can be several things happening.. The timing of the application can be changed, so if it's a multi threaded application it is possible that you for example first set the ready flag and then copy the data into the buffer, without debugger attached the other thread might access the buffer before the buffer is filled or some pointer is set.
It's could also be possible that some application has anti-debug functionality. Maybe the piece of code is never touched when running inside a debugger.
One way to analyze it is with a core dump. Which you can create by ulimit -c unlimited then start the application and when the core is dumped you could load it into gdb with gdb ./application ./core You can find a useful write-up here: http://www.ffnn.nl/pages/articles/linux/gdb-gnu-debugger-intro.php
If it is an invalid read on a pointer, then unpredictable behaviour is possible. Since you already know what is causing the fault, you should get rid of it asap. In general, expect the unexpected when dealing with faulty pointer operations.

Resources