Debugging a crash within C preprocessor macro with gdb - c

I have a C program with a multi-line macro and the program crashes within the macro, how can I pinpoint the location within the macro where the crash happens?
Here is a simplified version of my program. In reality CRASHES is multiple lines long and not easily expandable manually.
#include <stdio.h>
#include <stdarg.h>
#define CRASHES(ptr) \
(*(ptr) == 123)
main()
{
char *foo = NULL;
if (CRASHES(foo))
printf("This will never happen.");
}
When compiling and running this with gdb a.out I get the expected EXC_BAD_ACCESS (I am on Mac OS X with gdb 6.3), however the crash points to line 8 and not line 4 where the crash is actually caused.
I already tried compiling the program with additional debugging flags -gdwarf-2 and -g3 as suggested by the docs and inserted several assert()s within the macro itself. Unfortunately that did not provide more information.

lots of valuable information here about macro debugging.
...another approach is to use the preprocessor, i.e. compile it using -E and copy-paste the expanded macro into your src-code and see if you can debug from there.

Of course this crashes since you are deferencing a NULL pointer...(it was not this the question right?). With this particular example, it is easy: gcc -g2, and gdb says
Program received signal SIGSEGV, Segmentation fault.
0x080483d9 in main () at crash.c:10
10 if (CRASHES(foo))
which is rather clear, you expand by yourself the macro and see why (since *foo == 123 access memory you can't read, since foo is NULL). In more complex cases, gcc -E helps, or avoid using macros.

You don't say anything about how it crashes. If it's a segfault, be aware that it might occur a bit later than when you actually dereferenced the bad pointer value.

I there any way you could convert it to an actual function? This is one of the great evils of macros.

Related

memcpy behaves differently with optimization flags compared to without

Consider this demo programme:
#include <string.h>
#include <unistd.h>
typedef struct {
int a;
int b;
int c;
} mystruct;
int main() {
int TOO_BIG = getpagesize();
int SIZE = sizeof(mystruct);
mystruct foo = {
123, 323, 232
};
mystruct bar;
memset(&bar, 0, SIZE);
memcpy(&bar, &foo, TOO_BIG);
}
I compile this two ways:
gcc -O2 -o buffer -Wall buffer.c
gcc -g -o buffer_debug -Wall buffer.c
i.e. the first time with optimizations enabled, the second time with debug flags and no optimization.
The first thing to notice is that there are no warnings when compiling, despite getpagesize returning a value that will cause buffer overflow with memcpy.
Secondly, running the first programme produces:
*** buffer overflow detected ***: terminated
Aborted (core dumped)
whereas the second produces
*** stack smashing detected ***: terminated
Aborted (core dumped)
or, and you'll have to believe me here since I can't reproduce this with the demo programme, sometimes no warning at all. The programme doesn't even interrupt, it runs as normal. This was a behaviour I encountered with some more complex code, which made it difficult to debug until I realised that there was a buffer overflow happening.
My question is: why are there two different behaviours with different build flags? And why does this sometimes execute with no errors when built as a debug build, but always errors when built with optimizations?
..I can't reproduce this with the demo program, sometimes no warning at all...
The undefined behavior directives are very broad, there is no requirement for the compiler to issue any warnings for a program that exhibits this behavior:
why are there two different behaviours with different build flags? And why does this sometimes execute with no errors when built as a debug build, but always errors when built with optimizations?
Compiler optimizations tend to optimize away unused variables, if I compile your code with optimizations enabled I don't get a segmentation fault, looking at the assembly (link above), you'll notice that the problematic variables are optimized away, and memcpy doesn't get called, so there is no reason for it to not compile successfuly, the program exits with success code 0, whereas if don't optimize it, the undefined behavior manifests itself, and the program exits with code 139, classic segmentation fault exit code.
As you can see these results are different from yours and that is one of the features of undefined behavior, different compilers, systems or even compiler versions can behave in a completely different way.
Accessing memory behind what's been allocated is undefined behavior, which means the compiler is allowed to do anything. When there are no optimizations, the compiler may try to guess and do something reasonable. When optimizations are turned on, the compiler may take advantage of the fact that any behavior is allowed to do something that runs faster.
The first thing to notice is that there are no warnings when compiling, despite getpagesize returning a value that will cause buffer overflow with memcpy.
That is the programmer's responsibility to fix, not the compiler. You'll be very lucky if a compiler manages to find potential buffer overflows for you. Its job is to check that your code is valid C then translate it to machine code.
If you want a tool that catches bugs, they are called static analysers and that's a different type of program. At some extent, static analysis might be integrated in a compiler as a feature. There is one for clang, but most static analysers are commercial tools and not open source.
Secondly, running the first programme produces: ... whereas the second produces
Undefined behavior simply means there is no defined behavior. What is undefined behavior and how does it work?. Meaning there's not likely anything to learn from examining the results, no interesting mystery to solve. In one case it apparently accessed forbidden memory, in the other case it mangled a poor little "stack canary". The difference will be related to different memory layouts. Who cares - bugs are bugs. Focus on why the bug happened (you already know!), instead of trying to make sense of the undefined results.
Now when I run your code with optimizations actually enabled for real (gcc -O2 on an x86 Linux), the compiler gives me
main:
subq $8, %rsp
call getpagesize
xorl %eax, %eax
addq $8, %rsp
ret
With optimizations actually enabled, it didn't even bother calling memcpy & friends because there are no side effects and the variables aren't used, so they can be safely removed from the executable.

Mudflap and pointer arrays

I've just implemented a pretty complicated piece of software, but my school's testing system won't take it.
The system uses the so-called mudflap library which should be able to prevent illegal memory accesses better. As a consequence, my program generates segfaults when run on the school's testing system (I submit the source code and the testing system compiles it for itself, using the mudflap library).
I tried to isolate the problematic code in my program, and it seems that it all boils down to something as simple as pointer arrays. Mudflap doesn't seem to like them.
Below is a piece of some very simple code with that works with a pointer array:
#include <stdlib.h>
int main()
{
char** rows;
rows=(char**)malloc(sizeof(char*)*3);
rows[0]=(char*)malloc(sizeof(char)*4);
rows[1]=(char*)malloc(sizeof(char)*4);
rows[2]=(char*)malloc(sizeof(char)*4);
strcpy(rows[0], "abc");
strcpy(rows[1], "abc");
strcpy(rows[2], "abc");
free(rows[0]); free(rows[1]); free(rows[2]);
free(rows);
return 0;
This will generate a segfault with mudflap. In my opinion, this is a perfectly legal code.
Could you please explain to me what is wrong with it, and why it generates a segfault with mudflap?
Note: The program should be compiled under an amd64 linux system with g++ using the following commands:
export MUDFLAP_OPTIONS='-viol-segv -print-leaks';
g++ -Wall -pedantic -fmudflap -fmudflapir -lmudflap -g file.cpp
You have at least one problem here:
char** rows;
rows=(char**)malloc(3);
This allocates 3 bytes. On most platforms the allocator probably has a minimum of at least 4 bytes which lets you get away with overwriting the buffer a bit. I'm guessing your mudflap library is more strict in its checking and catches the overwrite.
However, if you want an array of 3 char * pointers, you probably need at least 12 bytes.
Try changing these lines to:
char** rows;
rows=(char**)malloc(3 * sizeof(char *));
EDIT: Based on your modified code, I agree it looks correct now. The only thing I can suggest is that perhaps malloc() is failing and causing a NULL pointer access. If thats not the case it sounds like a bug or misconfiguration of mudflap.

Why don't i get "Segmentation Fault"? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Why don’t I get a segmentation fault when I write beyond the end of an array?
This code compiles and runs without any error. But how?
#include <stdio.h>
int main (void)
{
int foo[2];
foo[8] = 4; /* How could this happen? */
printf("%d\n", foo[8]);
return 0;
}
I'm compiling with GCC 4.7.2 on Arch Linux x86_64.
gcc -Wall -o "main" "main.c"
Because undefined behavior doesn't mean "you will receive a segfault", that would be defined behavior.
Let's assume you're running in debug mode and your compiler is padding your stack/local variable space. You're probably just writing into some unused part of the stack space.
Build a release version on a Monday when your compiler is feeling cranky and now you overwrite the return address, or the code that sets up the call to printf, whatever. Oops.
Just one possible outcome, but you get the idea.
foo[8] may be allocated for your program (padding purpose, for instance), belong to your operating system. With an undefined behavior, anything can happen; you are unlucky, because it works.
Try
foo[1000000]=42;
and see what happens.

Easiest way to locate a Segmentation Fault

I encountered my first Segmentation Fault today (newbie programmer). After reading up on what a segmentation fault is (Thanks for all of the helpful info on this site, as well as Wikipedia's lengthy explanation), I'm trying to determine the easiest way to go about finding where my fault is occuring. It's written in C and the error is occuring on a *NIX based system (I'm not sure which one to be honest... 99% sure it's Linux). I can't exactly post my code as I have numerous files that I'm compiling that are all quite lengthy. I was just hoping for some best practices you have all observed. Thanks for your help.
P.s. I'm thinking the error is coming from dereferencing a NULL pointer or using an uninitialized pointer. However, I could definitely be wrong.
Use a debugger, such as gdb or if this is not applicable a strace tool to get a better insight into where the segfault occurs.
If you use gcc, make sure you compile with -g switch to include debugging information. Then, gdb will show you the exact location in a source code where it segfaults.
For example, if we have this obvious segfaulty program:
new.c
#include <stdio.h>
int main()
{
int *i = 0x478734;
printf("%d", *i);
}
We compile it with gcc -g new.c -o new and then run the gdb session with gdb new:
We issue the run command in the interactive session and the else is clear:
(gdb) run
Starting program: /home/Tibor/so/new
[New Thread 9596.0x16a0]
[New Thread 9596.0x1de4]
Program received signal SIGSEGV, Segmentation fault.
0x0040118a in main () at new.c:6
6 printf("%d", *i);
(gdb)
As DasMoeh and netcoder have pointed out, when segfault has occured, you can use the backtrace command in the interactive session to print a call stack. This can aid in further pinpointing the location of a segfault.
The easiest way is to use valgrind. It will pinpoint to the location where the invalid access occours (and other problems which didn't cause crash but were still invalid). Of course the real problem could be somewhere else in the code (eg: invalid pointer), so the next step is to check the source, and if still confused, use a debugger.
+1 for Tibors answer.
On larger programs or if you use additional libraries it may also be useful look at the backtrace with gdb: ftp://ftp.gnu.org/pub/old-gnu/Manuals/gdb/html_node/gdb_42.html
I reopen this posts for people passing here since I've just corrected a segfault I've made using gcc.
You should consider using the flag -fsanitize=address which can sometimes highlight your segfault with high precision.

__libc_lock_lock is segfaulting

I am working on a piece of code which uses regular expressions in c.
All of the regex stuff is using the standard regex c library.
On line 246 of regexec.c, the line is
__libc_lock_lock(dfa->lock);
My program is segfaulting here and I cannot figure out why. I was trying to find where __libc_lock_lock was defined and it turns out it is a macro in bits/libc-lock.h. However, the macro isnt actually defined to be anything, just defined.
Two questions:
1) Where is the code that is run when __libc_lock_lock is called (I know it must be
replaced with something but I dont know where that would be.
2) if dfa is a re_dfa_t object which is casted from a c string which is the buffer member of the regex_t object type, it will not have any member lock. Is this what is supposed to happen.
It really seams like there is some kind of magic going on here with this __libc_lock_lock
If the segfault is in libc then you can be 99.9% sure of the following:
You are doing something wrong with the API
You have at some previous point clobbered or corrupted memory used by libc, and this is a delayed effect. (Thanks Tyler!)
You are doing something that is pushing the API's capability
You are a developer testing the current trunk with new changes in the API implementation
I suspect that the first is the cause. Posting your API usage and your library version might help. The Regexp API in libc is pretty stable.
Look up debugging with gdb to find a stack trace of the execution path leading to the segfault, and install the glibc-devel packages for the symbols. If the segfault is in (or out) of libc ... then you have done something bad (not initialized an opaque pointer for example)
[aiden#devbox ~]$ gdb ./myProgram
(gdb) r
... Loads of stuff, segfault info ..
(gdb) bt
Will print the stack and function-names that led to the segault. Compile your source with the '-g' debug flag to keep important debugging information.
Get an authoritative source for API usage/examples!
Good Luck
In answer to your first question:
The macro is defined in the libc-lock.h; its relative path is sysdeps/mach/bits on
the glibc release I use (2.2.5). Lines 67/68 from that file are
/* Lock the named lock variable. */
#define __libc_lock_lock(NAME) __mutex_lock (&(NAME))
Run your code in gdb until you get to the segfault. Then do a backtrace to find out where it was.
Here is the set of commands you will type to do this:
gdb myprogram
run
***Make it crash***
backtrace
Typing backtrace will print the call stack and will show you what path the code has taken to get to the point where it is segfaulting.
You can go up and down in the stack to your code by typing 'up' or 'down' respectively. Then you can examine variables in that scope.
So for instance, if your backtrace command prints this:
linux_black_magic
more_linux
libc
libc
yourcode.c
Type 'up' a few times so that the stack frame is in your code instead of linux's. You can then examine variables and memory that your program is operating on. Do this:
print VariableName
x/10 &Variable
That will print the value of the variable and then will print a hex dump of memory starting at the variable.
Those are some general techniques to use with gdb and debugging, post more details for more detailed answers.

Resources