C: Segmentation fault and maybe GDB is lying to me

Here is a C function that segfaults:
void compileShaders(OGL_STATE_T *state) {
    // First testing to see if I can access object properly. Correctly outputs:
    // nsHandle: 6
    state->nsHandle = 6;
    printf("nsHandle: %d\n", state->nsHandle);
    // Next testing if glCreateProgram() returns proper value. Correctly outputs:
    // glCreateProgram: 1
    printf("glCreateProgram: %d\n", glCreateProgram());
    // Then the program segfaults on the following line according to gdb
    state->nsHandle = glCreateProgram();
}
For the record state->nsHandle is of type GLuint and glCreateProgram() returns a GLuint so that shouldn't be my problem.
gdb says that my program segfaults on line 303 which is actually the comment line before that line. I don't know if that actually matters.
Is gdb lying to me? How do I debug this?
EDIT:
Turned off optimizations (-O3) and now it's working. If somebody could explain why that would be great though.
EDIT 2:
For the purpose of the comments, here's a watered down version of the important components:
typedef struct {
    GLuint nsHandle;
} OGL_STATE_T;
int main (int argc, char *argv[]) {
    OGL_STATE_T _state, *state=&_state;
    compileShaders(state);
}
EDIT 3:
Here's a test I did:
int main(int argc, char *argv[]) {
    OGL_STATE_T _state, *state=&_state;
    // Assign value and try to print it in other function
    state->nsHandle = 5;
    compileShaders(state);
}
void compileShaders(OGL_STATE_T *state) {
    // Test to see if the first call to state is getting optimized out
    // Correctly outputs:
    // nsHandle (At entry): 5
    printf("nsHandle (At entry): %d\n", state->nsHandle);
}
Not sure if that helps anything or if the compiler would actually optimize the value from the main function.
EDIT 4:
Printed out pointer address in main and compileShaders and everything matches. So I'm gonna assume it's segfaulting somewhere else and gdb is lying to me about which line is actually causing it.

This is going to be guesswork based on what you have, but with optimization on, these lines:
state->nsHandle = 6;
printf("nsHandle: %d\n", state->nsHandle);
is probably optimized to just
printf("nsHandle: 6\n");
So the first real access to state is where the segfault occurs. With optimization on, GDB can report odd line numbers for where the issue is, because the running code may no longer map cleanly to source lines, as the example above shows.
As mentioned in the comments, state is almost certainly not initialized. Some other difference in the optimized code is causing it to point to an invalid memory area, whereas in the non-optimized code it points somewhere valid.
This might happen if you're doing something with pointers directly that prevents the optimizer from 'seeing' that a given variable is used.
A sanity check that state != 0 would be useful, but it will not help if the pointer is non-zero but invalid.
You'd need to post the calling code for anyone to tell you more. However, you asked how to debug it -- I would print (or use GDB to view) the value of state when that function is entered, I imagine it will be vastly different in optimized and non-optimized versions. Then track back to the function call to work out why that's the case.
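As a rough sketch of that advice, the function can log the pointer it receives and refuse a null one before touching it. Everything named here is a stand-in invented for illustration: state_t and gl_uint_t mimic OGL_STATE_T and GLuint, and fake_create_program() replaces glCreateProgram() so the snippet builds without OpenGL.

```c
#include <stdio.h>

/* Stand-ins for GLuint and OGL_STATE_T (assumed shapes, not the real headers). */
typedef unsigned int gl_uint_t;
typedef struct {
    gl_uint_t nsHandle;
} state_t;

/* Hypothetical replacement for glCreateProgram(), used only in this sketch. */
static gl_uint_t fake_create_program(void)
{
    return 1u;
}

/* Logs the pointer on entry so optimized and non-optimized runs can be compared.
   Returns -1 for a null pointer, 0 on success. */
static int compile_shaders_checked(state_t *state)
{
    fprintf(stderr, "compile_shaders_checked: state=%p\n", (void *)state);
    if (state == NULL)
        return -1;
    state->nsHandle = fake_create_program();
    return 0;
}
```

Comparing the logged address between the -O0 and -O3 builds shows whether the pointer itself changes. The null check only catches a zero pointer, not a non-zero invalid one.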
EDIT
You posted the calling code -- that should be fine. Are you getting warnings when compiling? (Turn all the warnings on with -Wall.) In any case my advice about printing the value of state in different scenarios still stands.
(removed comment about adding & since you edited the question again)

When you optimize your program, there is no longer a 1:1 mapping between source lines and emitted code.
Typically, the compiler will reorder the code to be more efficient for your CPU, inline function calls, and so on.
This code is wrong:
*state=_state
It should be:
*state=&_state
Well, you edited your post, so ignore the above fix.

Check for NULL before dereferencing or reading through the pointer. If the pointer you pass in, or a pointer stored in the structure, is NULL and you use it without checking, you will hit a segfault.
FYI: GDB can't lie!

I ended up starting a new thread with more relevant information and somebody found the answer. New thread is here:
GCC: Segmentation fault and debugging program that only crashes when optimized

Related

How to 'tag' a location in a C source file for a later breakpoint definition?

Problem:
I want to be able to put different potentially unique or repeated "tags" across my C code, such that I can use them in gdb to create breakpoints.
Similar Work:
Breakpoints to line-numbers: The main difference with breakpoints on source lines, is that if the code previous to the tag is modified in such a way that it results in more or less lines, a reference to the tag would still be semantically correct, a reference to the source line would not.
Labels: I am coming from my previous question, How to tell gcc to keep my unused labels?, in which I had assumed that the answer was to insert labels. After discussion with knowledgeable members of the platform, I learned that label names are not preserved after compilation. Labels not used within C are removed by the compiler.
Injecting asm labels: Related to the previous approach, if I inject asm code in the C source, certain problems arise, due to inline functions, compiler optimizations, and lack of scoping. This makes this approach not robust.
Define a dummy function: On this other question, Set GDB breakpoint in C file, there is an interesting approach, in which a "dummy" function can be placed in the code, and then add a breakpoint to the function call. The problem with this approach is that the definition of such function must be replicated for each different tag.
Is there a better solution to accomplish this? Or a different angle to attack the presented problem?
Using SDT (Statically Defined Tracing) probe points appears to satisfy all the requirements.
GDB documentation links to examples of how to define the probes.
Example use: (gdb) break -probe-stap my_probe (this should be documented in the GDB breakpoints section, but currently isn't).
You could create a dummy variable and set it to different values. Then you can use conditional watchpoints. Example:
#include <stdio.h>
static volatile int loc;
int main()
{
    loc = 1;
    puts("hello world");
    loc = 2;
    return 0;
}
(gdb) watch loc if loc == 2
Hardware watchpoint 1: loc
(gdb) r
Starting program: /tmp/a.out
hello world
Hardware watchpoint 1: loc
Old value = 1
New value = 2
main () at test.c:8
8 return 0;
You can of course wrap the assignment in a macro so you only get it in debug builds. Usual caveats apply: optimizations and inlining may be affected.
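A minimal sketch of that macro idea follows. The names DEBUG_TAG, debug_tag_loc, DISABLE_DEBUG_TAGS and tagged_work are made up for this example and do not come from the answer above.

```c
/* Hypothetical tag variable; volatile so the stores are not optimized away. */
static volatile int debug_tag_loc;

/* Define DISABLE_DEBUG_TAGS (an assumed macro name) to compile the tags out. */
#ifndef DISABLE_DEBUG_TAGS
#define DEBUG_TAG(id) ((void)(debug_tag_loc = (id)))
#else
#define DEBUG_TAG(id) ((void)0)
#endif

/* Example function carrying two tags. */
static int tagged_work(int x)
{
    DEBUG_TAG(1);   /* tag 1: before the computation */
    x *= 2;
    DEBUG_TAG(2);   /* tag 2: after the computation */
    return x + 1;
}
```

In gdb, a conditional watchpoint in the style shown above, such as watch debug_tag_loc if debug_tag_loc == 2, would then stop just after the second tag.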
Use python to search a source file for some predefined labels, and set breakpoints there:
def break_on_labels(source, label):
    """add breakpoint on each SOURCE line containing LABEL"""
    with open(source) as file:
        l = 0
        for line in file:
            l = l + 1
            if label in line:
                gdb.Breakpoint(source=source, line=l)

main_file = gdb.lookup_global_symbol("main").symtab.fullname()
break_on_labels(main_file, "BREAK-HERE")
Example:
int main(void)
{
    int a = 15;
    a = a + 23; // BREAK-HERE
    return a;
}
You could insert a #warning at each line where you want a breakpoint, then have a script to parse the file and line numbers from the compiler messages and write a .gdbinit file placing breakpoints at those locations.

VS2010, scanf, strange behaviour

I'm converting some source from VC6 to VS2010. The code is written in C++/CLI and it is an MFC application. It includes a line:
BYTE mybyte;
sscanf(source, "%x", &mybyte);
Which is fine for VC6 (for more than 15 years) but causing problems in VS2010 so I created some test code.
void test_WORD_scanf()
{
    char *source = "0xaa";
    char *format = "%x";
    int result = 0;
    try
    {
        WORD pre = -1;
        WORD target = -1;
        WORD post = -1;
        printf("Test (pre scan): stack: pre=%04x, target=%04x, post=%04x, source='%s', format='%s'\n", pre, target, post, source, format);
        result = sscanf(source, format, &target);
        printf("Test (post scan): stack: pre=%04x, target=%04x, post=%04x, source='%s', format='%s'\n", pre, target, post, source, format);
        printf("result=%x", result);
        // modification suggested by Werner Henze.
        printf("&pre=%x sizeof(pre)=%x, &target=%x, sizeof(target)=%x, &post=%x, sizeof(post)=%d\n", &pre, sizeof(pre), &target, sizeof(target), &post, sizeof(post));
    }
    catch (...)
    {
        printf("Exception: Bad luck!\n");
    }
}
Building this (in DEBUG mode) is no problem. Running it gives strange results that I cannot explain. First, I get the output from the two printf statements as expected. Then I get a run-time warning, which is the unexpected bit for me.
Test (pre scan): stack: pre=ffff, target=ffff, post=ffff, source='0xaa', format='%x'
Test (post scan): stack: pre=ffff, target=00aa, post=ffff, source='0xaa', format='%x'
result=1
Run-Time Check Failure #2 - Stack around the variable 'target' was corrupted.
Using the debugger I found out that the run time check failure is triggered on returning from the function. Does anybody know where the run time check failure comes from? I used Google but can't find any suggestion for this.
In the actual code it is not a WORD that is used in sscanf but a BYTE (and I have a BYTE version of the test function). This caused actual stack corruption with the "%x" format (overwriting variable pre with 0), while using "%hx" (which I expected to be the correct format) still causes problems by overwriting the lower byte of variable pre.
Any suggestion is welcome.
Note: I edited the example code to include the return result from sscanf()
Kind regards,
Andre Steenveld.
sscanf with %x writes an unsigned int. If you provide the address of a BYTE or a WORD then you get a buffer overflow/stack overwrite. %hx will write an unsigned short.
The solution is to have an unsigned int variable, let sscanf write to that and then set your WORD or BYTE variable to the read value.
unsigned int x;
sscanf("0xaa", "%x", &x);
BYTE b = (BYTE)x;
BTW, for your test and the message
Run-Time Check Failure #2 - Stack around the variable 'target' was corrupted.
you should also print out the addresses of the variables and you'll probably see that the compiler added some padding/security check space between the variables pre/target/post.
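The read-into-a-wide-variable-then-narrow approach could be wrapped up roughly like this. byte_t is a stand-in for the Windows BYTE typedef, and parse_hex_byte is a name invented for the sketch. The range check is an addition, not something the answer above asks for.

```c
#include <stdio.h>

/* Stand-in for the Windows BYTE typedef (assumption for portability). */
typedef unsigned char byte_t;

/* Parses a hex number such as "0xaa" or "aa" into a single byte.
   Returns 1 on success, 0 if nothing was parsed or the value exceeds 0xFF. */
static int parse_hex_byte(const char *text, byte_t *out)
{
    unsigned int value = 0;          /* %x needs an unsigned int to write to */

    if (sscanf(text, "%x", &value) != 1)
        return 0;                    /* no hex digits found */
    if (value > 0xFFu)
        return 0;                    /* would not fit in a byte */

    *out = (byte_t)value;            /* narrowing is now known to be safe */
    return 1;
}
```

Because sscanf only ever writes to the full-width value, neighbouring stack variables are never touched, whatever the size of the final target.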

Continuous "undefined reference to..."

I'm working on a program and I keep getting "undefined reference to 'dosell' " and I can't quite figure out what is going on. Here's the declaration of the function:
void dosell(int *cash, int *numchips);
The use of the function:
choice = menu();
// Execute the appropriate choice.
if (choice == 1) {
dobuy(&cash, &numchips);
}
else if (choice == 2) {
dosell(&cash, &numchips);
}
And the function itself:
void dosell(int *cash, int *numchips) {
int numsell;
// Determine the number of chips to be sold.
printf("How many chips do you want to sell?\n");
scanf("%d", &numsell);
// Print out the error message if this is too much.
if (numsell > *numchips)
printf("Sorry, you do not have that many chips. No chips sold.\n");
// Execute the transaction.
else {
(*cash) += sellchips(numsell);
(*numchips) -= numsell;
}}}
Put a declaration of the function at the beginning of the file i.e. before the main function and after the includes and defines:
void dosell(int *cash, int *numchips) ;
Transferring key comments into an answer
Is dosell() in the same file as the call to it? If not, are you linking both (all) the files to create the program?
What's with the }}} at the end of dosell(); it looks like a syntax error, unless you've accidentally managed to use a GCC extension — nested functions.
It actually takes quite a bit of effort to make GCC give you a warning about a nested function. You can do it by specifying a standard such as -std=c11 and -pedantic. However, you should not plan on using nested functions, especially not by accident.
I took a look at the }}} and it turns out one of those was misplaced. One of the braces should have been at the end of the dobuy() function, which is immediately before the dosell() function. Because of this, dosell() ended up inside dobuy(), so it was as if I hadn't written the dosell() function at all.
I observe that if your code had been indented by an automatic indenter, you would have seen that the start line for dosell() was indented, which would perhaps have tipped you off that there was something amiss. The symptoms you describe are exactly consistent with a nested function.
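For comparison, here is a sketch of the layout that avoids the problem: prototypes first, then each definition at file scope, closed by exactly one brace. Two things are assumptions made for the example. The price of 10 per chip in sellchips() is invented, since the original function is not shown, and numsell is passed as a parameter rather than read with scanf so the snippet is easy to exercise.

```c
#include <stdio.h>

/* Prototypes at file scope, before any use. */
int sellchips(int numchips);
void dosell(int *cash, int *numchips, int numsell);

/* Hypothetical pricing: 10 per chip (the real sellchips() is not shown). */
int sellchips(int numchips)
{
    return numchips * 10;
}   /* this function is closed before the next one starts */

/* Sells numsell chips if the player has that many; otherwise changes nothing. */
void dosell(int *cash, int *numchips, int numsell)
{
    if (numsell > *numchips) {
        printf("Sorry, you do not have that many chips. No chips sold.\n");
    } else {
        *cash += sellchips(numsell);
        *numchips -= numsell;
    }
}   /* exactly one closing brace ends the function */
```

With every definition starting in column one, a stray brace shows up immediately as an indented function header after running an automatic indenter.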

Counter variable passed to `msgpack_pack_int()` macro doesn't increment

I have a very odd issue trying to run this quite simple C program which is using zmq and msgpack.
There is no problem with server.c; however, in clinet.c:39 there is this msgpack_pack_int (&mpkg, i); call, and the value of i seems to be picked up as 0 and doesn't change on each iteration. I have tried a bunch of different things (e.g. making a pointer to i and using that, splitting it into a function, etc.) and nothing seems to help. I can see that msgpack_pack_int() is a macro, but why would it introduce such behaviour, and what can I do to overcome it? Is there a flag that could change the behaviour of this kind of macro (as I see it expands to an inline function)?
I have tried -Werror -Wall, with gcc and clang, and no warnings come up either ;(
I tried debugging it and i increments as expected.
I even tried this, and it would do the same thing anyway:
void pack (msgpack_packer *p, msgpack_sbuffer *b) {
    static volatile int i = 0;
    printf("\ni=%d\n", i);
    msgpack_packer_init (p, b, msgpack_sbuffer_write);
    msgpack_pack_array (p, 2);
    msgpack_pack_int (p, i++);
    msgpack_pack_str (p, "/i/am/a/clinet/");
}
I have even tried something which was supposed to be different, but no luck here either -
int count (void) {
    static int i = 0;
    i += 1; return i;
}
Can anyone see why this would happen?
Update 1: I have also recompiled the msgpack library itself without optimization flags, and that didn't change the behaviour either.
Update 2: Installed msgpack from the git repo and I still have the same issue.
It turns out that on each iteration I was doing this:
msgpack_packer_init (&mpkg, &sbuf, msgpack_sbuffer_write);
that needs to be done only once, and this should be there instead:
msgpack_sbuffer_init (&sbuf);
or:
msgpack_sbuffer_clear (&sbuf);
It seemed rather logical to put the msgpack_* functions together, and indeed that was taken from the simple example. The problem is really with the documentation; one extra comment would help!
Update: working version & version without memcpy.
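One plausible reading of why the counter looked stuck: if the buffer is never reset, each iteration appends to the old data, and a receiver that decodes only the first object keeps seeing the value from the first iteration. The toy model below illustrates just that effect. It is not the msgpack API; toy_buffer and its helpers are invented for the sketch.

```c
#include <stddef.h>

/* Toy stand-in for a growable send buffer; NOT the msgpack API. */
struct toy_buffer {
    int    items[16];
    size_t count;
};

/* Resets the buffer so the next append lands at the front. */
static void toy_clear(struct toy_buffer *b)
{
    b->count = 0;
}

/* Appends one value, silently dropping it if the buffer is full. */
static void toy_append(struct toy_buffer *b, int value)
{
    if (b->count < sizeof b->items / sizeof b->items[0])
        b->items[b->count++] = value;
}

/* What a receiver that decodes only the first object would see. */
static int toy_first(const struct toy_buffer *b)
{
    return b->count > 0 ? b->items[0] : -1;
}

/* Packs 0..n-1, one value per iteration, and returns what the receiver
   would see on the last iteration. */
static int last_seen(int n, int clear_each_time)
{
    struct toy_buffer buf;
    int seen = -1;

    toy_clear(&buf);
    for (int i = 0; i < n; i++) {
        if (clear_each_time)
            toy_clear(&buf);    /* plays the role of clearing the buffer */
        toy_append(&buf, i);
        seen = toy_first(&buf);
    }
    return seen;
}
```

Without the per-iteration clear, the first slot still holds the value from iteration zero. With it, the first slot holds the current counter.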

I have a function with a lot of return points. Is there any way that I can make gdb show me which one is returning?

I have a function with an absurd number of return points, and I don't want to caveman each one, and I don't want to next through the function. Is there any way I can do something like finish, except have it stop on the return statement?
You can try reverse debugging to find out where the function actually returns. Finish executing the current frame, then do a reverse-step, and you should stop at the return statement that was just executed.
(gdb) fin
(gdb) reverse-step
There is already a similar question.
I think you're stuck setting breakpoints. I'd write a script to generate the list of breakpoint commands to run and paste them into gdb.
Sample script (in Python):
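# filename, first and last must be set before running (see below)
with open(filename, 'r') as f:
    lines = f.readlines()
# enumerate from 1 so the numbers match gdb's 1-based source lines
break_lines = [line_num for line_num, line in enumerate(lines, start=1)
               if 'return' in line and first < line_num <= last]
break_cmds = ['b %s:%d' % (filename, line_num) for line_num in break_lines]
print('\n'.join(break_cmds))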
Set filename to the name of the file with the absurd function, first to the first line of the function (this is a quick script, not a C parser) and last to the number of the last line of the function. The output ought to be suitable for pasting into gdb.
Kind of a stretch, but the catch command can stop on many kinds of things (like forking, exiting, receiving a signal). You may be able to use catch catch (which breaks for exceptions) to do what you want in C++ if you wrap the function in try/finally. For that matter, if you break on a line inside the finally you can probably single-step through the return after that (although how much that will tell you about where it came from is highly dependent on optimization: common return cases are often folded by gcc).
How about taking this opportunity to break up what seems to be clearly a too-large function?
This question's come up before on SO. My answer from there:
Obviously you ought to refactor this function, but in C++ you can use this simple expedient to deal with this in five minutes:
class ReturnMarker
{
public:
    ReturnMarker() {}
    ~ReturnMarker()
    {
        dummy += 1; //<-- put your breakpoint here
    }
    static int dummy;
};
int ReturnMarker::dummy = 0;
and then instance a single ReturnMarker at the top of your function. When it returns, that instance will go out of scope, and you'll hit the destructor.
void LongFunction()
{
    ReturnMarker foo;
    // ...
}
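Plain C has no destructors, so a rough analogue there is to route every return through a macro that first calls one marker function. This is a different technique from the class above, offered only as a sketch. RETURN, return_marker, last_return_line and classify are all names invented for it.

```c
/* Records which source line the most recent RETURN() came from. */
static volatile int last_return_line;

/* Single place to put a breakpoint: break return_marker */
static void return_marker(int line)
{
    last_return_line = line;
}

/* Hypothetical wrapper: call the marker, then return the value. */
#define RETURN(value) do { return_marker(__LINE__); return (value); } while (0)

/* Example function with several return points. */
static int classify(int n)
{
    if (n < 0)
        RETURN(-1);
    if (n == 0)
        RETURN(0);
    RETURN(1);
}
```

With a breakpoint on return_marker, going up one frame in gdb lands on the RETURN that fired, and the line argument gives the same information. Building without optimization keeps the marker from being inlined away.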

Resources