I have an issue with my pointer to a structure variable. I just started using GDB to debug the issue. The application stops when it hits on the line of code below due to segmentation fault. ptr_var is a pointer to a structure
ptr_var->page = 0;
I discovered that ptr_var is set to an invalid memory 0x0 after a series of function calls which caused the segmentation fault when assigning the value "0" to struct member "page". The series of function calls does not have a reference to ptr_var. The old address that used to be assigned to ptr_var is still in memory. I can still still print the values of members from the struct ptr_var using the old address. GDB session below shows that I am printing a string member of the struct ptr_var using its address
(gdb) x /s *0x7e11c0
0x7e0810: "Sample String"
I couldn't tell when the variable ptr_var gets assigned an invalid address 0x0. I'm a newbie to GDB and an average C programmer. Your assistance in this matter is greatly appreciated. Thank you.
What you want to do is set a watchpoint, GDB will then stop execution every time a member of a struct is modified.
With the following example code
typedef struct {
int val;
} Foo;
int main(void) {
Foo foo;
foo.val = 5;
foo.val = 10;
}
Drop a breakpoint at the creation of the struct and execute watch -l foo.val Then every time that member is changed you will get a break. The following is my GDB session, with my input
(gdb) break test.c:8
Breakpoint 3 at 0x4006f9: file test.c, line 8.
(gdb) run
Starting program: /usr/home/sean/a.out
Breakpoint 3, main () at test.c:9
9 foo.val = 5;
(gdb) watch -l foo.val
Hardware watchpoint 4: -location foo.val
(gdb) cont
Continuing.
Hardware watchpoint 4: -location foo.val
Old value = 0
New value = 5
main () at test.c:10
(gdb) cont
Continuing.
Hardware watchpoint 4: -location foo.val
Old value = 5
New value = 10
main () at test.c:11
(gdb) cont
If you can rerun, then break at a point where ptr_var is correct you can set a watch point on ptr_var like this: (gdb) watch ptr_var. Now when you continue every time ptr_var is modified gdb should stop.
Here's an example. This does contain undefined behaviour, as I'm trying to reproduce a bug, but hopefully it should be good enough to show you what I'm suggesting:
#include <stdio.h>
#include <stdint.h>
int target1;
int target2;
void
bad_func (int **bar)
{
/* Set contents of bar. */
uintptr_t ptr = (uintptr_t) bar;
printf ("Should clear %p\n", (void *) ptr);
ptr += sizeof (int *);
printf ("Will clear %p\n", (void *) ptr);
/* Bad! We just corrupted foo (maybe). */
*((int **) ptr) = NULL;
}
int
main ()
{
int *foo = &target1;
int *bar = &target2;
printf ("&foo = %p\n", (void *) &foo);
printf ("&boo = %p\n", (void *) &bar);
bad_func (&bar);
return *foo;
}
And here's a gdb session:
(gdb) break bad_func
Breakpoint 1 at 0x400542: file watch.c, line 11.
(gdb) r
&foo = 0x7fffffffdb88
&boo = 0x7fffffffdb80
Breakpoint 1, bad_func (bar=0x7fffffffdb80) at watch.c:11
11 uintptr_t ptr = (uintptr_t) bar;
(gdb) up
#1 0x00000000004005d9 in main () at watch.c:27
27 bad_func (&bar);
(gdb) watch foo
Hardware watchpoint 2: foo
(gdb) c
Continuing.
Should clear 0x7fffffffdb80
Will clear 0x7fffffffdb88
Hardware watchpoint 2: foo
Old value = (int *) 0x60103c <target1>
New value = (int *) 0x0
bad_func (bar=0x7fffffffdb80) at watch.c:18
18 }
(gdb)
For some reason the watchpoint appears to trigger on the line after the change was made, even though I compiled this with -O0, which is a bit of a shame. Still, it's usually close enough to help identify the problem.
For such kind of problems I often use the old electric fence library, it can be used to find bug in "software that overruns the boundaries of a malloc() memory allocation". You will find all the instructions and basic usage at this page:
http://elinux.org/Electric_Fence
(At the very end of the page linked above you will find the download link)
Related
Currently I'm having a problem that when I step into a function in gdb the value of the arg changes. I can not for the life of me figure out what this is. As you can see in the function at large the value of block is 0x800000008. When I print it thats its value and when I inspect the values of args that is its value. Then when I step into write_block for some reason the value of block changes. But only for this function. When I step out the value of block is once again 0x800000008, the correct value. When I step into the next function the value of block is correct again. What gives?
The code is compiled is with the -O0 optimization flag.
Here is a c code snipped from the function mm_malloc in question
if (block == NULL) {
// Always request at least chunksize
extendsize = max(asize, chunksize);
block = extend_heap(extendsize);
// extend_heap returns an error
if (block == NULL) {
return bp;
}
remove_free_block(block); // extend_heap guarentees teh block is
// on the free list
}
// The block should be marked as free
dbg_assert(!get_alloc(block));
// Mark block as allocated
size_t block_size = get_size(block);
write_block(block, block_size, true);
// Try to split the block if too large
split_block(block, asize);
Output from GDB
(gdb) finish
Run till exit from #0 get_alloc (block=0x800000008) at mm.c:399
0x0000000000404b39 in mm_malloc (size=9904) at mm.c:1175
Value returned is $75 = false
(gdb) n
(gdb) s
get_size (block=0x800000008) at mm.c:323
(gdb) p block
$76 = (block_t *) 0x800000008
(gdb) finish
Run till exit from #0 get_size (block=0x800000008) at mm.c:323
0x0000000000404b77 in mm_malloc (size=9904) at mm.c:1178
Value returned is $77 = 14016
(gdb) step
(gdb) p block
$78 = (block_t *) 0x800000008
(gdb) step
Breakpoint 1, write_block (block=0x80000c3c8, size=14016, alloc=true) at mm.c:440
(gdb) p block
$79 = (block_t *) 0x80000c3c8
(gdb) finish
Run till exit from #0 write_block (block=0x80000c3c8, size=14016, alloc=true) at mm.c:440
mm_malloc (size=9904) at mm.c:1182
(gdb) p block
$80 = (block_t *) 0x800000008
(gdb) step
split_block (block=0x800000008, asize=9920) at mm.c:846
(gdb) p block
$81 = (block_t *) 0x800000008
(gdb) ```
When I step into the next function the value of block is correct again. What gives?
The debug info describing to GDB how to find the value of block is either incorrect / inaccurate, or GDB is not interpreting the debug info correctly.
The former is quite common, especially in optimized code, or when using older compilers. Usually executing a few stepis to step past the function prolog makes GDB display correct values again.
I am running an application through gdb and I want to set a breakpoint for any time a specific variable is accessed / changed. Is there a good method for doing this? I would also be interested in other ways to monitor a variable in C/C++ to see if/when it changes.
watch only breaks on write, rwatch let you break on read, and awatch let you break on read/write.
You can set read watchpoints on memory locations:
gdb$ rwatch *0xfeedface
Hardware read watchpoint 2: *0xfeedface
but one limitation applies to the rwatch and awatch commands; you can't use gdb variables
in expressions:
gdb$ rwatch $ebx+0xec1a04f
Expression cannot be implemented with read/access watchpoint.
So you have to expand them yourself:
gdb$ print $ebx
$13 = 0x135700
gdb$ rwatch *0x135700+0xec1a04f
Hardware read watchpoint 3: *0x135700 + 0xec1a04f
gdb$ c
Hardware read watchpoint 3: *0x135700 + 0xec1a04f
Value = 0xec34daf
0x9527d6e7 in objc_msgSend ()
Edit: Oh, and by the way. You need either hardware or software support. Software is obviously much slower. To find out if your OS supports hardware watchpoints you can see the can-use-hw-watchpoints environment setting.
gdb$ show can-use-hw-watchpoints
Debugger's willingness to use watchpoint hardware is 1.
What you're looking for is called a watchpoint.
Usage
(gdb) watch foo: watch the value of variable foo
(gdb) watch *(int*)0x12345678: watch the value pointed by an address, casted to whatever type you want
(gdb) watch a*b + c/d: watch an arbitrarily complex expression, valid in the program's native language
Watchpoints are of three kinds:
watch: gdb will break when a write occurs
rwatch: gdb will break wnen a read occurs
awatch: gdb will break in both cases
You may choose the more appropriate for your needs.
For more information, check this out.
Assuming the first answer is referring to the C-like syntax (char *)(0x135700 +0xec1a04f) then the answer to do rwatch *0x135700+0xec1a04f is incorrect. The correct syntax is rwatch *(0x135700+0xec1a04f).
The lack of ()s there caused me a great deal of pain trying to use watchpoints myself.
I just tried the following:
$ cat gdbtest.c
int abc = 43;
int main()
{
abc = 10;
}
$ gcc -g -o gdbtest gdbtest.c
$ gdb gdbtest
...
(gdb) watch abc
Hardware watchpoint 1: abc
(gdb) r
Starting program: /home/mweerden/gdbtest
...
Old value = 43
New value = 10
main () at gdbtest.c:6
6 }
(gdb) quit
So it seems possible, but you do appear to need some hardware support.
Use watch to see when a variable is written to, rwatch when it is read and awatch when it is read/written from/to, as noted above. However, please note that to use this command, you must break the program, and the variable must be in scope when you've broken the program:
Use the watch command. The argument to the watch command is an
expression that is evaluated. This implies that the variabel you want
to set a watchpoint on must be in the current scope. So, to set a
watchpoint on a non-global variable, you must have set a breakpoint
that will stop your program when the variable is in scope. You set the
watchpoint after the program breaks.
In addition to what has already been answered/commented by asksol and Paolo M
I didn't at first read understand, why do we need to cast the results. Though I read this: https://sourceware.org/gdb/onlinedocs/gdb/Set-Watchpoints.html, yet it wasn't intuitive to me..
So I did an experiment to make the result clearer:
Code: (Let's say that int main() is at Line 3; int i=0 is at Line 5 and other code.. is from Line 10)
int main()
{
int i = 0;
int j;
i = 3840 // binary 1100 0000 0000 to take into account endianness
other code..
}
then i started gdb with the executable file
in my first attempt, i set the breakpoint on the location of variable without casting, following were the results displayed
Thread 1 "testing2" h
Breakpoint 2 at 0x10040109b: file testing2.c, line 10.
(gdb) s
7 i = 3840;
(gdb) p i
$1 = 0
(gdb) p &i
$2 = (int *) 0xffffcbfc
(gdb) watch *0xffffcbfc
Hardware watchpoint 3: *0xffffcbfc
(gdb) s
[New Thread 13168.0xa74]
Thread 1 "testing2" hit Breakpoint 2, main () at testing2.c:10
10 b = a;
(gdb) p i
$3 = 3840
(gdb) p *0xffffcbfc
$4 = 3840
(gdb) p/t *0xffffcbfc
$5 = 111100000000
as we could see breakpoint was hit for line 10 which was set by me. gdb didn't break because although variable i underwent change yet the location being watched didn't change (due to endianness, since it continued to remain all 0's)
in my second attempt, i did the casting on the address of the variable to watch for all the sizeof(int) bytes. this time:
(gdb) p &i
$6 = (int *) 0xffffcbfc
(gdb) p i
$7 = 0
(gdb) watch *(int *) 0xffffcbfc
Hardware watchpoint 6: *(int *) 0xffffcbfc
(gdb) b 10
Breakpoint 7 at 0x10040109b: file testing2.c, line 10.
(gdb) i b
Num Type Disp Enb Address What
6 hw watchpoint keep y *(int *) 0xffffcbfc
7 breakpoint keep y 0x000000010040109b in main at testing2.c:10
(gdb) n
[New Thread 21508.0x3c30]
Thread 1 "testing2" hit Hardware watchpoint 6: *(int *) 0xffffcbfc
Old value = 0
New value = 3840
Thread 1 "testing2" hit Breakpoint 7, main () at testing2.c:10
10 b = a;
gdb break since it detected the value has changed.
#include <stdio.h>
typedef struct ThingStruct {
int arr[8];
int after;
} Thing;
void foo(int i) {
Thing thing;
int* ip = &thing.after;
thing.after = 12345;
printf("beforehand\n");
thing.arr[i] = 55;
printf("done\n");
}
int main() {
foo(8);
}
This code changes thing.after by accidentally going off the end of the array. I want to try to find the line where where thing.after is changing by using gdb. So I compile with -g , put a breakpoint on line 12, then put a watchpoint on thing.after, but the watchpoint doesn't trigger, even though putting a breakpoint on line 14 does show that thing.after did change.
I even tried taking the address of thing.after and setting a watchpoint on that, but it still does not trigger.
Watch point needs to be re-added each time the foo function is entered (Note that, as you are watching the local variable, it will not be valid after the stack frame exits and will be automatically deleted after the foo returns). Also, if the watched variable changes on the current line to be executed, then the watch point is not getting triggered (not sure why). For me it works when I add the watch point watch thing.after just after entering foo when on line int* ip = &thing.after;. When I continue, the watch point hits 2 times.
You didn't say which platform, what version of GDB, or what command you used to set the watchpoint.
Using gdb 7.9 on Ubuntu/x86_64, things work as I expect them to work:
(gdb) b foo
Breakpoint 1 at 0x400538: file t.c, line 10.
(gdb) r
Starting program: /tmp/a.out
Breakpoint 1, foo (i=8) at t.c:10
10 int* ip = &thing.after;
(gdb) watch thing.after
Hardware watchpoint 2: thing.after
(gdb) c
Continuing.
Hardware watchpoint 2: thing.after
Old value = 4195712
New value = 12345
foo (i=8) at t.c:12
12 printf("beforehand\n");
(gdb) c
Continuing.
beforehand
Hardware watchpoint 2: thing.after
Old value = 12345
New value = 55
foo (i=8) at t.c:14
14 printf("done\n");
(gdb) q
I see some value in some place, but unsure where it has originated in my program. How do I figure out where this value initially comes from?
I expect the following event types to be logged:
A value originated from constant, arithmetical expression or syscall - the initial event;
The value was assigned to (or retrieved from) some variable;
The value was passed as an argument or returned from some function;
The value was stored to (or retrieved from) some struct;
By annotating source code with something specific, I triggered the history dump for this value.
For example, for this sample code:
#include <stdlib.h>
struct SomeStruct {
int a;
int b;
};
struct SomeStruct *globalvar;
int f1(struct SomeStruct* par) {
return par->a;
}
int f2(struct SomeStruct* par, int q) {
par->a = q;
return par->b;
}
void trace_value(int g) {} /* dummy */
int main(void) {
int f = 31337;
globalvar = malloc(sizeof(*globalvar));
f2(globalvar, f);
struct SomeStruct q = *globalvar;
int g = f1(&q);
trace_value(g);
return 0;
}
it should return something like
value 31337 originated from constant at fate.c:18
assigned to variable at fate.c:18
retrieved from variable at fate.c:21
passed as argument to function at fate.c:21
received as arument to a function at fate.c:12
assigned to struct field at fate.c:13
copied as a part of struct at fate.c:22
retrieved from struct field at fate.c:9
returned from function at fate.c:10
assigned to variable at fate.c:23
retrieved from variable at fate.c:25
traced at fate.c:25
How do I do this or something similar? I expect Valgrind or GDB or some combination should be able to do this.
Combined idea1 of using reverse gdb and idea2 from MarkPlotnick's comment of using gdb watchpoints. Here is the demo session, more complete than in original answer:
$ gcc -ggdb -Dtrace_value=exit fate.c -o fate
$ gdb -quiet -args ./fate
Reading symbols from /home/vi/code/_/fate...done.
(gdb) break main
Breakpoint 1 at 0x8048482: file fate.c, line 18.
(gdb) r
Starting program: /home/vi/code/_/fate
warning: Could not load shared library symbols for linux-gate.so.1.
Do you need "set solib-search-path" or "set sysroot"?
Breakpoint 1, main () at fate.c:18
18 int f = 31337;
(gdb) record
(gdb) break 25
(gdb) # traced at fate.c:25
Breakpoint 2 at 0x80484d2: file fate.c, line 25.
(gdb) c
Continuing.
Breakpoint 2, main () at fate.c:25
25 trace_value(g);
(gdb) # retrieved from variable at fate.c:25
(gdb) watch g
Hardware watchpoint 3: g
(gdb) reverse-continue
Continuing.
Hardware watchpoint 3: g
Old value = 31337
New value = 134513899
0x080484ce in main () at fate.c:23
23 int g = f1(&q);
(gdb) # assigned to variable at fate.c:23
(gdb) # returned from function at fate.c:10
(gdb) reverse-step
f1 (par=0xffffd670) at fate.c:10
10 }
(gdb) list
5
6 struct SomeStruct *globalvar;
7
8 int f1(struct SomeStruct* par) {
9 return par->a;
10 }
11
12 int f2(struct SomeStruct* par, int q) {
13 par->a = q;
14 return par->b;
(gdb) # retrieved from struct field at fate.c:9
(gdb) print par
$3 = (struct SomeStruct *) 0xffffd670
(gdb) print ((struct SomeStruct *) 0xffffd670)->a
$4 = 31337
(gdb) watch ((struct SomeStruct *) 0xffffd670)->a
Hardware watchpoint 4: ((struct SomeStruct *) 0xffffd670)->a
(gdb) reverse-continue
Continuing.
Hardware watchpoint 4: ((struct SomeStruct *) 0xffffd670)->a
Old value = 31337
New value = -134716508
0x080484ba in main () at fate.c:22
22 struct SomeStruct q = *globalvar;
(gdb) # copied as a part of struct at fate.c:22
(gdb) print globalvar->a
$5 = 31337
(gdb) watch globalvar->a
Hardware watchpoint 5: globalvar->a
(gdb) reverse-continue
Continuing.
Hardware watchpoint 5: globalvar->a
Old value = 31337
New value = 0
0x0804846f in f2 (par=0x804a008, q=31337) at fate.c:13
13 par->a = q;
(gdb) # assigned to struct field at fate.c:13
(gdb) # received as arument to a function at fate.c:12
(gdb) list
8 int f1(struct SomeStruct* par) {
9 return par->a;
10 }
11
12 int f2(struct SomeStruct* par, int q) {
13 par->a = q;
14 return par->b;
15 }
16
17 int main() {
(gdb) bt
#0 0x0804846f in f2 (par=0x804a008, q=31337) at fate.c:13
#1 0x080484b0 in main () at fate.c:21
(gdb) reverse-finish
Run back to call of #0 0x0804846f in f2 (par=0x804a008, q=31337) at fate.c:13
0x080484ab in main () at fate.c:21
21 f2(globalvar, f);
(gdb) # passed as argument to function at fate.c:21
(gdb) # retrieved from variable at fate.c:21
(gdb) watch f
Hardware watchpoint 6: f
(gdb) reverse-finish
"finish" not meaningful in the outermost frame.
(gdb) reverse-continue
Continuing.
Warning:
Could not insert hardware watchpoint 6.
Could not insert hardware breakpoints:
You may have requested too many hardware breakpoints/watchpoints.
(gdb) delete
Delete all breakpoints? (y or n) y
(gdb) watch f
Hardware watchpoint 7: f
(gdb) reverse-continue
Continuing.
No more reverse-execution history.
main () at fate.c:18
18 int f = 31337;
(gdb) # assigned to variable at fate.c:18
(gdb) # value 31337 originated from constant at fate.c:18
All expected messages in the question statement correspond to some info you have seen in gdb output (as shown in comments).
I believe it could be accomplished manually (i.e. running on gdb session) in runtime by technique called "reverse debugging". I haven't tried it yet, but GDB version 7.0 documentation claims, that it is supported on some platforms.
The method would be something like:
localize single step where variable is used in the last place (that is, your "starting point")
analyze source code (so you need debugging symbol and code section available) of stack frame (e.g. by list), so you get how this value is obtained (or possibly modified) witihin (e.g. from parameter passed to function)
step back to previous stack frame and repeat from previous step unless you find its origin
Here is some proof-of-concept session for your sample code. I edited it a bit, as trace_value function was undefined. Note that record command may heavily slow down program's execution.
$ gdb -q a.out
Reading symbols from /home/grzegorz/workspace/a.out...done.
(gdb) b main
Breakpoint 1 at 0x400502: file fate.c, line 22.
(gdb) run
Starting program: /home/grzegorz/workspace/a.out
Breakpoint 1, main () at fate.c:22
22 int f = 31337;
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.149.el6_6.5.x86_64
(gdb) record
(gdb) b trace_value
Breakpoint 2 at 0x4004f8: file fate.c, line 19.
(gdb) c
Continuing.
Breakpoint 2, trace_value (g=31337) at fate.c:19
19 void trace_value(int g){}
(gdb) info args
g = 31337
(gdb) reverse-finish
Run back to call of #0 trace_value (g=31337) at fate.c:19
0x0000000000400550 in main () at fate.c:29
29 trace_value(g);
(gdb) bt
#0 0x0000000000400550 in main () at fate.c:29
(gdb) list 29
24 globalvar = malloc(sizeof(*globalvar));
25 f2(globalvar, f);
26 struct SomeStruct q = *globalvar;
27 int g = f1(&q);
28
29 trace_value(g);
30
31 return 0;
32 }
Few things maybe require some explanation. You need to set breakpoint for main at first, as this is when the program execution begins, then enable session recording by record command. Then set second breakpoint at trace_value function and use continue command (c in short). This allows you to record whole execution up to moment when trace_value is entered. You may think of it as this "starting point", described above.
This of course not the full story. As I described earlier you need to analyze source code of current stack frame an then decide what to do next. You may use reverse-step or reverse-finish command accordingly to current situation.
While debugging a crash using GDB, I found the program crashes in ASSERT(). The odd thing is that the pointer contains 0x0 which points to valid data.
Sample code:
#define MAX_NUM 10;
...
...
assert(x->y != NULL);
assert(x->y->z < MAX_NUM); <-- Crashes here
I can see that 'x' points to a valid address. When I do:
(gdb) print x
$16 = 0x841eda3
(gdb) print x->y
$17 = 0x0
(gdb) print *x->y
$18 = {
...
...
z = 1;
...
}
How is this possible? Shouldn't I get "Cannot access memory at address 0x0" error from GDB?
What version of GDB?
Core dumps don't always tell the truth. They can be messed up in many ways. I've had core dumps that look valid and are not. I think you simply got lucky that it looks right.