gdb in C: Get type of variable as string - c

I want to see if it is possible to get the type of a variable in gdb as a string. For example, if
int i = 1;
MyStruct *ms = NULL;
then I want to get something like
(gdb) <the-command-I-am-looking-for> i $local_var_i
(gdb) p $local_var_i
$local_var_i = "int"
(gdb) <the-command-I-am-looking-for> ms $local_var_ms
(gdb) p $local_var_ms
$local_var_ms = "MyStruct *"
I may have allocated the local variables before the above code segment, and the command may be a custom command.
Is such a thing possible? How could I achieve this?
Edit for clarification:
I have a group of functions in my program that change name according to the type they serve (I know it's not remotely the best way to do this, but I cannot change that). I want to write a gdb function which I can feed with just the variable, and the rest will be done automatically, without my intervention. Preferably, I would also like to avoid a wall of if/else if.

If your gdb supports the python scripting extensions, you can try it like this:
define type2var
eval "python gdb.execute(\"set $%s=\\\"\"+(gdb.execute(\"ptype %s\", False, True)).strip()+\"\\\"\")",$arg1,$arg0
end
Now you can call it like this:
>>> type2var "variable" "typevar"
>>> print $typevar
$1 = "type = char [32]"
You can of course format it further as needed using python string functions.
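As a rough sketch of that further formatting, the same idea can be written as a Python command that strips the leading "type = " before storing the string. The command name type2var_py and the helper clean_type_text are made up for illustration, and the sketch assumes a GDB recent enough to provide gdb.set_convenience_variable; it uses whatis instead of ptype so that typedef and struct names are kept as written.

```python
try:
    import gdb  # only importable when the script is sourced inside GDB
except ImportError:
    gdb = None  # lets the pure-Python helper be tried outside GDB


def clean_type_text(type_output):
    """Drop the leading 'type = ' printed by ptype/whatis, plus whitespace."""
    text = type_output.strip()
    prefix = "type = "
    return text[len(prefix):] if text.startswith(prefix) else text


if gdb is not None:
    class Type2VarPy(gdb.Command):
        """type2var_py EXPR NAME: store the type of EXPR as a string in $NAME."""

        def __init__(self):
            # "type2var_py" is a hypothetical name, chosen to avoid clashing
            # with the user-defined type2var command shown above.
            super().__init__("type2var_py", gdb.COMMAND_DATA)

        def invoke(self, arg, from_tty):
            expr, name = gdb.string_to_argv(arg)
            text = gdb.execute("whatis " + expr, to_string=True)
            # Assumption: this GDB build has gdb.set_convenience_variable.
            gdb.set_convenience_variable(name, clean_type_text(text))

    Type2VarPy()
```

After sourcing the file, something like type2var_py ms typevar followed by print $typevar should then show the bare type string rather than the "type = ..." line.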

Related

How to 'tag' a location in a C source file for a later breakpoint definition?

Problem:
I want to be able to put different potentially unique or repeated "tags" across my C code, such that I can use them in gdb to create breakpoints.
Similar Work:
Breakpoints to line-numbers: The main difference with breakpoints on source lines, is that if the code previous to the tag is modified in such a way that it results in more or less lines, a reference to the tag would still be semantically correct, a reference to the source line would not.
Labels: I am coming from my previous question, How to tell gcc to keep my unused labels?, in which I preconceived the idea that the answer was to insert labels. Upon discussion with knowledgeable members of the platform, I was taught that label's names are not preserved after compilation. Labels not used within C are removed by the compiler.
Injecting asm labels: Related to the previous approach, if I inject asm code in the C source, certain problems arise, due to inline functions, compiler optimizations, and lack of scoping. This makes this approach not robust.
Define a dummy function: On this other question, Set GDB breakpoint in C file, there is an interesting approach, in which a "dummy" function can be placed in the code, and then add a breakpoint to the function call. The problem with this approach is that the definition of such function must be replicated for each different tag.
Is there a better solution to accomplish this? Or a different angle to attack the presented problem?
Using SDT (Statically Defined Tracing) probe points appears to satisfy all the requirements.
GDB documentation links to examples of how to define the probes.
Example use: (gdb) break -probe-stap my_probe (this should be documented in the GDB breakpoints section, but currently isn't).
You could create a dummy variable and set it to different values. Then you can use conditional watchpoints. Example:
#include <stdio.h>
static volatile int loc;
int main()
{
    loc = 1;
    puts("hello world");
    loc = 2;
    return 0;
}
(gdb) watch loc if loc == 2
Hardware watchpoint 1: loc
(gdb) r
Starting program: /tmp/a.out
hello world
Hardware watchpoint 1: loc
Old value = 1
New value = 2
main () at test.c:8
8 return 0;
You can of course wrap the assignment in a macro so you only get it in debug builds. Usual caveats apply: optimizations and inlining may be affected.
Use python to search a source file for some predefined labels, and set breakpoints there:
def break_on_labels(source, label):
    """add breakpoint on each SOURCE line containing LABEL"""
    with open(source) as file:
        l = 0
        for line in file:
            l = l + 1
            if label in line:
                gdb.Breakpoint(source=source, line=l)
main_file = gdb.lookup_global_symbol("main").symtab.fullname()
break_on_labels(main_file, "BREAK-HERE")
Example:
int main(void)
{
    int a = 15;
    a = a + 23; // BREAK-HERE
    return a;
}
You could insert a #warning at each line where you want a breakpoint, then have a script to parse the file and line numbers from the compiler messages and write a .gdbinit file placing breakpoints at those locations.
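A minimal sketch of such a script is shown here. It assumes GCC-style diagnostics of the form file:line:column: warning: #warning TEXT, and the function name gdb_breaks_from_warnings and the BREAK-HERE tag are only illustrative.

```python
import re

# Assumed diagnostic shape: "<file>:<line>:<col>: warning: #warning <text>".
# Other compilers may format this differently.
WARNING_RE = re.compile(
    r"^([^:\s]+):(\d+):(?:\d+:)?\s*warning:\s*#warning\s+(.*)$"
)


def gdb_breaks_from_warnings(compiler_output, tag="BREAK-HERE"):
    """Return one 'break FILE:LINE' command per #warning that mentions TAG."""
    commands = []
    for text in compiler_output.splitlines():
        match = WARNING_RE.match(text)
        if match and tag in match.group(3):
            commands.append("break %s:%s" % (match.group(1), match.group(2)))
    return commands


if __name__ == "__main__":
    sample = (
        "test.c: In function 'main':\n"
        "test.c:4:2: warning: #warning BREAK-HERE [-Wcpp]\n"
    )
    # The printed lines could be redirected into a .gdbinit file.
    print("\n".join(gdb_breaks_from_warnings(sample)))
```

Since the #warning line itself contains no code, gdb would normally place each breakpoint on the next line that does.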

Search and replace a string as shown below

I am reading a file, say x.c, and I have to search it for the string "shared". Once that string has been found, the following has to be done.
Example:
shared(x,n)
Output has to be
*var = &x;
*var1 = &n;
Pointers can be of any name. Output has to be written to a different file. How to do this?
I'm developing a source-to-source compiler for concurrent platforms using lex and yacc. The solution can be a routine written in C or, if possible, one using lex and yacc. Can anyone please help?
Thanks.
If, as you state, the arguments can only be variables and not any kind of other expressions, then there are a couple of simple solutions.
One is to use regular expressions, and do a simple search/replace on the whole file using a pretty simple regular expression.
Another is to simply load the entire source file into memory, search using strstr for "shared(", and use e.g. strtok to get the arguments. Copy everything else verbatim to the destination.
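The regular-expression variant could look roughly like the sketch below. It is written in Python for brevity rather than C or lex, and it assumes both arguments are plain identifiers. The names rewrite_shared, var and var1 are placeholders, and occurrences inside comments or string literals are not treated specially.

```python
import re

# Matches shared(a, b) with optional whitespace and an optional trailing ';'.
SHARED_CALL = re.compile(r"\bshared\s*\(\s*(\w+)\s*,\s*(\w+)\s*\)\s*;?")


def rewrite_shared(source_text, first="var", second="var1"):
    """Replace each shared(x, n) with two pointer assignments."""
    def expand(match):
        x, n = match.group(1), match.group(2)
        return "*%s = &%s;\n*%s = &%s;" % (first, x, second, n)
    return SHARED_CALL.sub(expand, source_text)


if __name__ == "__main__":
    print(rewrite_shared("shared(x,n)"))
```

To write the result to a different file, the input can be read in full, passed through rewrite_shared, and written out under the new name. Everything that does not match is copied verbatim.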
Take advantage of the C preprocessor.
Put this at the top of the file
#define shared(x,n) { *var = &(x); *var1 = &(n); }
and run it through cpp. This will also pull in external includes and expand all other macros, but you can simply remove all #something lines from the code, convert using the injected preprocessor rules, and then re-add them.
By the way, why not a simple macro set in a header file for the developer to include?
A doubt: where do var and var1 come from?
EDIT: corrected as shown by johnchen902
When it comes to the preprocessor, I'd do this:
#define shared(x,n) (*var=&(x),*var1=&(n))
Why do I think it's better than esseks's answer?
Consider this situation:
if( someBool )
    shared(x,n);
else { /* something else */ }
With esseks's answer it expands to:
if( someBool )
    { *var = &x; *var1 = &n; }; // compile error
else { /* something else */ }
And with my answer it expands to:
if( someBool )
    (*var=&(x),*var1=&(n)); // good!
else { /* something else */ }

How do I get the base type of a variable using GDB/MI

Using the GDB machine interface, is there a way to get the base type for a specific variable? For example, if I have a variable whose type is a uint32_t (from types.h) is there a way to get GDB to tell me that either that variable's basic type is an unsigned long int, or alternatively, that uint32_t is typedef'ed to an unsigned long int?
You can use the "whatis" command.
Suppose you have
typedef unsigned char BYTE;
BYTE var;
(gdb) whatis var
type = BYTE
(gdb) whatis BYTE
type = unsigned char
I know very little about gdb/mi. The following hacks use python to sidestep MI while being callable from the MI '-interpreter-exec' command. Probably not what you were imagining.
I didn't see anything obvious in the MI documentation. -var-info-type doesn't seem to do what you want, and this is similar to bug 8143 (or, I should say, it would be possible if bug 8143 were implemented):
http://sourceware.org/bugzilla/show_bug.cgi?id=8143
Part 1: implement a command that does what you want in python.
# TODO figure out how to do this without parsing the normal gdb type = output
class basetype (gdb.Command):
    """prints the base type of expr"""
    def __init__ (self):
        super (basetype, self).__init__ ("basetype", gdb.COMMAND_OBSCURE)
    def call_recursively_until_arg_eq_ret(self, arg):
        x = arg.replace('type = ', "")
        x = gdb.execute("whatis " + x, to_string=True)
        if arg != x:
            x = self.call_recursively_until_arg_eq_ret(x).replace('type = ', "")
        return x
    def invoke (self, arg, from_tty):
        gdb.execute("ptype " + self.call_recursively_until_arg_eq_ret('type = ' + arg).replace('type = ', ""))
basetype ()
Part 2: execute it using the console interpreter
source ~/git/misc-gdb-stuff/misc_gdb/base_type.py
&"source ~/git/misc-gdb-stuff/misc_gdb/base_type.py\n"
^done
-interpreter-exec console "basetype y"
~"type = union foo_t {\n"
~" int foo;\n"
~" char *y;\n"
~"}\n"
^done
-interpreter-exec console "whatis y"
~"type = foo\n"
^done
Part 3:
Note the limitation of part 2: all of your output goes to the stdout stream. If that is unacceptable, you could open a second output channel from gdb for use with your interface and write to it from python, using something like Twisted or a file.
Here is an example using Twisted; you would just need to change it to direct the 'basetype' output where you want.
https://gitorious.org/misc-gdb-stuff/misc-gdb-stuff/blobs/master/misc_gdb/twisted_gdb.py
Otherwise you could parse the stdout stream, I suppose. Either way, it's hacks on hacks.
Hope that helps.
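For what it's worth, the TODO in part 1 (avoiding the parsing of "type = " output) might be addressed with gdb.Type.strip_typedefs() from the GDB Python API. The following is only a sketch: the command name basetype2 and the helper base_type_text are invented here, and strip_typedefs() only removes typedefs at the outermost level, so a pointer to a typedef'ed type stays as it is.

```python
try:
    import gdb  # only importable when the script is sourced inside GDB
except ImportError:
    gdb = None  # lets the helper be exercised outside GDB


def base_type_text(value_type):
    """Return the typedef-free spelling of a gdb.Type (or a look-alike)."""
    return str(value_type.strip_typedefs())


if gdb is not None:
    class BaseType2(gdb.Command):
        """basetype2 EXPR: print the type of EXPR with typedefs stripped."""

        def __init__(self):
            # "basetype2" is a hypothetical name, separate from basetype above.
            super().__init__("basetype2", gdb.COMMAND_DATA)

        def invoke(self, arg, from_tty):
            value_type = gdb.parse_and_eval(arg).type
            gdb.write("type = %s\n" % base_type_text(value_type))

    BaseType2()
```

It would be invoked the same way as in part 2, for example with -interpreter-exec console "basetype2 y".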

Locate unused structures and structure-members

Some time ago we took over the responsibility of a legacy code base.
One of the quirks of this very badly structured/written code was that
it contained a number of really huge structs, each containing
hundreds of members. One of the many steps that we did was to clean
out as much of the code as possible that wasn't used, hence the need
to find unused structs/struct members.
Regarding the structs, I conjured up a combination of python, GNU
Global and ctags to list the struct members that are unused.
Basically, what I'm doing is to use ctags to generate a tags file,
the python-script below parses that file to locate all struct
members and then using GNU Global to do a lookup in the previously
generated global-database to see if that member is used in the code.
This approach have a number of quite serious flaws, but it sort of
solved the issue we faced and gave us a good start for further
cleanup.
There must be a better way to do this!
The question is: How to find unused structures and structure members
in a code base?
#!/usr/bin/env python3
import os
import string
import sys
import operator
def printheader(word):
    """generate a nice header string"""
    print("\n%s\n%s" % (word, "-" * len(word)))
class StructFreqAnalysis:
    """ add description"""
    def __init__(self):
        self.path2hfile=''
        self.name=''
        self.id=''
        self.members=[]
    def show(self):
        print('path2hfile:',self.path2hfile)
        print('name:',self.name)
        print('members:',self.members)
        print()
    def sort(self):
        return sorted(self.members, key=operator.itemgetter(1))
    def prettyprint(self):
        '''display a sorted list'''
        print('struct:',self.name)
        print('path:',self.path2hfile)
        for i in self.sort():
            print(' ',i[0],':',i[1])
        print()
f=open('tags','r')
x={} # struct_name -> class
y={} # internal tags id -> class
for i in f:
    i=i.strip()
    if 'typeref:struct:' in i:
        line=i.split()
        x[line[0]]=StructFreqAnalysis()
        x[line[0]].name=line[0]
        x[line[0]].path2hfile=line[1]
        for j in line:
            if 'typeref' in j:
                s=j.split(':')
                x[line[0]].id=s[-1]
                y[s[-1]]=x[line[0]]
f.seek(0)
for i in f:
    i=i.strip()
    if 'struct:' in i:
        items=i.split()
        name=items[0]
        id=items[-1].split(':')[-1]
        if id:
            if id in y:
                key=y[id]
                key.members.append([name,0])
f.close()
# do frequency count
for k,v in x.items():
    for i in v.members:
        cmd='global -a -s %s'%i[0] # -a absolute path. use global to give src-file for member
        g=os.popen(cmd)
        for gout in g:
            if '.c' in gout:
                gout=gout.strip()
                f=open(gout,'r')
                for line in f:
                    if '->'+i[0] in line or '.'+i[0] in line:
                        i[1]=i[1]+1
                f.close()
printheader('All structures')
for k,v in x.items():
    v.prettyprint()
#show which structs that can be removed
printheader('These structs could perhaps be removed')
for k,v in x.items():
    if len(v.members)==0:
        v.show()
printheader('Total number of probably unused members')
cnt=0
for k,v in x.items():
    for i in v.members:
        if i[1]==0:
            cnt=cnt+1
print(cnt)
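One of the flaws is the plain substring test in the frequency count: looking for '.' plus the member name also matches longer members that merely start with the same letters. A hedged sketch of a stricter check, using a word boundary, might look like this (count_member_uses is a made-up helper, not part of the script above):

```python
import re


def count_member_uses(member, source_text):
    """Count '->member' and '.member' accesses, ignoring longer names."""
    pattern = re.compile(r"(?:->|\.)\s*" + re.escape(member) + r"\b")
    return len(pattern.findall(source_text))


if __name__ == "__main__":
    # '.length' is not counted as a use of 'len'.
    print(count_member_uses("len", "p->len = s.len + s.length;"))
```

This still cannot tell apart members with the same name in different structs, so it only narrows the candidate list.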
Edit
As proposed by @Jens-Gustedt, using the compiler is a good way to do it. I'm after an approach that can do a sort of "high level" filtering before using the compiler approach.
If there are only a few structs, and if the code does no bad hacks such as accessing a struct through another type, then you could just comment out all the fields of your first struct and let the compiler tell you.
Uncomment one used field after the other until the compiler is satisfied. Then, once that compiles, do some good testing to verify the precondition that there were no hacks.
Iterate over all structs.
Definitely not pretty, but at the end you'd have at least one person who knows the code a bit.
Use Coverity. This is a wonderful tool to detect code flaws, but it is a bit costly.
Although this is a very old post, I recently did the same using python and gdb. I compiled the following snippet of code with the structure at the top of the hierarchy, then used gdb to print the type of the structure and recursed into its members.
#include <usedheader.h>
UsedStructureInTop *to_print = 0;
int main(){return 0;}
(gdb) p to_print
$1 = (UsedStructureInTop *) 0x0
(gdb) pt UsedStructureInTop
type = struct StructureTag {
members displayed here line by line
}
(gdb)
My purpose is a little different, though: it is to generate a header that contains only the structure UsedStructureInTop and the types it depends on. There are compiler options to do this, but they do not remove unused/unlinked structures found in the included header files.
Under C rules, it's possible to access struct members via another structure which has a similar layout. That means that you can access struct Foo {int a; float b; char c; }; via struct Bar { int x; float y; }; (except of course for Foo::c).
Hence, your algorithm is potentially flawed. It's bloody hard to find what you want, which BTW is why C is hard to optimize.

I have a function with a lot of return points. Is there any way that I can make gdb show me which one is returning?

I have a function with an absurd number of return points, and I don't want to caveman each one, and I don't want to next through the function. Is there any way I can do something like finish, except have it stop on the return statement?
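You can try reverse debugging to find out where the function actually returns. Finish executing the current frame, then do a reverse-step, and you should stop at the return statement that was just executed. Note that reverse execution only works if gdb has been recording the run (for example with the record command) or the target otherwise supports it.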
(gdb) fin
(gdb) reverse-step
There is already a similar question.
I think you're stuck setting breakpoints. I'd write a script to generate the list of breakpoint commands to run and paste them into gdb.
Sample script (in Python):
lines = open(filename, 'r').readlines()
# enumerate from 1 because gdb line numbers are 1-based
break_lines = [line_num for line_num, line in enumerate(lines, 1)
               if 'return' in line and first < line_num <= last]
break_cmds = ['b %s:%d' % (filename, line_num) for line_num in break_lines]
print('\n'.join(break_cmds))
Set filename to the name of the file with the absurd function, first to the first line of the function (this is a quick script, not a C parser) and last to the number of the last line of the function. The output ought to be suitable for pasting into gdb.
Kind of a stretch, but the catch command can stop on many kinds of things (like forking, exiting, receiving a signal). You may be able to use catch catch (which breaks for exceptions) to do what you want in C++ if you wrap the function in try/finally. For that matter, if you break on a line inside the finally you can probably single-step through the return after that (although how much that will tell you about where it came from is highly dependent on optimization: common return cases are often folded by gcc).
How about taking this opportunity to break up what seems to be clearly a too-large function?
This question's come up before on SO. My answer from there:
Obviously you ought to refactor this function, but in C++ you can use this simple expedient to deal with this in five minutes:
class ReturnMarker
{
public:
    ReturnMarker() {};
    ~ReturnMarker()
    {
        dummy += 1; //<-- put your breakpoint here
    }
    static int dummy;
};
int ReturnMarker::dummy = 0;
and then instance a single ReturnMarker at the top of your function. When it returns, that instance will go out of scope, and you'll hit the destructor.
void LongFunction()
{
    ReturnMarker foo;
    // ...
}
