How can I dump all the global variables and the address offsets in my executable?
This is on OS X; the app is developed with Xcode and compiled with gcc.
Thank you
If Compiled to Mach-O
Use otool or one of the other cctools utilities (otool is itself part of cctools).
If Compiled to ELF
You should be able to do this with objdump and/or readelf.
I don't have a *nix system at hand here, but objdump -s -j .data should get you close.
An easy way to get a list of global variables and their offsets from an executable on macOS (original question) is to use the nm tool. Without any additional options, it lists every symbol, variables included, together with its address.
Example:
$ nm <some binary>
...
U __ZTVN10__cxxabiv117__class_type_infoE
U __ZTVN10__cxxabiv120__si_class_type_infoE
000000010006e248 s __ZTVN7testing17TestEventListenerE
000000010006e1c0 s __ZTVN7testing22EmptyTestEventListenerE
000000010006dfa8 s __ZTVN7testing31TestPartResultReporterInterfaceE
000000010006d860 S __ZTVN7testing32ScopedFakeTestPartResultReporterE
000000010006d8d8 S __ZTVN7testing4TestE
...
Check man nm for the explanation of the codes like U, s, S, etc.
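If you only care about data symbols, the nm output can be filtered with a small script. The following is a minimal sketch, not a definitive tool: the function name data_symbols and the set of type letters treated as "data-like" are assumptions made for illustration, and the sample lines are taken from the output above.

```python
# Assumption: these nm type letters are treated as "data-like" symbols
# (data, bss, other non-text sections, common). Adjust after reading man nm.
DATA_TYPES = set("DdBbSsC")

def data_symbols(nm_text):
    """Return (address, type, name) for symbols whose type letter is in DATA_TYPES."""
    result = []
    for line in nm_text.splitlines():
        parts = line.split()
        # Defined symbols have three fields: hex address, type letter, name.
        # Undefined symbols ("U name") have only two and are skipped.
        if len(parts) == 3 and parts[1] in DATA_TYPES:
            result.append((int(parts[0], 16), parts[1], parts[2]))
    return result

sample = """\
                 U __ZTVN10__cxxabiv117__class_type_infoE
000000010006e248 s __ZTVN7testing17TestEventListenerE
000000010006d8d8 S __ZTVN7testing4TestE
"""

for addr, kind, name in data_symbols(sample):
    print(f"{addr:#x} {kind} {name}")
```

To run it against a real binary, the text could come from subprocess.run(["nm", path], capture_output=True, text=True).stdout instead of the sample string.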
In case you also want to look for string constants, there is another tool strings:
$ strings -t x <some binary>
will give you a list of string literals and their offsets.
See man strings for more details.
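To make clear what the offsets mean, here is a rough sketch of the same idea in Python: scan the raw bytes for runs of printable ASCII and report where each run starts. The function name find_strings and the minimum length of 4 are assumptions for illustration; the real strings tool has more options and slightly different rules about which characters count as printable.

```python
import re

def find_strings(data, min_len=4):
    """Return (offset, text) for each run of at least min_len printable ASCII bytes."""
    # Bytes 0x20..0x7e are the printable ASCII range (space through tilde).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [(m.start(), m.group().decode("ascii")) for m in pattern.finditer(data)]

# Hypothetical byte blob standing in for the contents of a binary.
blob = b"\x00\x01hello\x00ab\x00world!\x00"

for offset, text in find_strings(blob):
    print(f"{offset:x} {text}")
```

For a real file, read it with open(path, "rb").read() and pass the bytes to find_strings.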
I am looking for the Unix command that displays the header portion, in hex, of any executable that has been compiled by the cc compiler.
I had this once and now I can't remember it.
I just want to see the compiler-generated code at the start of any C program that I compile.
I am aware that I can use 'hexdump [filename]', but that doesn't isolate the header portion.
I hope I have explained myself well enough.
The command readelf is available on most Linux systems and has the ability to display many parts of an ELF file. You can use readelf -H to get a short synopsis of the various options.
To get just the file header, you can use readelf -h or readelf --file-header.
To see it in hex, you can use the command xxd. Given that the ELF header is 64 bytes for a 64-bit ELF file (52 bytes for a 32-bit one), you can use xxd -l 64 <file>.
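If you want to decode those 64 bytes rather than just look at them, the layout can be unpacked with Python's struct module. This is a minimal sketch that assumes a 64-bit little-endian ELF file; the function name parse_elf64_header and the choice of returned fields are mine, not part of any tool.

```python
import struct

# ELF64 file header: e_ident[16], e_type, e_machine, e_version, e_entry,
# e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize,
# e_shnum, e_shstrndx. Little-endian, no padding: 64 bytes in total.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")

def parse_elf64_header(raw):
    """Decode the first 64 bytes of a 64-bit little-endian ELF file."""
    if len(raw) < ELF64_HEADER.size:
        raise ValueError("need at least 64 bytes")
    fields = ELF64_HEADER.unpack(raw[:ELF64_HEADER.size])
    ident = fields[0]
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # ident[4] is the class (2 = 64-bit), ident[5] the byte order (1 = little-endian).
    if ident[4] != 2 or ident[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    return {
        "type": fields[1],
        "machine": fields[2],
        "entry": fields[4],
        "phoff": fields[5],
        "shoff": fields[6],
        "ehsize": fields[8],
    }
```

Usage would look like: with open(path, "rb") as f: print(parse_elf64_header(f.read(64))). The values should agree with what readelf -h reports for the same file.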
The objdump command in Linux provides thorough information on object files. It is mainly used by programmers who work on compilers, but it's also a very handy tool for other programmers when it comes to debugging. In this article, we will see how to use the objdump command through some examples.
Basic syntax of objdump is :
objdump [options] objfile...
There is a wide range of options available for this command.
In the examples below, factorial is the C program that I have compiled.
1. Display object format specific file header contents using the -p option
The following example prints the object file format specific information.
$ objdump -p factorial
2. Display the contents of all headers using the -x option
Information related to all the headers in the object file can be retrieved using the -x option.
$ objdump -x factorial
I'm trying to link a simple C program on an ARM Debian machine (a Raspberry Pi), and when linking the object file the linker returns the error in the subject.
My program is as simple as
simple.c:
int main(){
int a = 2;
int b = 3;
int c = a+b;
}
i compile it with
$>gcc -o simple.obj simple.c
and then link it with
$>ld -o simple.elf simple.obj
ld: simple.obj: access beyond end of merged section (33872)
I can't understand why...
If I try to read the elf file with objdump -d, it doesn't manage to disassemble the .text section (it only prints address, value, .word and again the value preceded by 0x), but the binary data is the same as what I get from the disassembled simple.obj.
The only difference is in the load addresses of the binary data: the elf file starts at 0x8280, the object file starts at 0x82a0.
What does all this mean?
EDIT:
this is the dump for the obj file: http://pastebin.com/YZ94kRk4
and this is the dump for the elf file: http://pastebin.com/3C3sWqrC
I tried compiling with the -c option, which makes gcc stop after assembling (without it, gcc had already done the linking), but now I have a different problem: it says that there is no _start section in my object file...
the new dumps are:
simple.obj: http://pastebin.com/t0TqmgPa
simple.elf: http://pastebin.com/qD35cnqw
You are misunderstanding the effect of the commands you ran. If you run:
$ gcc -o simple.obj simple.c
it already creates the program you want to run; it is already linked. You don't need to link it again, especially not by running ld directly unless you know what you are doing. The obj extension doesn't matter: it's just the name of the file, and the content of the file is already a complete Linux program. So if you run:
$ ./simple.obj
it will execute your code.
You usually don't call ld directly; instead you use gcc as a front end to compile and link. This is because gcc also takes care of linking in important pieces that you are not linking yourself, such as the startup code, and that's why your second attempt resulted in "no _start section" or something like that.
Could you print the output of the objdump -d command?
Btw, notice that 33872 == 0x8450.
I am not familiar with the Raspberry Pi's memory map, so if you're following a tutorial about this or have some other resource that would help me help you out, it would be great :)
I am building a project that produces multiple shared libraries and executable files. All the source files used to build these binaries are in a single /src directory, so it is not obvious which source files were used to build each of the binaries (there is a many-to-many relation).
My goal is to write a script that would parse a set of C files for each binary and make sure that only the right functions are called from them.
One option seems to be to try to extract this information from Makefile. But this does not work well with generated files and headers (due to dependence on Includes).
Another option could be to simply browse call graphs, but this would get complicated, because a lot of functions are called by using function pointers.
Any other ideas?
You can first compile your project with debug information (gcc -g) and use objdump to get which source files were included.
objdump -W <some_compiled_binary>
Dwarf format should contain the information you are looking for.
<0><b>: Abbrev Number: 1 (DW_TAG_compile_unit)
< c> DW_AT_producer : (indirect string, offset: 0x5f): GNU C 4.4.3
<10> DW_AT_language : 1 (ANSI C)
<11> DW_AT_name : (indirect string, offset: 0x28): test_3.c
<15> DW_AT_comp_dir : (indirect string, offset: 0x36): /home/auselen/trials
<19> DW_AT_low_pc : 0x82f0
<1d> DW_AT_high_pc : 0x8408
<21> DW_AT_stmt_list : 0x0
In this example, I've compiled an object file from test_3.c, which was located in the .../trials directory. Then of course you need to write some script around this to collect the related source file names.
First you need to separate the debug symbols from the binary you just compiled. Check this question on how to do so:
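A script around this could look roughly like the sketch below. It assumes the textual layout shown in the dump above (an "Abbrev Number" line starts each entry, followed by DW_AT_* attribute lines); the function name compile_units is hypothetical, and the exact objdump output format can differ between binutils versions.

```python
import re

def compile_units(dwarf_text):
    """Return (comp_dir, name) pairs for each DW_TAG_compile_unit entry."""
    units = []
    current = None
    for line in dwarf_text.splitlines():
        if "Abbrev Number" in line:
            # A new debug entry starts; only compile units are of interest.
            if current is not None:
                units.append(current)
            current = {} if "DW_TAG_compile_unit" in line else None
        elif current is not None:
            m = re.search(r"(DW_AT_name|DW_AT_comp_dir)\s*:\s*(.*)$", line)
            if m:
                # Drop the "(indirect string, offset: 0x..):" prefix if present.
                value = re.sub(
                    r"^\(indirect string, offset: 0x[0-9a-fA-F]+\):\s*", "", m.group(2)
                )
                current[m.group(1)] = value.strip()
    if current is not None:
        units.append(current)
    return [(u.get("DW_AT_comp_dir"), u.get("DW_AT_name")) for u in units]

sample = """\
<0><b>: Abbrev Number: 1 (DW_TAG_compile_unit)
< c> DW_AT_producer : (indirect string, offset: 0x5f): GNU C 4.4.3
<11> DW_AT_name : (indirect string, offset: 0x28): test_3.c
<15> DW_AT_comp_dir : (indirect string, offset: 0x36): /home/auselen/trials
"""

for comp_dir, name in compile_units(sample):
    print(comp_dir, name)
```

Feeding it the output of objdump -W for each binary gives one (directory, file) pair per compilation unit, which can then be collected per binary.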
How to generate gcc debug symbol outside the build target?
Then you can try to parse this file on your own. I know how to do so for Visual Studio but as you are using GCC I won't be able to help you further.
Here is an idea; you will need to refine it for your specific build. Make a build and log it using script (for example, script log.txt make clean all). The last (or one of the last) steps should be the linking of object files. (Tip: look for cc -o <your_binary_name>.) That line should link all the .o files, each of which should have a corresponding .c file in your tree. Then grep those .c files for all the included header files.
If you have duplicate names in your .c files in your tree, then we'll need to look at the full path in the linker line or work from the Makefile.
What Mahmood suggests below should work too. If you have an image with symbols, strings <debug_image> | grep <full_path_of_src_directory> should give you a list of C files.
You can use the Unix nm tool. It shows all symbols that are defined in the object. So you need to:
1. Run nm on your binary and grab all undefined symbols
2. Run ldd on your binary to grab the list of all its dynamic dependencies (the .so files your binary is linked to)
3. Run nm on each .so file you found in step 2.
That will give you the full list of dynamic symbols that your binary uses.
Example:
nm -C --dynamic /bin/ls
....skipping.....
00000000006186d0 A _edata
0000000000618c70 A _end
U _exit
0000000000410e34 T _fini
0000000000401d88 T _init
U _obstack_begin
U _obstack_newchunk
U _setjmp
U abort
U acl_extended_file
U bindtextdomain
U calloc
U clock_gettime
U closedir
U dcgettext
U dirfd
All the symbols marked with a capital "U" are undefined in ls itself: they are used by the ls command and must be provided by the libraries it links against.
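The matching step described above (undefined symbols of the binary against defined symbols of each library) can be sketched as follows. The function names split_symbols and resolve are assumptions, the library symbol set in the example is made up, and the nm and ldd output would have to be collected separately, for instance with subprocess.

```python
def split_symbols(nm_text):
    """Split nm output into (undefined, defined) sets of symbol names."""
    undefined, defined = set(), set()
    for line in nm_text.splitlines():
        parts = line.split(None, 2)
        if len(parts) >= 2 and parts[0] == "U":
            # Undefined symbol: "U name" with no address column.
            name = line.split(None, 1)[1].strip()
            # Some nm versions append "@VERSION"; strip it for matching.
            undefined.add(name.split("@", 1)[0])
        elif len(parts) == 3 and len(parts[1]) == 1:
            # Defined symbol: "address type name".
            defined.add(parts[2].strip().split("@", 1)[0])
    return undefined, defined

def resolve(undefined, libraries):
    """Map each undefined symbol to the first library defining it, or None."""
    return {
        sym: next((lib for lib, syms in libraries.items() if sym in syms), None)
        for sym in sorted(undefined)
    }

binary_nm = """\
                 U abort
                 U calloc
0000000000410e34 T _fini
"""
undefined, defined = split_symbols(binary_nm)
# Hypothetical library contents; in practice, run nm --dynamic on each .so from ldd.
print(resolve(undefined, {"libc.so.6": {"abort"}}))
```

Symbols that map to None are the ones not found in any of the libraries you scanned.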
If your goal is to analyze C source files, you can do that by customizing the GCC compiler. You could use MELT for that purpose (MELT is a high-level domain specific language to extend GCC) -adding your own analyzing passes coded in MELT inside GCC-, but you should first learn about GCC middle-end internal representations (Gimple, Tree, ...).
Customizing GCC takes several days of work (mostly because GCC internals are quite complex in the details).
Feel free to ask me more about MELT.
I need a way to analyze the output file of my GCC compiler for ARM. I am compiling for bare metal and I am quite concerned with size. I can use the arm-none-eabi-objdump provided by the cross-compiler, but parsing the output is not something I would be eager to do if a tool already exists for this task. Do you know of such a tool? My search turned up no results.
One more thing, every function in my own code is in its own section.
You can use nm and size to get the size of functions and ELF sections.
To get the size of the functions (and objects with static storage duration):
$ nm --print-size --size-sort --radix=d tst.o
The second column shows the size in decimal of the functions and objects.
To get the size of the sections:
$ size -A -d tst.o
The second column shows the size in decimal of the sections.
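If you do end up post-processing the nm output, it is simple to parse. The sketch below assumes the four-column layout produced by nm --print-size --radix=d (address, size, type, name); the function name largest_symbols and the sample symbols are made up for illustration.

```python
def largest_symbols(nm_text, top=10):
    """Return the `top` largest (size, type, name) rows from nm --print-size output."""
    rows = []
    for line in nm_text.splitlines():
        parts = line.split(None, 3)
        # With --print-size and --radix=d, sized symbols have four fields:
        # decimal address, decimal size, type letter, name.
        if len(parts) == 4 and parts[0].isdigit() and parts[1].isdigit():
            rows.append((int(parts[1]), parts[2], parts[3].strip()))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:top]

# Hypothetical output for a small object file.
sample = """\
00000000 00000004 D counter
00000000 00000024 T add
00000024 00000120 T main
"""

for size, kind, name in largest_symbols(sample, top=2):
    print(size, kind, name)
```

Since every function is in its own section, the same approach works on the size -A output if you prefer per-section numbers.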
The readelf utility is handy for displaying a variety of section information, including section sizes, e.g.:
arm-none-eabi-readelf -e foo.o
If you're interested in the run-time memory footprint, you can ignore the sections that do not have the 'A' (allocate) flag set.
When revisiting this question 10 years later, one must mention elf-size-analyze, a little Python-based wrapper for readelf and nm.
puncover uses objdump and a few other gcc tools to generate html pages you can easily browse to figure out where your code and data space is going.
It's a much nicer frontend than the text output of the gcc tools.
This is a little similar to "Where are static variables stored (data segment or heap or BSS)?", but not the same.
I have the address of a variable in another process, for example 0x10fb90. Where is this variable stored (data segment, heap or BSS)? Can I work out the location from just the process's pid and the variable's address?
I am working on OS X using Obj-C and C.
You have 2 options.
1. Use objdump
Something like
objdump -x a.out | grep YOUR_VARIABLE_ADDRESS
2. Use gcc's map option to generate a map file
Compile something like this in gcc
$ gcc -o foo.exe -Wl,-Map,foo.map foo.c
and now
$ grep YOUR_VARIABLE_ADDRESS foo.map
Both of these methods will show your variable's location, provided the address you supplied exists.
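Note that a plain grep only matches when the address is exactly the start of a symbol or section. A range check is more robust; here is a minimal sketch, assuming you have already extracted a table of (name, start, size) entries from the objdump output or the map file. The function name find_section and the section values below are hypothetical.

```python
def find_section(address, sections):
    """Return the name of the section whose address range contains `address`."""
    for name, start, size in sections:
        if start <= address < start + size:
            return name
    # Not inside any listed section, e.g. a heap or stack address.
    return None

# Hypothetical section table; real values come from the section headers.
sections = [
    ("__TEXT,__text", 0x1000, 0x8000),
    ("__DATA,__data", 0x10f000, 0x1000),
    ("__DATA,__bss", 0x110000, 0x2000),
]

print(find_section(0x10fb90, sections))
```

This only classifies addresses that belong to the executable image itself, and it assumes the addresses in the table match the addresses the process actually sees at run time.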
PS: The link I've added for the map file shows an example map file generated by Visual Studio linkers, but the format is typically similar in most of the map file formats generated by various linkers