How do I emulate objdump --dwarf=decodedline in .bundle files?

I've been successfully using objdump --dwarf=decodedline to find the source location of each offset in a .so file on Linux.
Unfortunately, on Mac OS X it seems that .bundle files (used as shared libraries) are not queryable in this manner.
I'm optimistic that there's something I can do, because gdb is able to correctly debug and step through code in these bundles — does anyone know what it's doing?
Further information:
The dwarfdump utility claims that the .bundle file contains no DWARF data, but that it does contain STABS data; however objdump --stabs cannot find any stabs data either.
(If it makes the question easier to answer, I don't actually need all of the offsets; being able to query the source location of any given offset would be good enough).
The bundle file I've been testing this on was generated using:
cc -dynamic -bundle -undefined suppress -flat_namespace -g -o c_location.bundle c_location.o -L. -L/Users/User/.rvm/rubies/ruby-1.8.7-p357/lib -L. -lruby -ldl -lobjc
The original c_location.o file does contain the necessary information for objdump --dwarf=decodedline to work.

So it turns out that one way to do this is to use Apple's nm -pa *.bundle to find the symbol name and the original object file for a given offset.
Once you have that, you can first use objdump -tT to find the offset of the symbol name in the original object file; and then use objdump --dwarf=decodedline as before.
Each step requires a bit of simple output parsing, but it does seem to work™. I'd be interested to hear of more robust approaches.
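For concreteness, here is the pipeline as a rough shell sketch, using the bundle from the question (the exact parsing of each tool's output is left out):

# 1. Offset -> symbol name + original object file (Apple nm)
nm -pa c_location.bundle
# 2. Symbol name -> offset inside the original object file
objdump -t c_location.o
# 3. Offset in the .o -> file and line, as on Linux
objdump --dwarf=decodedline c_location.o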

Related

OCaml: Issues linking C and OCaml

I am able to wrap C code and access it from the OCaml interpreter, but cannot build a binary! I'm 98% sure it is some linking problem, but can't find the tools to explore the linkage.
Getting even to this point was a chore (endless quantities of "Error: The external function is not available" messages), so I'll document everything I did.
A 'system' file, stuff.c:
#include <stdio.h>

int fun(int z) // Emulate a "real" subroutine
{
    printf("duuude whoa z=%d\n", z);
    return 42;
}
Compile the above as:
cc -fPIC -c stuff.c
ld -shared -o libstuff.so stuff.o
An OCaml wrapper around it, in ocmstuff.c:
#include <caml/mlvalues.h>

int fun(int z); // declare fun by hand, since stuff.c ships no header

CAMLprim value yofun(value z) {
    return Val_long(fun(Long_val(z)));
}
Build the above as:
cc -fPIC -c ocmstuff.c
ld -shared -o dllostuff.so ocmstuff.o -L . -lstuff -lc -rpath .
Yes, the rpath really is needed, else the next steps suffer. (Edit: If you don't use rpath, you'll need to use LD_LIBRARY_PATH=. instead. For the final 'production' version, you'd change the rpath to the actual library path, or do ld.so.conf trickery or install into 'standard' locations, or tell your users about LD_LIBRARY_PATH. This is just like what you'd do for any other system. The rpath solution seems to be the most stable and reliable solution.)
Next, a module declaration, stored in fapi.mli:
module Fapi : sig
  external ofun : int -> int = "yofun" ;;
end
Build above as:
ocamlc -a -o fapi.cma -intf fapi.mli -dllib -lostuff
Does it work? Yes it does:
$ rlwrap ocaml fapi.cma
        OCaml version 4.11.1

# open Fapi ;;
# Fapi.ofun 33 ;;
duuude whoa z=33
- : int = 42
#
So the wrapper works fine. Now let's compile with it. Here's myprog.ml:
open Fapi ;;
Fapi.ofun 33 ;;
Compile it:
ocamlc -c myprog.ml
ocamlc -o myprog myprog.cmo fapi.cma
The very last command spews:
File "_none_", line 1:
Error: Required module `Fapi' is unavailable
I am 98% sure the above error is due to some silly linking error, but I cannot track it down. Why do I think this? Well, here's a related problem that provides a hint.
$ rlwrap ocaml
# open Fapi ;;
# Fapi.ofun 33 ;;
Error: The external function `yofun' is not available
#
Well, that's odd. It clearly must have found fapi.cma, because that is the only way it can know about yofun. But somehow it doesn't know it needs to dig into dllostuff.so for that. Or possibly dllostuff.so is failing to correctly link/load libstuff.so? Or maybe libc.so, to get printf? I'm pretty sure it's one of these last few, but I just can't get it to work, and don't have the tools to debug it. (nm and ldd -r look healthy. Are there similar tools for the assorted .cma/.cmo/.cmi/.cmx files?)
Interfacing with C is much easier if you use dune. You don't need to know the low-level details; it is all handled for you.
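For instance, with dune the whole example below boils down to roughly one library stanza (a sketch, assuming dune 2.x with its foreign_stubs support, and the stub file ocmstuff.c from the question):

$ cat dune
(library
 (name fapi)
 (foreign_stubs (language c) (names ocmstuff))
 (c_library_flags (-L. -lstuff)))

dune then compiles the stubs and produces the bytecode and native archives, including the dll plumbing, by itself.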
Now, back to your example. This is definitely not how OCaml users are interfacing with C, but if you really want to learn about it here are a few notes.
The reason why you have the error is that:
you specified the modules in an incorrect order; it should be topological, not reverse-topological, i.e., a dependency comes before its dependents
you do not have the .ml file (the -intf option means something very different)
The reason the last snippet doesn't work is that you're not loading the library. The ocaml binary obviously doesn't have any Fapi units linked into it, so you have to load the library explicitly, either with the #load directive or by passing it on the command line.
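For example, with a correctly built fapi.cma (see below), a toplevel session would look roughly like this:

$ ocaml
# #load "fapi.cma";;
# Fapi.ofun 33 ;;
duuude whoa z=33
- : int = 42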
Also, the following line is not necessary:
ld -shared -o dllostuff.so ocmstuff.o -L . -lstuff -lc -rpath .
First of all, there is no need to link a stub file into a shared library. It is counterproductive and doesn't really bring you anything. Second, passing -rpath . will render the final executable unusable unless the shared objects happen to be in the directory it is run from. Just remove this.
Just to complete your exercise, here is how it could be built and run. First, let's fix the stub file. We need the ml file and we also need to remove an extra module definition,
$ cat fapi.{ml,mli}
external ofun : int -> int = "yofun" ;;
external ofun : int -> int = "yofun" ;;
Yes, they are the same. The mli file is not really needed here, but let's keep it for the sake of completeness.
The way you build the pure C part is fine; as long as you get a relocatable .so file, it works.
Now, to build ocmstuff.c (which we conventionally call the stubs), you just need to do:
ocamlc -c ocmstuff.c
Don't turn it into a shared library, don't do anything else with it. Now let's build the fapi library,
ocamlc -c fapi.mli
ocamlc -c fapi.ml
Now let's build the library that contains both OCaml and C code,
ocamlmklib -o fapi fapi.cmo ocmstuff.o -lstuff -L.
Now we can finally build the executable,
ocamlc -c myprog.ml
LD_LIBRARY_PATH=. ocamlc -o myprog fapi.cma myprog.cmo
and run it,
LD_LIBRARY_PATH=. ./myprog
duuude whoa z=33
Notice that we have to use the LD_LIBRARY_PATH to tell the system dynamic loader where to look for the external dependency libstuff.so. You can, of course, use rpath to specify its location (pass it to ocamlmklib via -ccopt) but in general it is assumed that the external library is installed at some location that the system loader knows.
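For example, something like this untested sketch (where /opt/stuff/lib stands for the real location of libstuff.so; with some ocamlmklib versions the flag that reaches the shared-library link is -ldopt rather than -ccopt):

ocamlmklib -o fapi fapi.cmo ocmstuff.o -lstuff -L. -ccopt -Wl,-rpath,/opt/stuff/lib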
Again, unless you're developing your own build system, please use dune or oasis for building OCaml programs. These systems will handle all low-level details in the best possible way.
P.S. It is also worth mentioning that you're not building a native binary, but a bytecode executable. For native binaries, you will have to use the ocamlopt compiler, and that is a completely different story. Again, dune is the solution.
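For completeness, the native-code route with these sources would look roughly like this (an untested sketch; ocamlmklib also emits fapi.cmxa and libfapi.a for this case):

ocamlopt -c fapi.mli fapi.ml
ocamlmklib -o fapi fapi.cmx ocmstuff.o -lstuff -L.
ocamlopt -c myprog.ml
ocamlopt -o myprog fapi.cmxa myprog.cmx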
There is a lot to take in here, but these lines are suspicious:
ocamlc -c myprog.ml
ocamlc -o myprog myprog.cmo fapi.cma
OCaml expects modules in topologically sorted order, with a module appearing on the command line before the modules that refer to it.
So it would seem the last line should be this:
ocamlc -o myprog fapi.cma myprog.cmo
I hope this helps, it's just a quick response.
The answer provided by ivg works. It also provides enough hints to retrofit the original question to get the correct behavior. The changes to the original recipe are:
Create fapi.mli and fapi.ml which both have the same content: external ofun : int -> int = "yofun" ;;
Compile both of the above with ocamlc -c. The .mli must be compiled first: it yields an interface file (.cmi) which is needed before the .ml file can be compiled into its object file (.cmo).
The name dllostuff.so was wrong: it must be dllfapi.so to maintain naming consistency.
Build the cma archive/library as ocamlc -a -o fapi.cma fapi.cmo -dllib -lfapi
That's it! Other than these, the original instructions work. The answer from ivg suggests using
ocamlmklib -o fapi fapi.cmo ocmstuff.o -L. -lstuff
instead of
ld -shared -o dllfapi.so ocmstuff.o -L. -lstuff
Either of these works. The primary difference is that ocamlmklib also creates a statically linked library, libfapi.a. Other than that, it creates dllfapi.so as before. (That version also contains a motley assortment of typical gcc symbols, for handling exceptions, library ctors, etc. It's not clear why these are needed here, since they'll show up sooner or later anyway.)
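Putting the pieces together, the full retrofitted recipe looks like this (ocmstuff.c as in the question; LD_LIBRARY_PATH=. stands in for a proper rpath or install location):

cc -fPIC -c stuff.c
ld -shared -o libstuff.so stuff.o
cc -fPIC -c ocmstuff.c
ld -shared -o dllfapi.so ocmstuff.o -L. -lstuff
ocamlc -c fapi.mli
ocamlc -c fapi.ml
ocamlc -a -o fapi.cma fapi.cmo -dllib -lfapi
ocamlc -c myprog.ml
ocamlc -o myprog fapi.cma myprog.cmo
LD_LIBRARY_PATH=. ./myprog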

How to dump path to source from object-file

Assume I have a C object file app.o compiled with gcc. How can I dump the file path to the original app.c from which app.o was compiled? My goal is to create a listing of all symbols plus their respective source file paths using the binutils and gcc tool suite.
By no means am I expecting an all-in-one solution. So I tried playing with multiple tools to gather the information I need.
Inspecting the object file with a text editor reveals that (apart from a lot of unreadable binary gibberish) the file does contain a reference to app.c as a string embedded in the object-file format. However, I did not find a way to extract that string using objdump or nm.
I was hoping objdump would have some flag that could extract this source file string, but after trying virtually all options documented in the man page I still couldn't find it.
With the path of the source file I was hoping I could run gcc -M <path-to-source>. This would allow me to look through all the headers included by app.c and find the in-source declarations.
Suppose a simple app.c like this:
void foo(void) {
}
Compile it via gcc -c app.c -o app.o.
Running objdump -t app.o dumps the symbol table, but does not refer anywhere to the original app.c.
Running cat app.o does show that the object-file contains the file path to app.c (relative to pwd at compile-time). But I wasn't exactly planning on writing my own object-file parser just to get to that string.
To answer my own question minutes after posting it (duh!):
readelf -s app.o prints a symbol table including the name of the source file (app.c). With that I am able to run gcc -M app.c and then parse through all header files to gather the symbol declarations.
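On the example above, the flow is (output parsing elided; the symbol table entry of type FILE carries the source path):

readelf -s app.o | grep FILE   # the FILE-type symbol names app.c
gcc -M app.c                   # prints app.c and every header it pulls in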

Difference using avr-gcc and avr-ld in ELF (Executable and Linkable Format) file generation

Sorry if this might be off-topic.
In the process of generating .hex (Intel HEX format) files using avr-gcc or avr-ld, the output (final result) is significantly different. As a minimal clarification, I am talking about the step of generating the ELF file just after generating the object files.
On my first attempt, I used avr-ld to generate my ELF file. The process works smoothly, but after generating the HEX file and uploading it to my board it did nothing (as if I had uploaded a blank HEX file).
On my second try, I followed the advice found here:
It is important to specify the MCU type when linking. The compiler uses the -mmcu option to choose start-up files and run-time libraries that get linked together. If this option isn't specified, the compiler defaults to the 8515 processor environment, which is most certainly what you didn't want.
It did as I expected. Uploaded the HEX file and my board updated accordingly.
So my questions are as follows:
Why did the linker (avr-ld) lose information about the micro-controller I am using? I thought that the MCU information is stored in the object files.
What is the logic behind this configuration? Is my way of thinking wrong (using avr-gcc for compiling/generating the .o files, avr-ld to link the .o files and generate the ELF file, and avr-objcopy to strip all but the useful information and change the file format, ELF -> HEX)?
Is there any way of achieving the same output using avr-ld as when using avr-gcc for generating my ELF file?
Why did the linker (avr-ld) lose information about the micro-controller I am using? I thought that the MCU information is stored in the object files.
The linker doesn't lose that information; it was never supplied in the first place. Object files, or rather their ELF headers, only work at the level of an "emulation", i.e. a granularity like -mmcu=arch, where arch is one of avr2, avr25, avrxmega2, avrtiny, etc.
using avr-gcc for compiling/generating the .o files, avr-ld to link the .o files and generate the ELF file, and avr-objcopy to strip all but the useful information and change the file format, ELF → HEX?
avr-gcc is not a compiler; it's just a driver program that calls other programs, like the compiler proper (cc1 or cc1plus), the assembler (as) and the linker (ld), depending on the file type and the options provided. The driver adds options to these programs, which greatly simplifies their usage; many of these additions are described in the device's spec file, e.g. specs-attiny25 (since v5 onwards).
As an example, take a simple main.c with a main function returning 0, and process it with
avr-gcc main.c -o main.elf -mmcu=attiny25 -save-temps -v -Wl,-v
The -v makes the driver show the commands it is issuing, and -save-temps stores intermediate files like assembly. For avr-gcc v8.5, the link process starts with a call of collect2:
.../bin/../libexec/gcc/avr/8.5.0/collect2 -plugin .../bin/../libexec/gcc/avr/8.5.0/liblto_plugin.so -plugin-opt=.../bin/../libexec/gcc/avr/8.5.0/lto-wrapper -plugin-opt=-fresolution=main.res -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lm -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lattiny25 -mavr25 -o main.elf .../bin/../lib/gcc/avr/8.5.0/../../../../avr/lib/avr25/tiny-stack/crtattiny25.o -L.../bin/../lib/gcc/avr/8.5.0/avr25/tiny-stack -L.../bin/../lib/gcc/avr/8.5.0/../../../../avr/lib/avr25/tiny-stack -L.../bin/../lib/gcc/avr/8.5.0 -L.../bin/../lib/gcc -L.../bin/../lib/gcc/avr/8.5.0/../../../../avr/lib main.o -v --start-group -lgcc -lm -lc -lattiny25 --end-group
where ... stands for the absolute path where the tools are installed. As you can see, the driver adds some salt: for example, it links against the startup code crtattiny25.o and standard libs like libc.a, libm.a and libgcc.a. collect2 gathers some extra information needed to build the startup code, and then calls back the compiler [1] and finally ld.
The options provided to ld look very much like the ones provided to collect2. The only device-specific parts are the startup code crtattiny25.o and the device lib libattiny25.a. Much other device-specific stuff has already been compiled into the code, like SFR addresses, #ifdef __AVR_ATtiny25__, etc.
Is there any way of achieving the same output using avr-ld as when using avr-gcc for generating my ELF file?
You could provide all those options by hand.
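Concretely, a hand-rolled link mirroring the collect2 command above might look like this untested sketch ($PREFIX stands for the toolchain install prefix; the LTO plugin options are dropped):

avr-ld -mavr25 -o main.elf \
  $PREFIX/avr/lib/avr25/tiny-stack/crtattiny25.o \
  -L$PREFIX/lib/gcc/avr/8.5.0/avr25/tiny-stack \
  -L$PREFIX/avr/lib/avr25/tiny-stack \
  -L$PREFIX/lib/gcc/avr/8.5.0 \
  main.o --start-group -lgcc -lm -lc -lattiny25 --end-group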
[1] Calling back the compiler is needed for LTO (link-time optimization), as of -flto. The linker calls a plugin which calls the compiler with LTO byte-code, compiles it to assembly with the LTO compiler lto1, then runs as, then ld. Newer versions of the tools use the linker plugin even when compiling without LTO; one can pass -fno-use-linker-plugin, which makes the call chain and options somewhat simpler.

Easiest way to merge 2 or more ELF files

I'm working on some embedded code for a class project that currently (per requirements) creates a number of srec files and merges them. I'd like to be able to load this code into QEMU, but it is generally only happy with ELF files. What is the easiest way to merge the original ELF files instead of the srecs?
Also acceptable: a method to convert the srec back into an ELF and have the resulting file be loadable (objcopy seems to produce fairly broken files doing this, with no architecture set, among other problems).
The tools must be capable of working with m68k binaries, but the host system is plain x86.
Please take a look at the ELFIO library. It contains WriteObj and Writer examples. Using the library, you will be able to create ELF binary files on different host platforms.
Easy...
Let's assume two files: a.c and b.c.
gcc a.c -c -o a.o
gcc b.c -c -o b.o
ld -r a.o b.o -o c.o
c.o now contains both a.o and b.o, as a single relocatable ELF file.
--Ivan
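A quick sanity check that the merge really produced a relocatable file:

readelf -h c.o | grep Type   # expect: Type: REL (Relocatable file)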
Use your linker perhaps?
An srec file contains only the linked/located loadable binary; an ELF file contains additional metadata that is lost in the generation of the binary, so converting back to ELF may not be possible, especially if the ELF needs to be relocatable.
I found the easiest solution to my original problem was actually to add SREC loading to QEMU. I am already modifying the source to add board support, so SREC support isn't much additional work. I found some code on GitHub from someone who had already done so and used it as a basis for my work.
https://github.com/MegabytePhreak/qemu-mcf5307/commit/d3bceb911893b37b2524d6e804bac96691d4d33c

How does objdump manage to display source code with the -S option?

Is there a reference to the source file in the binary? I tried running strings on the binary and couldn't find any reference to the source file listed...
objdump uses the DWARF debugging information compiled into the binary, which references the source file name. objdump tries to open the named source file to load the source and display it in the output. If the binary isn't compiled with debugging information, or objdump can't find the source file, then you don't get source code in your output - only assembly.
You don't necessarily see the source file name when you use strings on the binary, because the DWARF data may be stored in compressed form.
The DWARF information in a binary stores the mapping between instructions (the instruction pointer, or IP, actually) and the source file and line number. The source file is specified using its complete path, so it can be found even if the binary is moved around. To see this information you can use objdump --dwarf=decodedline <binary> (the binary, of course, has to be compiled with -g).
Once you say objdump -S <binary>, it uses this DWARF info to give you the source code along with the disassembly.
My understanding is that for objdump to display source code alongside the binary code, there is a precondition: the DWARF debugging information must be compiled into the binary (by gcc -g sourcefile or gcc -gdwarf-2 sourcefile).
By processing this DWARF information, objdump is able to get at the source code, as @vlcekmi3 and @vkrnt answered.
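A minimal demonstration, assuming a throwaway example.c:

gcc -g -c example.c -o example.o        # -g embeds the DWARF line info
objdump -S example.o                    # source interleaved with disassembly
objdump --dwarf=decodedline example.o   # address -> file:line table
# Without -g (gcc -c example.c), objdump -S falls back to plain assembly.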
