How to programmatically look up a symbol in a running application - c

I have an application which I'm trying to debug. However, running it under gdb produces different results, so it would be nice to have the application itself output true symbol information when confronted with an address. For instance:
I have a method which is called periodically and I can determine the address of the call site. However, I'd like to print out the symbol information at run time for this address. I know I can run "nm" on the executable but that is outside of the application. I want to be able to do it from within the application itself.
I'm using GCC 4.7.2 on a Linux platform.
(Edited to explain why I can't use gdb.)

Dynamic symbol information can be accessed via the PT_DYNAMIC segment, which is loaded into memory and can be accessed by asking dlopen(3) for a handle to the main executable.
Static symbol information can be read only from the actual executable file, or an external file, as it is not listed in the loadable segments.
With just dynamic information, you will not be able to resolve anything that is not exported, which means you will most likely only see library calls unless your executable has its symbol table exported, so static information is probably the way to go.
This involves either lots of parsing, or using the bfd library built from binutils.
I'd seriously wonder if that was really worth the effort, though. You might get the same information from using the profiling support in gcc.
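As a rough illustration of the dynamic-information route, the sketch below uses dladdr(3), a GNU extension in glibc, to map an address back to the nearest exported symbol. The helper name describe_address is hypothetical. It assumes a dynamically linked program (with -ldl on older glibc versions), and, as noted above, the executable's own functions only resolve by name if its symbols are exported, for example by linking with -rdynamic.

```c
#define _GNU_SOURCE   /* dladdr() and Dl_info are GNU extensions */
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: writes "object: symbol+0xoffset" for addr into buf.
 * Returns 1 if addr lies inside a loaded object, 0 otherwise. */
int describe_address(const void *addr, char *buf, size_t buflen)
{
    Dl_info info;

    if (dladdr(addr, &info) == 0 || info.dli_fname == NULL)
        return 0;

    if (info.dli_sname != NULL) {
        /* An exported symbol at or below addr was found. */
        unsigned long off = (unsigned long)((const char *)addr -
                                            (const char *)info.dli_saddr);
        snprintf(buf, buflen, "%s: %s+0x%lx",
                 info.dli_fname, info.dli_sname, off);
    } else {
        /* Only the containing object is known (symbol not exported):
         * report the offset from the object's load address instead. */
        unsigned long off = (unsigned long)((const char *)addr -
                                            (const char *)info.dli_fbase);
        snprintf(buf, buflen, "%s: +0x%lx", info.dli_fname, off);
    }
    return 1;
}
```

With GCC, passing __builtin_return_address(0) from inside the periodically called function should give an address at the call site, which can then be handed to this helper and printed.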

Related

Getting known library paths from ldconfig for use with dlopen

I have a program written in C that uses dlopen for loading plug-in modules. When a library is dynamically loaded, it runs constructor code which registers a pointer to a structure of function implementations with the main application through an exported function. I want to use an absolute path for specifying the file to dlopen.
Another part of the program takes a file, determines whether it is ELF, then looks in the ELF header for a specific ELF section, reads this section, and extracts the pertinent information from it. This way it selects only the shared libraries which I have previously tagged as plug-in modules.
However, I am trying to solve the problem of how to discover them on the fly from the main program, in a portable Linux way (i.e. it will run on Debian, on Fedora, and so on). I have been thinking about using ldconfig for this, as the modules will be installed through the distro packaging system (APT, for example). Is there any way to programmatically get the list of known libraries as strings from a C program, other than directly reading the /etc/ld.so.cache file? I was thinking that maybe there is some library header which will give me a char** when I ask.
Or is there a better solution to my problem?
(I am a proponent of using standard system components rather than programming one-off solutions which will need support in the future.)

At dynamic linking, does the dynamic loader look at all object files for definitions, or only at those specified by the executable?

So I'm trying to wrap my head around static and dynamic linking. There are many resources on SO and on the web. I think I pretty much get it, but there's still one thing that seems to bother me. Also, please correct me if my overall understanding is wrong.
I think I understand static linking:
The linker unpacks the linked libraries, and actually includes the libraries' object files inside the produced executable. The unresolved-stubs in the application object files are then replaced by actual function-calling code, which calls functions in addresses known at build time.
Dynamic linking on the other hand is what puzzles me more: I understand that in dynamic linking, the stubs in the object-code which reference yet-unresolved names, are going to stay as stubs until runtime.
Then at runtime, the dynamic loader of the OS would look through precompiled libraries stored at standard filesystem locations. It would look in the object-files of the libraries, inside their symbol tables (?) and try to find a matching function definition for each unresolved-stub. It would then load the matching object-files into memory, and replace the stubs to point to the function definitions.
So the part I'm missing is this: where does the OS dynamic loader look - does it look in the symbol tables for all object-files in the system-libraries directory? Or does it only look in object-files specified somewhere in the application-executable file? Is this the reason why at compile time we must specify all dynamic dependencies of our program? Also, is it true dynamic libraries expose a symbol-table too?
So the part I'm missing is this: where does the OS dynamic loader look - does it look in the symbol tables for all object-files in the system-libraries directory?
No dynamic linker I'm aware of does this.
Or does it only look in object-files specified somewhere in the application-executable file?
Nor exactly this, either.
Details vary, but generally, a dynamic linker looks for specific shared libraries by name in various directories. The directories searched may be built into the linker, specified by the operating system, specified in the object being linked, or a combination. The linker does not (generally) examine libraries' symbol tables until after it locates them by name and selects them for linking.
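To see which objects the dynamic linker actually selected for a given process, glibc offers dl_iterate_phdr(3), a GNU extension. The following is only a sketch for Linux with glibc; the helper names print_object and list_loaded_objects are made up for illustration. It should list just the main program, its named dependencies, and anything loaded later on demand, not every library on the system.

```c
#define _GNU_SOURCE   /* dl_iterate_phdr() is a GNU extension */
#include <link.h>
#include <stdio.h>

/* Callback invoked once per loaded object: prints it and counts it. */
static int print_object(struct dl_phdr_info *info, size_t size, void *data)
{
    int *count = data;
    (void)size;

    /* An empty name usually denotes the main program itself. */
    const char *name = (info->dlpi_name != NULL && info->dlpi_name[0] != '\0')
                           ? info->dlpi_name
                           : "(unnamed, usually the main program)";

    printf("%s, base address 0x%lx\n", name, (unsigned long)info->dlpi_addr);
    ++*count;
    return 0;   /* returning zero continues the iteration */
}

/* Hypothetical helper: prints every loaded object and returns the count. */
int list_loaded_objects(void)
{
    int count = 0;
    dl_iterate_phdr(print_object, &count);
    return count;
}
```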
Is this the reason why at compile time we must specify all dynamic dependencies of our program?
Yes, though under some circumstances we do not need to specify all dynamic dependencies at compile time. Some dynamic linkers support on-demand dynamic loading as directed by the program itself. This can be used to implement plugin systems, among other purposes.
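On POSIX systems, on-demand loading of this kind is exposed through dlopen(3) and dlsym(3). Below is a minimal sketch; the helper name call_from_library is hypothetical, the library name used in the usage note assumes a glibc-based Linux system, and older glibc versions need -ldl at link time.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: loads a shared library by name at run time, looks up
 * a function of type double(double) in it, and calls it.
 * Returns 0 on success and stores the result, or -1 on failure. */
int call_from_library(const char *lib, const char *symbol,
                      double arg, double *result)
{
    void *handle = dlopen(lib, RTLD_NOW);
    if (handle == NULL)
        return -1;              /* dlerror() would describe the failure */

    /* Assign through a void ** so that the void * returned by dlsym() can
     * be stored in a function pointer without a direct cast. */
    double (*fn)(double) = NULL;
    *(void **)(&fn) = dlsym(handle, symbol);
    if (fn == NULL) {
        dlclose(handle);
        return -1;
    }

    *result = fn(arg);
    dlclose(handle);
    return 0;
}
```

For example, call_from_library("libm.so.6", "cos", 0.0, &r) asks the dynamic linker to find the math library by name in its usual search directories, even though the program was never linked against it at build time.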
Also, is it true dynamic libraries expose a symbol-table too?
Yes. Dynamic libraries have their own symbol tables because (1) the dynamic linker uses them to do its work, and (2) dynamic libraries can have their own dynamic linking requirements, which are not necessarily reflected in the main program's.
In the normal usage, "dynamic linking" is performed by the loader. "Static linking" is performed by the linker.
Generally, linkers can create either executable files or shared libraries. The linker output for both is an instruction stream that tells the loaders how to place the executable or library in memory.
Dynamic linking on the other hand is what puzzles me more: I understand that in dynamic linking, the stubs in the object-code which reference yet-unresolved names, are going to stay as stubs until runtime
That is not [usually] correct. The linker will locate the shared library in which the symbol exists. The executable will have an instruction to find the symbol in that shared library. Linkers generally puke if they cannot find all the symbols that need to be resolved.
So the part I'm missing is this: where does the OS dynamic loader look - does it look in the symbol tables for all object-files in the system-libraries directory?
This is a system-specific question. In well-designed operating systems, the shared libraries are designated by the system manager. The loader uses the library specified by the system. Poorly designed systems frequently use some kind of search path to find the shared libraries (which creates a massive security hole).

Building firmware Patch for embedded applications

I have a library stack that is not going to change, and a firmware that is going to use only this stack. The firmware will change a lot along the way. I don't want to release the whole image (including the library stack) every time, because of limited memory and resources (this is an embedded application, not a desktop or server).
I just want to release the application image and have it automatically be able to use the library image. I am not sure how to do it. I know that in Windows, for example, this is handled by DLLs. But this is an embedded application with no OS. The binary image is loaded into memory and the processor executes it.
Any experience/suggestions?
Toolchain: IAR 8051
This depends quite a bit on your tool-chain. Here's a possible high-level approach.
Compile your library into an executable image, setting your linker to use a particular portion of your flash memory space. You'll probably need a fake/stub entry function for the linker to be happy.
Once that is done, find all of the addresses of the symbols used by the library and instruct your linker as to those symbol locations when building your normal program, and do not instruct the link process to use the intermediary library objects when linking. Also instruct the linker to place the code into the section of flash that is update-able.
What you will then have is an image for the library, and the ability to build new versions of the main program image using that library.
This could probably be scripted if your linker output format is an unstripped elf (prior to converting to a binary for burning on the flash), and if your linker can accept a plain text file for instructions (both are true if you are using the gnu toolchains). I'd recommend scripting it for your sanity unless the library has very few externally visible functions and variables in it.
I do have to agree with some of the commenters; unless transferring the library is very hard, you should just build a single simple image that includes the library and push the whole thing. You might say the library will never change now, but inevitably something will come up that requires a change to the library code, and if you change the library and cannot keep the symbols in exactly the same spot, none of your application images will be able to work with the new library. This is a recipe for a nightmare when dealing with compatible software (firmware) updates.
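One way to soften the "symbols must stay in exactly the same spot" problem is a variation on the approach above, not the method described in the answer itself: the library image exports a single table of function pointers, and only that table needs a fixed address. The sketch below simulates both images in one host program; the struct lib_api layout, the function names, and the version check are all hypothetical, and on a real target the table would be pinned to a fixed flash address with a toolchain-specific linker directive.

```c
#include <stdint.h>

/* Hypothetical interface, shared by both images through a common header. */
struct lib_api {
    uint16_t version;                       /* bumped on incompatible changes */
    int16_t (*add)(int16_t a, int16_t b);   /* example library entry points */
    int16_t (*scale)(int16_t x);
};

/* ---- library image side ---- */
static int16_t lib_add(int16_t a, int16_t b) { return (int16_t)(a + b); }
static int16_t lib_scale(int16_t x) { return (int16_t)(x * 3); }

/* On a target this table would be placed at a fixed flash address; here it
 * is an ordinary object so that the sketch can run on a host. */
static const struct lib_api lib_table = { 1, lib_add, lib_scale };

/* ---- application image side ---- */
/* On a target this pointer would be a constant built from the agreed fixed
 * address of the table rather than the address of a local object. */
static const struct lib_api *const LIB_API = &lib_table;

int16_t app_compute(int16_t a, int16_t b)
{
    if (LIB_API->version != 1)
        return -1;   /* refuse to run against an incompatible library image */
    return LIB_API->scale(LIB_API->add(a, b));
}
```

With this layout, functions inside the library image can move between library builds as long as the table stays at its agreed address and the version field is updated when the interface changes. Indirect calls do cost extra code and cycles, which may matter on a small part such as an 8051.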

Creating ELF binaries without using libelf or other libraries

Recently I tried to write a simple compiler on the linux platform by myself.
When it comes to the backend of the compiler, I decided to generate ELF-formatted binaries without using a third-party library, such as libelf.
Instead I want to try to write machine code directly into the file, conforming to the ELF ABI, just by using the write() function and controlling all the details of the ELF file.
The advantage of this approach is that I can control everything for my compiler.
But I am hesitating. Is that way feasible, considering how detailed the ELF ABI is?
I would appreciate any suggestions and pointers to good available resources.
How easy/feasible this is depends on what features you want to support. If you want to use dynamic linking, you have to deal with the symbol table, relocations, etc. And of course if you want to be able to link with existing libraries, even static ones, you'll have to support whatever they need. But if your goal is just to make standalone static ELF binaries, it's really very easy. All you need is a main ELF header (100% boilerplate) and 2 PT_LOAD program headers: one to load your program's code segment, the other to load its data segment. In theory they could be combined, but security-hardened kernels do not allow a given page to be both writable and executable, so it would be smart to separate them.
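To give a feel for how little is needed, here is a minimal sketch that writes a standalone static ELF executable using only the structures from <elf.h>. It is deliberately simpler than the layout described above: the program has no data, so it emits a single read-and-execute PT_LOAD segment, and a data segment would add a second PT_LOAD with PF_R | PF_W. The function name write_minimal_elf, the load address 0x400000, and the x86-64 machine code are assumptions chosen for illustration. Writing the structs directly also assumes a little-endian host.

```c
#include <elf.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* x86-64 Linux machine code for exit(42): system call number 60. */
static const unsigned char code[] = {
    0xbf, 0x2a, 0x00, 0x00, 0x00,   /* mov edi, 42 */
    0xb8, 0x3c, 0x00, 0x00, 0x00,   /* mov eax, 60 */
    0x0f, 0x05                      /* syscall     */
};

/* Hypothetical helper: writes the executable to path; 0 on success. */
int write_minimal_elf(const char *path)
{
    const uint64_t base = 0x400000;   /* assumed load address */
    const uint64_t code_off = sizeof(Elf64_Ehdr) + sizeof(Elf64_Phdr);

    Elf64_Ehdr eh;
    memset(&eh, 0, sizeof eh);
    memcpy(eh.e_ident, ELFMAG, SELFMAG);
    eh.e_ident[EI_CLASS] = ELFCLASS64;
    eh.e_ident[EI_DATA] = ELFDATA2LSB;
    eh.e_ident[EI_VERSION] = EV_CURRENT;
    eh.e_ident[EI_OSABI] = ELFOSABI_SYSV;
    eh.e_type = ET_EXEC;
    eh.e_machine = EM_X86_64;
    eh.e_version = EV_CURRENT;
    eh.e_entry = base + code_off;     /* code follows the two headers */
    eh.e_phoff = sizeof(Elf64_Ehdr);  /* program header follows ELF header */
    eh.e_ehsize = sizeof(Elf64_Ehdr);
    eh.e_phentsize = sizeof(Elf64_Phdr);
    eh.e_phnum = 1;

    /* One segment mapping the whole file (headers plus code) at base. */
    Elf64_Phdr ph;
    memset(&ph, 0, sizeof ph);
    ph.p_type = PT_LOAD;
    ph.p_flags = PF_R | PF_X;
    ph.p_offset = 0;
    ph.p_vaddr = base;
    ph.p_paddr = base;
    ph.p_filesz = code_off + sizeof code;
    ph.p_memsz = ph.p_filesz;
    ph.p_align = 0x1000;

    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    int ok = fwrite(&eh, sizeof eh, 1, f) == 1 &&
             fwrite(&ph, sizeof ph, 1, f) == 1 &&
             fwrite(code, sizeof code, 1, f) == 1;
    if (fclose(f) != 0)
        ok = 0;
    if (ok && chmod(path, 0755) != 0)  /* mark the result executable */
        ok = 0;
    return ok ? 0 : -1;
}
```

On an x86-64 Linux machine, running the generated file should terminate immediately with exit status 42. A compiler backend would replace the fixed byte array with its own generated code.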
Some suggested reading:
http://www.linuxjournal.com/article/1059

Trace all calls from a program?

A program is installed on 2 computers. A certain library is working correctly in one computer but not working at all in the other. I wonder if a library is missing.
I'm using strace so I can see which libraries are being called by the program at runtime. All libraries mentioned by strace are correct, but does strace also detect if one library calls another library or file? Is there any way to detect this scenario?
Yes, strace will detect system calls made from loaded libraries.
If you want to trace library calls (not system calls), use ltrace.
It sounds like you want to view your app's function call graph, i.e. which functions call one another, which library they live in, etc. If so, you may want to check out the callgrind tool (which is part of valgrind).
Here is an example that uses callgrind to profile some code.
Once you've used callgrind to generate profile data for your app, load it into Kcachegrind to visualize it. It's simple point-and-click: highlight function, see callers/callees, view the call graph, and so on. I've found it quite useful in similar circumstances.
To check for missing libraries, run ldd /full/path/to/program.
For programs compiled with Clang, try getting a call trace with XRay. It heuristically instruments part of a program and has very low runtime overhead.

Resources