I'm trying to locate the .c files that are related to the #include header files in avr.
I want to have a look at some of the standard libraries that are defined in the avr-gcc library, particularly the PORT definitions contained in <avr/io.h>. I searched through the library in /usr/lib/avr/include/avr and found the header file; however, what I am looking for is the .c file. Does this file exist? If so, where can I find it? If not, what is the header file referencing?
The compiler-provided libraries are precompiled object code stored in static libraries. In gcc, libraries conventionally have the extension .a (for "archive", largely for historic reasons) and the prefix "lib".
At build time, the linker searches the library archives to find the object-code modules necessary to resolve references to library symbols. It extracts the required modules and links them into the binary image being built.
In gcc, a library libXXX.a is typically linked using the command-line switch -lXXX, so the libXXX.a naming convention is important in that case. For example, the standard C library libc.a is linked by the switch -lc.
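The -lXXX to libXXX.a naming rule described above can be sketched in a few lines of C. This is only an illustration of the convention, not how the linker is implemented; the function name archive_name_for_switch is made up for this example, and on hosted systems the linker may also look for a shared library before the static archive.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: map a linker switch such as "-lc" to the static
 * archive name it conventionally refers to ("libc.a").
 * Returns 0 on success, -1 if the argument is not of the form -lNAME
 * or the output buffer is too small. */
int archive_name_for_switch(const char *sw, char *out, size_t out_size)
{
    assert(sw != NULL && out != NULL);

    /* Must start with "-l" and have at least one character after it. */
    if (strncmp(sw, "-l", 2) != 0 || sw[2] == '\0')
        return -1;

    int n = snprintf(out, out_size, "lib%s.a", sw + 2);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Passing "-lc" fills the buffer with libc.a, and "-lm" gives libm.a; anything that does not start with -l is rejected.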
So to answer your question, there are normally no .c files for the compiler-provided libraries shipped with the toolchain. The library need not even have been written in C.
That said, being open source, the source files (.c or otherwise) will be available from the repositories of the various libraries. For example, for the standard C library: https://www.nongnu.org/avr-libc/.
For other AVR architecture and I/O support libraries, you might inspect the associated header files or documentation. The header files will typically have a boiler-plate comment with a project URL for example.
PORTB and other special function registers are usually defined as macros in headers provided by avr-libc. Find your include/avr directory (the one that contains io.h). In that directory, there should be many other header files. As an example, iom328p.h contains the following line that defines PORTB on the ATmega328P:
#define PORTB _SFR_IO8(0x05)
If you are also looking for the libraries that are distributed as .a files, you should run avr-gcc -print-search-dirs.
There are several ways to find out where the system headers are located and which are included:
avr-gcc -v -mmcu=atmega8 foo.c ...
With option -v, GCC will print (amongst other stuff) which include paths it is using. Check the output on a shell / console, where GCC will print the search paths:
#include "..." search starts here:
#include <...> search starts here:
/usr/lib/gcc/avr/5.4.0/include
/usr/lib/gcc/avr/5.4.0/include-fixed
/usr/lib/gcc/avr/5.4.0/../../../avr/include
The last location is for AVR-LibC, which provides avr/io.h. Resolving the ..s, that path is just /usr/lib/avr/include. These paths depend on how avr-gcc was configured and installed, hence you have to run that command with your installation of avr-gcc.
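To make "resolving the ..s" concrete, here is a minimal sketch of a purely textual path normalizer in C. The function name normalize_abs_path is an assumption for this example. It does not touch the file system, so it will not follow symlinks the way realpath() would.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: lexically collapse "." and ".." components of an
 * absolute path. Returns 0 on success, -1 on a relative path or if the
 * output buffer is too small. */
int normalize_abs_path(const char *path, char *out, size_t out_size)
{
    assert(path != NULL && out != NULL);
    if (path[0] != '/' || out_size < 2)
        return -1;

    size_t len = 0;                 /* length of the result built so far */
    const char *p = path;
    while (*p != '\0') {
        while (*p == '/')           /* skip separators */
            p++;
        const char *start = p;
        while (*p != '\0' && *p != '/')
            p++;
        size_t n = (size_t)(p - start);

        if (n == 0 || (n == 1 && start[0] == '.'))
            continue;               /* end of string or "." component */
        if (n == 2 && start[0] == '.' && start[1] == '.') {
            /* drop the last component and the '/' in front of it */
            while (len > 0 && out[len - 1] != '/')
                len--;
            if (len > 0)
                len--;
            continue;
        }
        if (len + 1 + n + 1 > out_size)   /* '/' + name + terminator */
            return -1;
        out[len++] = '/';
        memcpy(out + len, start, n);
        len += n;
    }
    if (len == 0)
        out[len++] = '/';           /* everything collapsed to the root */
    out[len] = '\0';
    return 0;
}
```

Feeding it /usr/lib/gcc/avr/5.4.0/../../../avr/include yields /usr/lib/avr/include, matching the path given above.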
avr-gcc -H -mmcu=atmega8 foo.c ...
Suppose the C-file foo.c reads:
#include <avr/io.h>
int main (void)
{
PORTD = 0;
}
for an easy example. With -H, GCC will print out which files it is actually including:
. /usr/lib/avr/include/avr/io.h
.. /usr/lib/avr/include/avr/sfr_defs.h
... /usr/lib/avr/include/inttypes.h
.... /usr/lib/gcc/avr/5.4.0/include/stdint.h
..... /usr/lib/avr/include/stdint.h
.. /usr/lib/avr/include/avr/iom8.h
.. /usr/lib/avr/include/avr/portpins.h
.. /usr/lib/avr/include/avr/common.h
.. /usr/lib/avr/include/avr/version.h
.. /usr/lib/avr/include/avr/fuse.h
.. /usr/lib/avr/include/avr/lock.h
avr-gcc -save-temps -g3 -mmcu=atmega8 foo.c ...
With -g3 (debug level 3, not to be confused with DWARF version 3), the macro definitions will be recorded in the debug info and are also visible in the pre-processed file kept by -save-temps (*.i for C code, *.ii for C++, *.s for pre-processed assembly). Hence, in foo.i we can find the definition of PORTD as
#define PORTD _SFR_IO8(0x12)
Starting from the line which contains that definition, scroll up until you find the annotation that tells in which file the macro definition happened. For example
# 45 "/usr/lib/avr/include/avr/iom8.h" 3
in the case of my toolchain installation. This means that the lines following that annotation follow line 45 of /usr/lib/avr/include/avr/iom8.h.
If you want to see the resolution of PORTD, scroll down to the end of foo.i which contains the pre-processed source:
# 3 "foo.c"
int main (void)
{
(*(volatile uint8_t *)((0x12) + 0x20)) = 0;
}
0x12 is the I/O address of PORTD, and 0x20 is the offset between I/O addresses and RAM addresses for ATmega8. This means the compiler may implement PORTD = 0 by means of out 0x12, __zero_reg__.
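The macro chain behind that expansion can be imitated on a PC to see the address arithmetic. The sketch below is an assumption-laden stand-in: the real AVR-LibC macros are called _SFR_IO8, _MMIO_BYTE and __SFR_OFFSET and dereference an actual address, whereas here the names drop the leading underscores and the "memory" is an ordinary array (fake_data_space) so the code can run on a host.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side stand-in for the AVR data address space (hypothetical). */
static volatile uint8_t fake_data_space[0x100];

/* Simplified imitations of the AVR-LibC macros. */
#define SFR_OFFSET 0x20                                /* I/O address -> RAM address */
#define MMIO_BYTE(mem_addr) (fake_data_space[(mem_addr)])
#define SFR_IO8(io_addr)    MMIO_BYTE((io_addr) + SFR_OFFSET)

/* Same shape as the definition found in iom8.h. */
#define PORTD SFR_IO8(0x12)

/* PORTD behaves like an lvalue, exactly as in the AVR source. */
void write_portd(uint8_t value) { PORTD = value; }
void clear_portd(void)          { PORTD = 0; }

/* Inspect the stand-in memory at a given RAM address. */
uint8_t read_data_space(uint16_t addr)
{
    assert(addr < sizeof fake_data_space);
    return fake_data_space[addr];
}
```

After write_portd(0xA5), the byte sits at index 0x32 of the stand-in array, i.e. 0x12 + 0x20, which is the RAM address the real macro would dereference.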
avr-gcc -print-file-name=libc.a -mmcu=...
Finally, this command will print the location (absolute path) of libraries like libc.a, libm.a, libgcc.a or lib<mcu>.a. The location of the library depends on how the compiler was configured and installed, but also on command-line options like -mmcu=.
avr-gcc -Wl,-Map,foo.map -mmcu=atmega8 foo.c -o foo.elf
This directs the linker to dump a "map" file foo.map where it reports which symbol will drag which module from which library. This is a text file that contains lines like:
LOAD /usr/lib/gcc/avr/5.4.0/../../../avr/lib/avr4/crtatmega8.o
...
LOAD /usr/lib/gcc/avr/5.4.0/avr4/libgcc.a
LOAD /usr/lib/gcc/avr/5.4.0/../../../avr/lib/avr4/libm.a
LOAD /usr/lib/gcc/avr/5.4.0/../../../avr/lib/avr4/libc.a
LOAD /usr/lib/gcc/avr/5.4.0/../../../avr/lib/avr4/libatmega8.a
libgcc.a is from the compiler's C runtime, and all the others are provided by AVR-LibC. Resolving the ..s, the AVR-LibC files for ATmega8 are located in /usr/lib/avr/lib/avr4/.
I have a header foo.h with functions bar(), baz(), qux(). Where would I copy it/what would I have to do it so that I can include it in C programs like other systemwide headers, like stdio.h, unistd.h etc?
From the GCC documentation (I am assuming you are using GCC since you included the Linux tag):
2.3 Search Path
GCC looks in several different places for headers. On a normal Unix system, if you do not instruct it otherwise, it will look for headers requested with #include in:
/usr/local/include
libdir/gcc/target/version/include
/usr/target/include
/usr/include
[...] In the above, target is the canonical name of the system GCC was configured to compile code for; often but not always the same as the canonical name of the system it runs on. version is the version of GCC in use.
So that mostly answers your question. But really, you probably shouldn't be putting non-system headers in places like /usr/include. Most of the time, it's best to keep the headers for your program in the include sub-directory for the project. Then tell GCC how to find those files like this:
You can add to this list with the -Idir command line option. All the directories named by -I are searched, in left-to-right order, before the default directories. The only exception is when dir is already searched by default. In this case, the option is ignored and the search order for system directories remains unchanged.
[...]
Also keep in mind the differences between #include "file.h" and #include <file.h>:
GCC looks for headers requested with #include "file" first in the directory containing the current file, then in the directories as specified by -iquote options, then in the same places it would have looked for a header requested with angle brackets. For example, if /usr/include/sys/stat.h contains #include "types.h", GCC looks for types.h first in /usr/include/sys, then in its usual search path.
[...]
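The search order quoted above can be modelled as a small C function, which may help when reasoning about which copy of a header wins. This is a simplified sketch, not GCC's implementation: the struct and function names are invented for the example, and the rule that -I is ignored for directories already searched by default is not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of the directories given to the compiler. */
struct search_config {
    const char *const *iquote_dirs; size_t n_iquote;  /* -iquote dir        */
    const char *const *user_dirs;   size_t n_user;    /* -I dir             */
    const char *const *system_dirs; size_t n_system;  /* built-in defaults  */
};

/* Append up to `count` directories to `out`, never exceeding `cap`. */
static size_t append_dirs(const char **out, size_t cap, size_t n,
                          const char *const *dirs, size_t count)
{
    for (size_t i = 0; i < count && n < cap; i++)
        out[n++] = dirs[i];
    return n;
}

/* Fill `out` with directories in the order they would be searched.
 * `quoted` is nonzero for #include "file" and zero for #include <file>.
 * Returns the number of entries written. */
size_t include_search_order(int quoted, const char *current_dir,
                            const struct search_config *cfg,
                            const char **out, size_t cap)
{
    assert(cfg != NULL && out != NULL);
    size_t n = 0;
    if (quoted) {
        if (n < cap)
            out[n++] = current_dir;   /* directory of the including file */
        n = append_dirs(out, cap, n, cfg->iquote_dirs, cfg->n_iquote);
    }
    n = append_dirs(out, cap, n, cfg->user_dirs, cfg->n_user);
    n = append_dirs(out, cap, n, cfg->system_dirs, cfg->n_system);
    return n;
}
```

For the quoted form the including file's own directory comes first; for the angle-bracket form the list starts directly with the -I directories.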
I used gcc -E, and I can see that the contents of the #include files are pasted in after preprocessing. But when I use gcc -S, I can't find any information about my header files in the generated assembly file (.s). (More specifically, whether or not I comment out the #include directive, I get the same .s file.)
Next step I can use gcc -o *.s to assemble and link my .s file. But where did GCC get the header file information?
The #include-related lines in the preprocessed output are there to point back to the header file, in case the compiler finds an error and wants to notify the user about its specific location ("included in xxx.h").
But all code/declarations contained in the #include (provided they match the proper #ifdef/#if conditions) are expanded in the preprocessed output. Only this code/declaration stuff is used by the compiler to produce the assembly / binary object file, no more need for the headers at that point.
So your assembly code has already integrated the information of the header files (structure offsets, constants, type sizes...) and it's no longer C anymore, it's assembly.
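A small illustration of what "integrated" means: structure offsets and sizes from a header are plain compile-time constants by the time assembly is generated. The struct packet below is a made-up example of something that might live in a header; the offset of 2 and the size of 4 hold on typical ABIs but are not guaranteed by the C standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Imagine this declaration comes from a header (hypothetical packet.h). */
struct packet {
    uint8_t  tag;
    uint8_t  flags;
    uint16_t length;
};

/* The compiler folds this into a constant; no header is needed later. */
size_t packet_length_offset(void)
{
    return offsetof(struct packet, length);
}

/* In the generated assembly this is just a load from "p + constant". */
uint16_t packet_length(const struct packet *p)
{
    assert(p != NULL);
    return p->length;
}
```

Compiling this with gcc -S shows the field access as a load at a fixed offset from the pointer, with no trace of the struct declaration itself.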
I want to use functions in the header files gmp.h and mpfr.h, which are in the file /opt/local/include.
But when I run gcc with -v, all of the search paths are something like /Application/Xcode.app/Contents/etc.
I have tried adding LD_LIBRARY_PATH="/opt/local/include" to .bash_profile but it doesn't work. The compiler either tells me that 'gmp.h' file not found, or Undefined symbols for architecture x86_64.
What should I do?
Converting comments into an answer.
You need to add -I/opt/local/include to compile commands (to specify where the headers are) and -L/opt/local/lib and -lgmp and -lmpfr (possibly in the reverse order — MPFR before GMP) to link commands.
That works! Would you mind explaining a little bit the logic behind this? For example if I had another header file header.h I need, how should I include it?
You include it with #include "header.h". You compile the code with -I/directory/containing/header to find the header. You specify where the library (libheader.a or libheader.dylib, since you seem to be on macOS) is too, with -L/directory/containing/lib and -lheader — or whatever is appropriate.
The -I tells the preprocessor to look in the named directory for header files, so it looks for /directory/containing/header/header.h, for example.
The -L tells the linker where to find libraries (so it looks for /directory/containing/lib/libheader.dylib etc).
The -lheader tells the linker to look for libheader.a or libheader.dylib (or local equivalents) for the libraries.
Except for the use of .dylib vs .so vs .dll vs … (and .a vs .lib vs …), the same principles apply to other systems too.
This is probably a duplicate.
What if I want these externals to be resolved in runtime with dlopen?
I'm trying to understand why including an .h file that declares a shared library's external variables and functions in a C executable program results in undefined/unresolved symbols when linking.
Why do I have to add the "-lsomelib" flag to the gcc link command if I only want these symbols to be resolved at runtime?
What does the link-time linker need these definitions resolved for? Why can't it wait for the resolution at runtime when using dlopen?
Can anyone help me understand this?
Here is something that may help:
There are 3 types of linking:
static linking (.a): the linker copies the needed content of the library into your executable at link time, so that you can move the program to other computers with the same architecture and run it.
dynamic linking (.so): the linker resolves the symbols at link time, but does not include the code of the library in your executable. When the program is started, the library is loaded, and if the library is not found the program stops. You need the library on the computer that is running the program.
dynamic loading: you are in charge of loading the library functions at runtime, using dlopen etc. This is used especially for plugins.
see also: http://www.ibm.com/developerworks/library/l-dynamic-libraries/ and
Difference between shared objects (.so), static libraries (.a), and DLL's (.so)?
A header file (e.g. an *.h file referenced by some #include directive) is relevant to the C or C++ compiler. The linker does not know about source files (which are input to the compiler), but only about object files produced by the assembler (in executable and linkable format, i.e. ELF)
A library file (given by -lfoo) is relevant only at link time. The compiler does not know about libraries.
The dynamic linker needs to know which libraries should be linked. At runtime it does symbol resolution (against a fixed & known set of shared libraries). The dynamic linker won't try linking all the possible shared libraries present on your system (because it has too many shared objects, or because it may have several conflicting versions of a given library), it will link only a fixed set of libraries provided inside the executable. Use objdump(1) & readelf(1) & nm(1) to explore ELF object files and executables, and ldd(1) to understand shared libraries dependencies.
Notice that the g++ program is used both for compilation and for linking. (actually it is a driver program: it starts some cc1plus -the C++ compiler proper- to compile a C++ code to an assembly file, some as -the assembler- to assemble an assembly file into an object file, and some ld -the linker- to link object files and libraries).
Run g++ as g++ -v to understand what it is doing, i.e. which program(s) it is running.
If you don't link the required libraries, at link time, some references remain unresolved (because some object files contain an external reference and relocation).
(things are slightly more complex with link-time optimization, which we could ignore)
Read also Program Library HowTo, Levine's book linkers and loaders, and Drepper's paper: how to write shared libraries
If you use dynamic loading at runtime (by using dlopen(3) on some plugin), you need to know the type and signature of relevant functions (returned by dlsym(3)). A program loading plugins always has its own specific plugin conventions. For examples, look at the conventions used for geany plugins & GCC plugins (see also these slides about GCC plugins).
In practice, if you are developing your application accepting some plugins, you will define a set of names, their expected type, signature, and role. e.g.
typedef void plugin_start_function_t (const char*);
typedef int plugin_more_function_t (int, double);
then declare e.g. some variables (or fields in a data structure) to point to them with a naming convention
plugin_start_function_t* plustart; // app_plugin_start in plugins
#define NAME_plustart "app_plugin_start"
plugin_more_function_t* plumore; // app_plugin_more in plugins
#define NAME_plumore "app_plugin_more"
Then load the plugin and set these pointers, e.g.
void* plugdlh = dlopen(plugin_path, RTLD_NOW);
if (!plugdlh) {
    fprintf(stderr, "failed to load %s: %s\n", plugin_path, dlerror());
    exit(EXIT_FAILURE);
}
then retrieve the symbols:
plustart = dlsym(plugdlh, NAME_plustart);
if (!plustart) {
    fprintf(stderr, "failed to find %s in %s: %s\n",
            NAME_plustart, plugin_path, dlerror());
    exit(EXIT_FAILURE);
}
plumore = dlsym(plugdlh, NAME_plumore);
if (!plumore) {
    fprintf(stderr, "failed to find %s in %s: %s\n",
            NAME_plumore, plugin_path, dlerror());
    exit(EXIT_FAILURE);
}
Then use appropriately the plustart and plumore function pointers.
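As a rough sketch of what "using" the pointers looks like, the snippet below calls through the same typedefs. To keep it self-contained and runnable without any plugin file, the two pointers are initialised with local stand-in functions instead of dlsym() results; those stand-ins, the started flag and run_plugin are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* The plugin conventions from above. */
typedef void plugin_start_function_t (const char*);
typedef int plugin_more_function_t (int, double);

/* Stand-ins for what a real plugin would export (hypothetical bodies). */
static int started;
static void app_plugin_start(const char *arg) { started = (arg != NULL); }
static int app_plugin_more(int n, double x)   { return n + (int)x; }

/* In the real program these would be filled in from dlsym(). */
plugin_start_function_t *plustart = app_plugin_start;
plugin_more_function_t  *plumore  = app_plugin_more;

/* Call the plugin entry points through the function pointers. */
int run_plugin(const char *arg, int n, double x)
{
    assert(plustart != NULL && plumore != NULL);
    plustart(arg);          /* same syntax as an ordinary function call */
    return plumore(n, x);
}
```

Because the pointers carry the full function type, the compiler checks the argument types at each call site even though the target is only known at runtime.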
In your plugin, you need to code
extern "C" void app_plugin_start(const char*);
extern "C" int app_plugin_more (int, double);
and give a definition to both of them. The plugin should be compiled as position independent code, e.g. with
g++ -Wall -fPIC -O -g -c pluginsrc1.c -o pluginsrc1.pic.o
g++ -Wall -fPIC -O -g -c pluginsrc2.c -o pluginsrc2.pic.o
and linked with
g++ -shared pluginsrc1.pic.o pluginsrc2.pic.o -o yourplugin.so
You may want to link extra shared libraries to your plugin.
You generally should link your main program (the one loading plugins) with the -rdynamic link flag (because you want some symbols of your main program to be visible to your plugins).
Read also the C++ dlopen mini howto
I am building a project that builds multiple shared libraries and executable files. All the source files that are used to build these binaries are in a single /src directory, so it is not easy to figure out which source files were used to build each of the binaries (there is a many-to-many relation).
My goal is to write a script that would parse a set of C files for each binary and make sure that only the right functions are called from them.
One option seems to be to try to extract this information from Makefile. But this does not work well with generated files and headers (due to dependence on Includes).
Another option could be to simply browse call graphs, but this would get complicated, because a lot of functions are called by using function pointers.
Any other ideas?
You can first compile your project with debug information (gcc -g) and use objdump to get which source files were included.
objdump -W <some_compiled_binary>
Dwarf format should contain the information you are looking for.
<0><b>: Abbrev Number: 1 (DW_TAG_compile_unit)
< c> DW_AT_producer : (indirect string, offset: 0x5f): GNU C 4.4.3
<10> DW_AT_language : 1 (ANSI C)
<11> DW_AT_name : (indirect string, offset: 0x28): test_3.c
<15> DW_AT_comp_dir : (indirect string, offset: 0x36): /home/auselen/trials
<19> DW_AT_low_pc : 0x82f0
<1d> DW_AT_high_pc : 0x8408
<21> DW_AT_stmt_list : 0x0
In this example, I compiled an object file from test_3.c, which was located in the .../trials directory. Then, of course, you need to write a script around this to collect the related source file names.
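The "script around this" could start from something like the following C function, which pulls the value out of a DW_AT_name line of the objdump -W output. It is a sketch under assumptions: the function name is invented, it takes whatever follows the last ": " on the line, and the caller is expected to feed it only lines that belong to a DW_TAG_compile_unit entry (other entries also carry DW_AT_name).

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: extract the value of a DW_AT_name line, e.g.
 *   <11> DW_AT_name : (indirect string, offset: 0x28): test_3.c
 * Returns 0 and copies the name into `out`, or -1 if the line does not
 * carry DW_AT_name or the name does not fit. */
int dwarf_name_from_line(const char *line, char *out, size_t out_size)
{
    assert(line != NULL && out != NULL);

    const char *attr = strstr(line, "DW_AT_name");
    if (attr == NULL)
        return -1;

    /* The value is whatever follows the last ": " on the line. */
    const char *value = NULL;
    for (const char *p = attr; (p = strstr(p, ": ")) != NULL; p += 2)
        value = p + 2;
    if (value == NULL)
        return -1;

    /* Trim trailing whitespace such as the newline left by fgets(). */
    size_t len = strlen(value);
    while (len > 0 && isspace((unsigned char)value[len - 1]))
        len--;
    if (len == 0 || len >= out_size)
        return -1;

    memcpy(out, value, len);
    out[len] = '\0';
    return 0;
}
```

Reading the objdump output line by line and collecting the names this returns gives one source file name per compile unit.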
First you need to separate the debug symbols from the binary you just compiled. check this question on how to do so:
How to generate gcc debug symbol outside the build target?
Then you can try to parse this file on your own. I know how to do so for Visual Studio but as you are using GCC I won't be able to help you further.
Here is an idea, which you will need to refine for your specific build. Make a build and log it using script (for example script log.txt make clean all). The last (or one of the last) steps should be the linking of object files. (Tip: look for cc -o <your_binary_name>). That line should link all .o files which should have corresponding .c files in your tree. Then grep those .c files for all the included header files.
If you have duplicate names in your .c files in your tree, then we'll need to look at the full path in the linker line or work from the Makefile.
What Mahmood suggests below should work too. If you have an image with symbols, strings <debug_image> | grep <full_path_of_src_directory> should give you a list of C files.
You can use the Unix nm tool. It shows all symbols that are defined or referenced in an object. So you need to:
Run nm on your binary and grab all undefined symbols
Run ldd on your binary to grab the list of all its dynamic dependencies (.so files your binary is linked to)
Run nm on each .so file you found in step 2.
That will give you the full list of dynamic symbols that your binary uses.
Example:
nm -C --dynamic /bin/ls
....skipping.....
00000000006186d0 A _edata
0000000000618c70 A _end
U _exit
0000000000410e34 T _fini
0000000000401d88 T _init
U _obstack_begin
U _obstack_newchunk
U _setjmp
U abort
U acl_extended_file
U bindtextdomain
U calloc
U clock_gettime
U closedir
U dcgettext
U dirfd
All the symbols marked with a capital "U" are undefined in ls itself, i.e. they are used by the ls command and must be provided by a shared library.
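Picking the "U" entries out of such a listing is straightforward to automate. Below is a minimal C sketch; the function name nm_undefined_symbol and the 255-character limit are assumptions for this example, and it only recognises the plain "U" type (weak undefined symbols, shown as "w" or "v", are ignored).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: if `line` is an nm output line for an undefined
 * symbol (e.g. "                 U abort"), copy the symbol name into
 * `name` and return 1. Return 0 for any other line. */
int nm_undefined_symbol(const char *line, char *name, size_t name_size)
{
    assert(line != NULL && name != NULL);

    char type;
    char sym[256];
    /* Undefined entries have no address column, so the first
     * non-blank character on the line is the type letter. */
    if (sscanf(line, " %c %255[^\n]", &type, sym) != 2 || type != 'U')
        return 0;
    if (strlen(sym) >= name_size)
        return 0;

    strcpy(name, sym);
    return 1;
}
```

Lines for defined symbols start with a hexadecimal address, so their first non-blank character can never be "U" and they are skipped.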
If your goal is to analyze C source files, you can do that by customizing the GCC compiler. You could use MELT for that purpose (MELT is a high-level domain specific language to extend GCC) -adding your own analyzing passes coded in MELT inside GCC-, but you should first learn about GCC middle-end internal representations (Gimple, Tree, ...).
Customizing GCC takes several days of work (mostly because GCC internals are quite complex in the details).
Feel free to ask me more about MELT.