Find all symbols in a directory - c

I am trying to figure out which C library to link against when compiling a program that includes its header, in this case #include <pcre2.h>. The only way I've been able to find the file I need is to check for a specific symbol that I know must be exported. For example:
$ ls
CMakeCache.txt Makefile install_manifest.txt libpcre2-posix.pc pcre2_grep_test.sh
CMakeFiles a.out libpcre2-8.a pcre2-config pcre2_test.sh
CTestCustom.ctest cmake_install.cmake libpcre2-8.pc pcre2.h pcre2grep
CTestTestfile.cmake config.h libpcre2-posix.a pcre2_chartables.c pcre2test
$ objdump -t libpcre2-8.a|grep pcre2_compile
pcre2_compile.c.o: file format elf64-x86-64
0000000000000000 l df *ABS* 0000000000000000 pcre2_compile.c
00000000000100bc g F .text 00000000000019dd pcre2_compile_8
0000000000000172 g F .text 00000000000000e3 pcre2_compile_context_create_8
0000000000000426 g F .text 0000000000000055 pcre2_compile_context_copy_8
0000000000000557 g F .text 0000000000000032 pcre2_compile_context_free_8
And because the symbol pcre2_compile_8 exists in that file (after trying every other file...) I know that the library I need to include is pcre2-8, that is, I compile my code with:
$ gcc myfile.c -lpcre2-8 -o myfile; ./myfile
Two questions related to this:
Is there a simpler way to find symbols in a batch of files (some of which are not ELF files)? For example, something like objdump -t *? Or what's the closest thing to doing that?
Is there a better way to find out what the library value of -l<library> is? Or, what's the common way for someone who downloads a new C program to know what to add to their command line so that the program works? (For me, I've just spent the last hour figuring out that it's -lpcre2-8 and not -lpcre or -lpcre2.)

Usually, the function you call from the library will be a symbol defined by that library. But in PCRE2, due to different code unit sizes, the function you call (e.g. pcre2_compile) actually becomes a different symbol through preprocessor macros (e.g. pcre2_compile_8). You can find the symbol you need from the library by compiling your program and checking the undefined symbols:
$ cat test.c
#define PCRE2_CODE_UNIT_WIDTH 8
#include <pcre2.h>
int main() {
pcre2_compile("",0,0,NULL,NULL,NULL);
}
$ gcc -c test.c
$ nm -u test.o
U _GLOBAL_OFFSET_TABLE_
U pcre2_compile_8
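The renaming can be illustrated with a small self-contained sketch. It does not use pcre2.h; every demo_ and DEMO_ name below is a hypothetical stand-in, and the macros only imitate the general token-pasting technique, assuming the real header works along similar lines:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for PCRE2_CODE_UNIT_WIDTH. */
#define DEMO_CODE_UNIT_WIDTH 8

/* Two-level pasting so that DEMO_CODE_UNIT_WIDTH is expanded to 8
   before the ## operator glues the tokens together. */
#define DEMO_GLUE(a, b) a##b
#define DEMO_JOIN(a, b) DEMO_GLUE(a, b)
#define DEMO_SUFFIX(name) DEMO_JOIN(name, DEMO_CODE_UNIT_WIDTH)

/* The name the programmer writes is only a macro... */
#define demo_compile DEMO_SUFFIX(demo_compile_)

/* Two-level stringification: expand the argument first, then quote it. */
#define DEMO_STR2(x) #x
#define DEMO_STR(x) DEMO_STR2(x)

/* ...and only the suffixed function exists as a symbol. */
size_t demo_compile_8(const char *pattern)
{
    assert(pattern != NULL);
    return strlen(pattern);
}

/* Returns the identifier that the compiler (and so the linker) sees. */
const char *demo_linker_symbol(void)
{
    return DEMO_STR(demo_compile);
}
```

A call written as demo_compile("abc") compiles into a call to demo_compile_8, which is why nm -u reports the suffixed name rather than the one in the source.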
Is there a simpler way to find symbols in a batch of files?
You can search a directory (/usr/lib/ below) for library files (.a or .so extension below), run nm on each, and search the output for the symbol that was undefined in your object file (adapted from this question):
$ for lib in $(find /usr/lib/ -name \*.a -o -name \*.so)
> do
> nm -A --defined-only $lib 2>/dev/null| grep pcre2_compile_8
> done
/usr/lib/x86_64-linux-gnu/libpcre2-8.a:libpcre2_8_la-pcre2_compile.o:0000000000007f40 T pcre2_compile_8
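Note that grep matches substrings, so a search for pcre2_compile would also hit pcre2_compile_8. If the nm output is post-processed in a program instead, an exact match is easy to do. The function below is a minimal sketch with a hypothetical name (nm_line_defines); it assumes the line layout shown above, where the symbol name is the last field and the one-letter type precedes it:

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if one line of `nm -A` output reports `symbol` as a defined
   symbol, 0 otherwise. Assumes the layout "file:[member:]address T name".
   Types U, w and v are treated as "not defined here". */
int nm_line_defines(const char *line, const char *symbol)
{
    assert(line != NULL && symbol != NULL);

    /* Ignore a trailing newline, if any. */
    size_t len = strlen(line);
    while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r'))
        len--;

    /* The symbol name is everything after the last space. */
    size_t start = len;
    while (start > 0 && line[start - 1] != ' ')
        start--;

    /* There must be room for "<type><space>" before the name. */
    if (start < 2)
        return 0;

    char type = line[start - 2];
    size_t name_len = len - start;

    /* Exact comparison, unlike grep's substring match. */
    if (name_len != strlen(symbol) ||
        strncmp(line + start, symbol, name_len) != 0)
        return 0;

    return !(type == 'U' || type == 'w' || type == 'v');
}
```

Feeding it each line read from nm (for example through popen) would give the same answer as the loop above, minus the false positives from partial names.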
Is there a better way to find out what the library value of -l is?
It is usually conveyed through the library documentation. For PCRE2, the second page of the documentation talks about the pcre2-config tool that gives the appropriate flags:
pcre2-config returns the configuration of the installed PCRE2 libraries and the options required to compile a program to use them. Some of the options apply only to the 8-bit, or 16-bit, or 32-bit libraries, respectively, and are not available for libraries that have not been built.
[...]
--libs8 Writes to the standard output the command line options required to link with the 8-bit PCRE2 library (-lpcre2-8 on many systems).
[...]
--cflags Writes to the standard output the command line options required to compile files that use PCRE2 (this may include some -I options, but is blank on many systems).
So for this particular library, the recommended way to build and link is:
gcc -c $(pcre2-config --cflags) test.c -o test.o
gcc test.o -o test $(pcre2-config --libs8)

Related

C compiler not finding gmp linked library

I'm trying to compile a simple C program on an M1 Mac that includes gmp.h. For whatever reason, however, it continuously says that it can't find the library. I checked to see if it exists via locate libgmp.a to which it says it is located in /opt/local/lib/libgmp.a. I also attempted to install it via brew to no avail. I really don't understand what I'm doing wrong here. I also checked the PATH environment variable to make sure this path was included, which it is.
The output of find /opt/local -type f -name "gmp.h" is /opt/local/include/gmp.h, and the output of find /opt/local -type f -name "libgmp*" is
/opt/local/libgmp.a
/opt/local/lib/libgmpxx.4.dylib
/opt/local/lib/libgmp.a
/opt/local/lib/libgmpxx.a
/opt/local/lib/libgmp.10.dylib
Compiling and linking with -L/opt/local/lib -lgmp does not fix the issue.
Command used: gcc main.c -o main.o -L/opt/local/lib -lgmp
To compile a program that relies on a header file:
#include <gmp.h>
located in a non-standard location /usr/local/include/gmp.h you need to tell the compiler where to find it by setting the directory where to find the header file:
gcc -c main.c -I/usr/local/include
Note: cpp -x c -v lists the directories that are searched for header files.
To link the program, you need to tell the compiler which library to use (-l) and where to find that library (-L), which in this case is /usr/local/lib/libgmp.a:
gcc main.o -o main -L/usr/local/lib -lgmp
Note: gcc -print-search-dirs | grep ^libraries shows the directories that are searched for libraries.
If you are using a dynamic library (.so), you may also need to tell the linker where to find the library at run-time:
export LD_LIBRARY_PATH=/usr/local/lib
./main

dietlibc, lowfat, opentracker - compiling against alternative libc

I'm attempting to build opentracker. My system has the following:
| package | library | headers |
| lowfat | /usr/lib/libowfat.a | /usr/include/libowfat |
| dietlibc | /opt/diet/lib-x86_64/*.a | /usr/diet/include |
| glibc | /usr/lib/*.{a,so} | /usr/include |
Looking at the Makefile for opentracker, I see (essentially) the following:
PREFIX?=..
LIBOWFAT_HEADERS=$(PREFIX)/libowfat
LIBOWFAT_LIBRARY=$(PREFIX)/libowfat
CFLAGS+=-I$(LIBOWFAT_HEADERS) -Wall -pipe -Wextra
LDFLAGS+=-L$(LIBOWFAT_LIBRARY) -lowfat -pthread -lpthread -lz
opentracker: $(OBJECTS) $(HEADERS)
cc -o $@ $(OBJECTS) $(LDFLAGS)
I've not compiled against an alternative libc before, so I'm including this information in case I've done this part wrong. When I invoke make, I need to point it at where my system has dietlibc and lowfat live. I'm doing it like this:
$ LDFLAGS=-L/opt/diet/lib-x86_64 make PREFIX=/opt/diet LIBOWFAT_HEADERS=/usr/include/libowfat LIBOWFAT_LIBRARY=/usr/lib
...
...
cc -o opentracker opentracker.o trackerlogic.o scan_urlencoded_query.o ot_mutex.o ot_stats.o ot_vector.o ot_clean.o ot_udp.o ot_iovec.o ot_fullscrape.o ot_accesslist.o ot_http.o ot_livesync.o ot_rijndael.o -L/opt/diet/lib-x86_64 -L/usr/lib -lowfat -pthread -lpthread -lz
/usr/bin/ld: /usr/lib/libowfat.a(io_fd.o):(.bss+0xb0): multiple definition of `first_deferred'; /usr/lib/libowfat.a(io_close.o):(.data+0x0): first defined here
...
... lots of warnings ...
/usr/bin/ld: opentracker.o: undefined reference to symbol '__ctype_b_loc@@GLIBC_2.3'
/usr/bin/ld: /usr/lib/libc.so.6: error adding symbols: DSO missing from command line
Looks like there's two issues going on in there.
Multiple definitions of first_deferred
I see references to first_deferred in both io_close and io_fd, but they are in different sections.
$ objdump -t /usr/lib/libowfat.a | egrep '^[^:]+.o:|first_deferred' | grep -B1 first_deferred
io_close.o: file format elf64-x86-64
0000000000000000 g O .data 0000000000000008 first_deferred
--
io_fd.o: file format elf64-x86-64
00000000000000b0 g O .bss 0000000000000008 first_deferred
--
io_waituntil2.o: file format elf64-x86-64
0000000000000000 *UND* 0000000000000000 first_deferred
In io/io_fd.c, there's an #include io_internal.h and in that header there's an extern long first_deferred;. In io/io_close.c it's defined as long first_deferred=-1. So it doesn't look like it's double defined in the libowfat code itself. Did I compile lowfat wrong?
DSO missing from command line / symbol '__ctype_b_loc@@GLIBC_2.3'
Since the Makefile is trying to compile against dietlibc, I'm a bit surprised that there's a reference to glibc (but, to be honest, also not surprised at all).
Here's the recipe for opentracker.o:
cc -c -o opentracker.o -march=x86-64 -mtune=generic -O2 -pipe -fno-plt -I/usr/include/libowfat -Wall -pipe -Wextra -O3 -DWANT_FULLSCRAPE opentracker.c
This doesn't appear to have the -L/opt/diet/lib-x86_64 argument from LDFLAGS that is used for the main executable. Should it? I don't think so as that's a linker argument so it would not make sense to add it to the compile command. I don't see any references to glibc in the object file:
$ objdump -t ./src/opentracker/opentracker.o | grep -c 'glib'
0
DSO missing from command line / symbol '__ctype_b_loc@@GLIBC_2.3'
I found two permutations to solve this issue. Option one is to make sure the very first -L argument is the location of dietlibc's lib directory, so that all symbols are resolved from there first.
The other permutation was to invoke make via the /opt/diet/bin/diet wrapper program. From the dietlibc FAQ
Q: How do I install it? make install?
A: Yep. It will then install itself to /opt/diet, with the wrapper in
/opt/diet/bin/diet. Or you don't install it at all.
The diet libc comes with a wrapper called "diet", which can be found
in bin-$(ARCH)/diet, i.e. bin-i386/diet for most of us. Copy this
wrapper somewhere in your path (for example ~/bin) and then just
compile stuff by prepending diet to the command line, e.g. "diet gcc
-pipe -g -o t t.c".
Q: How do I compile programs using autoconf with the diet libc?
A: Set CC in the environment properly. For Bourne Shells:
$ CC="diet gcc -nostdinc" ./configure --disable-nls
That should be enough, but you might also want to set
--disable-shared and --enable-static for packages using libtool.
It's not explained anywhere on the website, as far as I can tell, what the wrapper program does. The code is annoying to read due to all the architecture specific #ifdefs, but the file comment indicates it just modifies the gcc command line in an architecture specific way. A quick scan suggests relevant args modifications include: -I/opt/diet/include when compiling, -nostdlib when linking, and possibly -Os.
Multiple definitions of first_deferred
I'm not happy with my workaround here. The symbol is defined in io_internal.h:
#ifndef my_extern
#define my_extern extern
#endif
my_extern long first_deferred;
Why is there a funny redefinition of the extern keyword? Read on. The initialization of this variable is in io_close.c:
#include "io_internal.h"
long first_deferred=-1;
And here's the interesting bit. In io_fd.c:
#define my_extern
#include "io_internal.h"
#undef my_extern
Why? Who knows. The author believes they are clever I guess and saved themselves some keystrokes? The effect of this is that my_extern is defined as an empty string, so when my_extern long first_deferred; is transcluded from the header, it appears as long first_deferred;. This is what leads there to be two locations for the symbol in the archive, as there are two files that reserve space for that symbol.
I'm not happy with my "solution", which was to remove the static initialization from io_close.c. Technically, that means the variable no longer starts at -1: a file-scope variable without an initializer is zero-initialized, so it starts at 0. A quick look at how it gets used suggests this is maybe not safe, but is probably safe enough. The variable is used as an index into an array. Thankfully iarray_get does a bounds check, so it's very likely that if(e) will be false and the variable will get set to -1 as it should be.
if (first_deferred!=-1) {
while (first_deferred!=-1) {
io_entry* e=iarray_get(&io_fds,first_deferred);
if (e) {
if (e->closed) {
e->closed=0;
close(first_deferred);
}
first_deferred=e->next_defer;
} else
first_deferred=-1; // can't happen
}
}
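What removing the initializer does can be checked in isolation. The sketch below is a single-file stand-in, not the libowfat code: the demo_ names are hypothetical, and demo_reset is only one possible alternative (assigning -1 explicitly at startup) that is assumed here for illustration:

```c
#include <assert.h>

/* Same trick as io_internal.h: most files see an extern declaration. */
#ifndef my_extern
#define my_extern extern
#endif
my_extern long demo_first_deferred;

/* Definition with the "= -1" initializer removed. An object with static
   storage duration and no initializer is zero-initialized in C. */
long demo_first_deferred;

/* Value of the variable before any code has assigned to it. */
long demo_initial_value(void)
{
    return demo_first_deferred;
}

/* Mirrors the guard "if (first_deferred != -1)" from the loop above. */
int demo_loop_would_run(void)
{
    return demo_first_deferred != -1;
}

/* Hypothetical alternative: set the sentinel explicitly at startup. */
void demo_reset(void)
{
    demo_first_deferred = -1;
    assert(!demo_loop_would_run());
}
```

With the initializer gone the variable starts at 0, so the guard is true on the first pass and the loop looks up index 0 once before falling back to -1.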
I can't provide a good explanation for those errors, but your post helped me to get it to compile so I figured I'd mention what I did.
The "first_deferred" error seems to come from using a newer version of libowfat, I got past that by using 0.31 instead.
I didn't come across the second error, but I was getting "__you_tried_to_link_a_dietlibc_object_against_glibc" errors which I got past by uninstalling dietlibc and compiling libowfat with glibc instead.
I compiled them the same way as the AUR packages:
https://aur.archlinux.org/packages/opentracker/
https://aur.archlinux.org/packages/libowfat/
Although, instead of installing libowfat, I just put it in the src directory and skipped fetching libowfat from CVS.

Correct usage of strip tool

Unpacked Linaro GCC 6.2-2016.11 toolchain occupies almost 3.4 GB of disc space and I want to make it smaller. My target is armv7-a+vfpv3+hard_float so I have already removed things I do not need (like ld.gold, libraries for Thumb, v8-a, v7ve etc) but it still occupies almost 1 GB.
So I want to use the strip tool to remove redundant information from its binaries.
My main question is how to use strip safely, correctly and efficiently in this case.
In general, the toolchain contains different kinds of binary files to which we can apply strip: *.exe, *.a, *.o.
As far as I can tell, I can apply strip -s (remove all symbols) only to *.exe files (e.g. arm-eabi-gcc.exe). Am I right?
Is it possible to apply strip to libraries (e.g. libgcc.a)?
As I understand it (see the example below), symbols in libraries may be needed for further processing.
If yes, should I use --strip-debug instead (remove debugging symbols only)?
Example below illustrates these questions and reveals more.
Assume that we have three files:
// main.c:
#include "libgcc_test.h"
int main(void)
{
do_something();
return 0;
}
// libgcc_test.c:
void do_something(void)
{
return;
}
// libgcc_test.h:
void do_something(void);
In general we just compile each file separately to get object files which can be linked together:
$ ./arm-eabi-gcc.exe main.c -c
$ ./arm-eabi-gcc.exe libgcc_test.c -c
By analyzing object files we can see that do_something symbol is defined in libgcc_test.o and undefined in main.o, as expected:
$ ./arm-eabi-nm.exe main.o
U do_something
00000000 T main
$ ./arm-eabi-nm.exe libgcc_test.o
00000000 T do_something
If we apply strip -s to both files or only to main.o and try to link them, it works:
$ ./arm-eabi-nm.exe main.o
arm-eabi-nm.exe: main.o: no symbols
$ ./arm-eabi-nm.exe libgcc_test.o
arm-eabi-nm.exe: libgcc_test.o: no symbols
$ ./arm-eabi-ld.exe libgcc_test.o main.o -o main
arm-eabi-ld.exe: warning: cannot find entry symbol _start; defaulting to 00008000
But if we apply strip -s only to libgcc_test.o, linker produces error message:
$ ./arm-eabi-strip.exe -s libgcc_test.o
$ ./arm-eabi-ld.exe libgcc_test.o main.o -o main
arm-eabi-ld.exe: warning: cannot find entry symbol _start; defaulting to 00008000
main.o: In function `main':
main.c:(.text+0x8): undefined reference to `do_something'
As I understand presence of unresolved symbol in an object file forces linker to resolve it. What happens if I remove this symbol from an object file before linking?
Is it correct and safe to remove symbols from object files before linking them together? If yes, which symbols can be removed?
In a real project, if we apply strip -s to toolchain libraries (libgcc.a, libc.a, libm.a, librdimon.a etc.), it similarly produces a lot of "undefined reference to..." messages during the linking stage.
But if we use the --strip-debug option, the linker produces messages for libraries like skipping incompatible libgcc.a when searching for -lgcc. If we revert the libraries, linking succeeds.
What does skipping incompatible... message mean in this case?
Thank you for help.
Just to summarize how I did it. Maybe it will be useful for someone.
I just removed unnecessary libraries/executables and applied strip -s to *.exe files only.
After that the whole toolchain came to ~230 MB.

How to verify that an object file was generated by g++ and not by gcc

On my Linux box (Ubuntu), is there any way to verify that an object file was created from my C sources using g++ and not via gcc?
With readelf, I'm able to read the gcc version that was used to create the object file. Unfortunately it does not tell whether g++ or gcc was used. Maybe there is a debug option to put that info in the comment somehow?
You can set preprocessor macros in your makefile or GCC commandline (whether gcc or g++) which set static constant strings in each object file, to record how things were built and what sourcefile version (if you care about maintenance & support in field). Copy the way source code control system info gets stored in some FOSS programs.
Then you can use strings(1) to check the info saved.
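A minimal sketch of that idea follows. The demo_ names and the tag format are made up for illustration, and __VERSION__ is assumed to be predefined as a string literal (GCC and Clang do this); the fallback covers compilers that do not define it:

```c
#include <assert.h>
#include <string.h>

/* Record which language mode the translation unit was compiled in. */
#ifdef __cplusplus
#define DEMO_BUILD_LANG "c++"
#else
#define DEMO_BUILD_LANG "c"
#endif

/* __VERSION__ is a GCC/Clang predefined macro; fall back if missing. */
#ifdef __VERSION__
#define DEMO_COMPILER_VERSION __VERSION__
#else
#define DEMO_COMPILER_VERSION "unknown"
#endif

/* The "@(#)" prefix is the traditional marker searched for by what(1). */
static const char demo_build_info[] =
    "@(#)demo-build lang=" DEMO_BUILD_LANG
    " compiler=" DEMO_COMPILER_VERSION;

/* Referencing the string from an external function keeps it in the
   object file. */
const char *demo_build_info_string(void)
{
    return demo_build_info;
}

/* Returns 1 if this file was compiled as C++, 0 if compiled as C. */
int demo_built_as_cplusplus(void)
{
    const char *lang = strstr(demo_build_info, "lang=");
    assert(lang != NULL); /* present by construction */
    return strncmp(lang, "lang=c++", 8) == 0;
}
```

After compiling, something like strings file.o | grep demo-build should show whether the lang= field reads c or c++.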
GCC provides both driver programs, gcc and g++, to compile and/or link C and C++ with appropriate defaults. But options mean C++ can be compiled using gcc, and C is (often) valid input to a C++ compiler. Options can be recorded with the compiler switch -frecord-gcc-switches on some targets, or with the debugging option -grecord-gcc-switches. Compilers are compared on how lean and fast their output is, though, so you have to ask explicitly for such "bloat", and the recorded switches may not even show which command was used to invoke the compiler.
For a built executable, ldd can be useful; for an object file, use nm, as suggested in the comments:
ldd ./sob
linux-vdso.so.1 (0x00007fffc37fe000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007f7e99378000)
libm.so.6 => /lib64/libm.so.6 (0x00007f7e99075000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f7e98e5e000)
libc.so.6 => /lib64/libc.so.6 (0x00007f7e98aaf000)
/lib64/ld-linux-x86-64.so.2 (0x00007f7e99680000)
nm ./sob.o
U __cxa_atexit
U __dso_handle
..
U _ZSt4cerr
U _ZSt4cout
U _ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
0000000000000008 b _ZStL8__ioinit
U _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
nm ./mallocstats.o
U fprintf
U mallinfo
0000000000000000 T mallocstats
U stderr
The libstdc++.so.6 dependency suggests ./sob is C++, and that's correct. Mangled names and things like cerr/cout and basic_ostream really indicate C++. The C-ness of mallocstats.o is not in doubt either.
Given addnum.c:
int addnum(int a, int b) {
return a + b;
}
Use nm addnum.o to identify if the object file of addnum.c is compiled by gcc (by gcc -c addnum.c) or g++ (by g++ -c addnum.c).
If compiled by gcc:
nm addnum.o
0000000000000000 T addnum
If compiled by g++:
nm addnum.o
0000000000000000 T _Z6addnumii
That is, the name mangling of the addnum function name is the tell.
Here is a simple solution (workaround) that came to mind. You need to add the following into source code:
#ifdef __cplusplus
extern "C" { int __compiled_using_cplusplus; }
#endif
You can find __compiled_using_cplusplus symbol in object file only if the source was compiled using g++:
alexander@ubuntu:tmp4$ nm test-g++.o
0000000000000000 B __compiled_using_cplusplus
alexander@ubuntu:tmp4$ nm test-gcc.o
alexander@ubuntu:tmp4$ nm test-g++.o | grep -q '__compiled_using_cplusplus' && echo "c++ compiler"
c++ compiler
alexander@ubuntu:tmp4$ nm test-gcc.o | grep -q '__compiled_using_cplusplus' && echo "c++ compiler"

"No symbol version section for versioned symbol"

I'm attempting to cross-compile my own shared library (libmystuff.so) against another shared library (libtheirstuff.so) that makes use of the libcurl shared library and am getting the following error:
libmystuff.so: No symbol version section for versioned symbol 'curl_global_init@@CURL_OPENSSL_3'
Which is then followed by:
final link failed: Nonrepresentable section on output.
Going through the code that creates libtheirstuff, I can see that curl_global_init is the first reference to curl.
Doing ldd libtheirstuff.so on the target platform (arm5) shows that it can find all of the references.
What's going on here?
Edit: Here are the calls to gcc
arm-none-linux-gnueabi-gcc -fPIC -c mystuff_impl.c -o mystuff_impl.o -I/home/me/arm/include
arm-none-linux-gnueabi-gcc -shared -Wl,-soname,libmystuff.so -o libmystuff.so.0.1 mystuff_impl.o -L/home/me/arm/lib -ltheirstuff
Linker was grabbing the wrong version.
This problem (No symbol version section for versioned symbol 'curl_global_init@@CURL_OPENSSL_3') also appears when you are trying to compile a binary that will work on multiple Linux distributions. You can check for the problem like this:
$ objdump -x mybinary | grep curl_global_init
0... F *UND* 0... curl_global_init@@CURL_OPENSSL_3
The solution in this case is to build on a machine where libcurl has been compiled with ./configure --disable-versioned-symbols. Binaries compiled this way will work elsewhere (including on systems which use versioned symbols). A portable binary should produce output like this (without any @ signs):
$ objdump -x mybinary | grep curl_global_init
0... F *UND* 0... curl_global_init
