I have a C library (with C headers) which exists in two different versions.
One of them has a function that looks like this:
int test(char * a, char * b, char * c, bool d, int e);
And the other version looks like this:
int test(char * a, char * b, char * c, bool d);
(for which e is not given as function parameter but it's hard-coded in the function itself).
The library or its headers do not define / include any way to check for the library version so I can't just use an #if or #ifdef to check for a version number.
Is there any way I can write a C program that can be compiled with both versions of this library, depending on which one is installed when the program is compiled? That way contributors that want to compile my program are free to use either version of the library and the tool would be able to be compiled with either.
So, to clarify, I'm looking for something like this (or similar):
#if HAS_ARGUMENT_COUNT(test, 5)
test("a", "b", "c", true, 20);
#elif HAS_ARGUMENT_COUNT(test, 4)
test("a", "b", "c", true);
#else
#error "wrong argument count"
#endif
Is there any way to do that in C? I was unable to figure out a way.
The library would be libogc ( https://github.com/devkitPro/libogc ) which changed its definition of if_config a while ago, and I'd like to make my program work with both the old and the new version. I was unable to find any version identifier in the library. At the moment I'm using a modified version of GCC 8.3.
This should be done at the configure stage, using an Autoconf (or CMake, or whatever) test step -- basically, attempting to compile a small program which uses the five-parameter signature, and seeing if it compiles successfully -- to determine which version of the library is in use. That can be used to set a preprocessor macro which you can use in an #if block in your code.
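As a rough sketch of what the result of such a configure step could look like in your code: the build system tries to compile a tiny probe file that calls the function with five arguments and, if that succeeds, passes a macro to the compiler. The names HAVE_TEST_5ARGS and lib_test below are made up for illustration (lib_test stands in for the real library function so the snippet compiles on its own); only the wrapper pattern matters.

```c
#include <stdbool.h>

/*
 * Stand-in for the library header so this sketch compiles on its own.
 * HAVE_TEST_5ARGS and lib_test are hypothetical names; in a real build the
 * macro would be passed by the configure step (e.g. -DHAVE_TEST_5ARGS) after
 * a probe file calling the function with five arguments compiled cleanly.
 */
#ifdef HAVE_TEST_5ARGS
static int lib_test(char *a, char *b, char *c, bool d, int e)
{
    (void)a; (void)b; (void)c; (void)d;
    return e;
}
#else
static int lib_test(char *a, char *b, char *c, bool d)
{
    (void)a; (void)b; (void)c; (void)d;
    return 20;                    /* value hard-coded inside the library */
}
#endif

/* The rest of the program calls only this wrapper. */
static int call_lib_test(char *a, char *b, char *c, bool d, int e)
{
#ifdef HAVE_TEST_5ARGS
    return lib_test(a, b, c, d, e);
#else
    (void)e;                      /* old version ignores the caller's value */
    return lib_test(a, b, c, d);
#endif
}
```

Keeping the #ifdef inside one wrapper means the version difference is handled in a single place rather than at every call site.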
I think there's no way to do this at the preprocessing stage (at least not without some external scripts). On the other hand, if you're using C11 there is a way to detect a function's signature at compile time: _Generic. But remember: you can't use this in a directive like #if, because primary expressions aren't evaluated at the preprocessing stage, so you can't choose between calling the function with signature 1 or 2 at that stage.
#define WEIRD_LIB_FUNC_TYPE(T) _Generic(&(T), \
int (*)(char *, char *, char *, bool, int): 1, \
int (*)(char *, char *, char *, bool): 2, \
default: 0)
printf("test's signature: %d\n", WEIRD_LIB_FUNC_TYPE(test));
// will print 1 if 'test' expects the extra argument, or 2 otherwise
I'm sorry if this does not answer your question. If you really can't detect the version from the "stock" library header file, there are workarounds where you can #ifdef something that's only present in a specific version of that library.
This is just a horrible library design.
Update: after reading the comments, I should clarify for future readers that this isn't possible in the preprocessing stage, but it is still possible at compile time. You just have to conditionally cast the function pointer before the call, based on my snippet above.
typedef int (*TYPE_A)(char *, char *, char *, bool, int);
typedef int (*TYPE_B)(char *, char *, char *, bool);
int newtest(char *a, char *b, char *c, bool d, int e) {
void (*func)(void) = (void (*)(void))&test;
if (_Generic(&test, TYPE_A: 1, TYPE_B: 2, default: 0) == 1) {
return ((TYPE_A)func)(a, b, c, d, e);
}
return ((TYPE_B)func)(a, b, c, d);
}
This indeed works, although casting a function pointer this way might be controversial. The upside is, as @pizzapants184 said, that the condition will be optimized away because the _Generic selection is evaluated at compile time.
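Here is a variant of the same idea that compiles on its own, in case you want to experiment with it. It is only a sketch: sample_test is a stand-in I made up for the library function (here in its four-parameter form), and the _Static_assert is an extra I added so that an unexpected third signature fails the build instead of silently taking the four-argument path.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the library function (four-parameter version). */
static int sample_test(char *a, char *b, char *c, bool d)
{
    (void)a; (void)b; (void)c;
    return d ? 4 : -4;
}

typedef int (*five_arg_fn)(char *, char *, char *, bool, int);
typedef int (*four_arg_fn)(char *, char *, char *, bool);

/* 5 or 4 depending on the declared signature, 0 if it is neither. */
#define SAMPLE_TEST_ARITY \
    _Generic(&sample_test, five_arg_fn: 5, four_arg_fn: 4, default: 0)

/* Fail the build early if the library changes its signature yet again. */
_Static_assert(SAMPLE_TEST_ARITY != 0, "unrecognized signature for sample_test");

static int call_sample_test(char *a, char *b, char *c, bool d, int e)
{
    /* Go through a generic function pointer so both branches compile. */
    void (*fn)(void) = (void (*)(void))&sample_test;

    if (SAMPLE_TEST_ARITY == 5)
        return ((five_arg_fn)fn)(a, b, c, d, e);
    (void)e;
    return ((four_arg_fn)fn)(a, b, c, d);
}
```

Only the branch whose pointer type matches the real function is ever executed; the other one exists just so the file compiles against either header.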
I don't see any way to do that with standard C. If you are compiling with gcc, a very, very ugly way is to run gcc with -aux-info and pass the number of parameters to the real build with -D:
#!/bin/sh
gcc -aux-info output.info demo.c
COUNT=`grep "extern int foo" output.info | tr -dc "," | wc -m`
rm output.info
gcc -o demo demo.c -DCOUNT="$COUNT + 1"
./demo
This snippet
#include <stdio.h>
int foo(int a, int b, int c);
#ifndef COUNT
#define COUNT 0
#endif
int main(void)
{
printf("foo has %d parameters\n", COUNT);
return 0;
}
outputs
foo has 3 parameters
Attempting to support compiling code with multiple versions of a static library serves no useful purpose. Update your code to use the latest release and stop making life more difficult than it needs to be.
In Dennis Ritchie's original C language, a function could be passed any number of arguments, regardless of the number of parameters it expected, provided that the function didn't access any parameters beyond those that were passed to it. Even on platforms whose normal calling convention wouldn't be able to accommodate this flexibility, C compilers would generally use a different calling convention that could support it, unless functions were marked with qualifiers like pascal to indicate that they should use the ordinary calling convention.
Thus, something like the following would have had fully defined behavior in Ritchie's original C language:
int addTwoOrThree(count, x, y, z)
int count, x, y, z;
{
if (count == 3)
return x+y+z;
else
return x+y;
}
int test()
{
return addTwoOrThree(2, 10,20) + addTwoOrThree(3, 1,2,3);
}
Because there are some platforms where it would be impractical to support such flexibility by default, the C Standard does not require that compilers meaningfully process any calls to functions which have more or fewer arguments than expected, except that functions which have been declared with a ... parameter will "expect" any number of arguments that is at least as large as the number of actual specified parameters. It is thus rare for code to be written that would exploit the flexibility that was present in Ritchie's language. Nonetheless, many implementations will still accept code written to support that pattern if the function being called is in a separate compilation unit from the callers, and it is declared but not prototyped within the compilation units that call it.
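For comparison, this is roughly how the same function would be written in standard C today, using the ... parameter mentioned above. The name add_two_or_three is mine; the sketch assumes the caller always passes exactly count additional int arguments.

```c
#include <stdarg.h>

/* Sums `count` int arguments that follow the count itself. */
static int add_two_or_three(int count, ...)
{
    va_list ap;
    int sum = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        sum += va_arg(ap, int);   /* each extra argument must really be an int */
    va_end(ap);
    return sum;
}
```

Unlike the unprototyped version, this has defined behavior for both add_two_or_three(2, 10, 20) and add_two_or_three(3, 1, 2, 3), because the function never reads more arguments than count says were passed.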
You don't.
The tools you're working with are statically linked and don't support versioning.
You can get around it using all kinds of tricks and tips that have been mentioned, but at the end of the day they are ugly patchwork for something that makes no sense in this context (toolkit/code environment).
You design your code for the version of the toolkit you have installed. It's a hard requirement. I also don't understand why you would want to design your GameCube/Wii code to allow building against different versions.
The toolkit is constantly changing to fix bugs, assumptions and so on.
If you want your code to use an old version that potentially has bugs or does things wrong, that is on you.
I think you should realize what kind of botch work you're dealing with if you need or want to do this with a constantly evolving toolkit.
I also think (but this is because I know you and your relationship with devkitPro) that you ask this because you have an older version installed and your CI builds won't work because they use a newer version (from Docker). It's either this, or you have multiple versions installed on your machine for a different project you build (but won't update the source for some odd reason).
If your compiler is a recent GCC, e.g. some GCC 10 in November 2020, you might write your own GCC plugin to check the signature in your header files (and emit appropriate and related C preprocessor #define-s and/or #ifdef, à la GNU autoconf). Your plugin could (for example) fill some sqlite database and you would later generate some #include-d header file.
You then would set up your build automation (e.g. your Makefile) to use that GCC plugin and the data it has computed when needed.
For a single function, such an approach is overkill.
For some large project, it could make sense, in particular if you also decide to code some project-specific coding rules validator in your GCC plugin.
Writing a GCC plugin could take weeks of your time, and you may need to patch your plugin source code when you would switch to a future GCC 11.
See also this draft report and the European CHARIOT and DECODER projects (funding the work described in that report).
BTW, you might ask the authors of that library to add some versioning metadata. Inspiration might come from libonion or Glib or libgccjit.
BTW, as rightly commented in this issue, you should not use an unmaintained old version of some opensource library. Use the one that is worked on.
I'd like to make my program work with both the old and the new version.
Why?
Making your program work with the old (unmaintained) version of libogc adds a burden for both you and them. I don't understand why you would depend on some old unmaintained library if you can avoid doing that.
PS. You could of course write a plugin for GCC 8. I do recommend switching to GCC 10: it did improve.
I'm not sure this solves your specific problem, or helps you at all, but here's a preprocessor contraption, due to Laurent Deniau, that counts the number of arguments passed to a function at compile time.
Meaning, something like args_count(a,b,c) evaluates (at compile time) to the constant 3, and something like args_count(__VA_ARGS__) (within a variadic macro) evaluates (at compile time) to the number of arguments passed to the macro.
This allows you, for instance, to call variadic functions without specifying the number of arguments, because the preprocessor does it for you.
So, if you have a variadic function
void function_backend(int N, ...){
// do stuff
}
where you (typically) HAVE to pass the number of arguments N, you can automate that process by writing a "frontend" variadic macro
#define function_frontend(...) function_backend(args_count(__VA_ARGS__), __VA_ARGS__)
And now you call function_frontend() with as many arguments as you want:
I made a YouTube tutorial about this.
#include <stdint.h>
#include <stdarg.h>
#include <stdio.h>
#define m_args_idim__get_arg100( \
arg00,arg01,arg02,arg03,arg04,arg05,arg06,arg07,arg08,arg09,arg0a,arg0b,arg0c,arg0d,arg0e,arg0f, \
arg10,arg11,arg12,arg13,arg14,arg15,arg16,arg17,arg18,arg19,arg1a,arg1b,arg1c,arg1d,arg1e,arg1f, \
arg20,arg21,arg22,arg23,arg24,arg25,arg26,arg27,arg28,arg29,arg2a,arg2b,arg2c,arg2d,arg2e,arg2f, \
arg30,arg31,arg32,arg33,arg34,arg35,arg36,arg37,arg38,arg39,arg3a,arg3b,arg3c,arg3d,arg3e,arg3f, \
arg40,arg41,arg42,arg43,arg44,arg45,arg46,arg47,arg48,arg49,arg4a,arg4b,arg4c,arg4d,arg4e,arg4f, \
arg50,arg51,arg52,arg53,arg54,arg55,arg56,arg57,arg58,arg59,arg5a,arg5b,arg5c,arg5d,arg5e,arg5f, \
arg60,arg61,arg62,arg63,arg64,arg65,arg66,arg67,arg68,arg69,arg6a,arg6b,arg6c,arg6d,arg6e,arg6f, \
arg70,arg71,arg72,arg73,arg74,arg75,arg76,arg77,arg78,arg79,arg7a,arg7b,arg7c,arg7d,arg7e,arg7f, \
arg80,arg81,arg82,arg83,arg84,arg85,arg86,arg87,arg88,arg89,arg8a,arg8b,arg8c,arg8d,arg8e,arg8f, \
arg90,arg91,arg92,arg93,arg94,arg95,arg96,arg97,arg98,arg99,arg9a,arg9b,arg9c,arg9d,arg9e,arg9f, \
arga0,arga1,arga2,arga3,arga4,arga5,arga6,arga7,arga8,arga9,argaa,argab,argac,argad,argae,argaf, \
argb0,argb1,argb2,argb3,argb4,argb5,argb6,argb7,argb8,argb9,argba,argbb,argbc,argbd,argbe,argbf, \
argc0,argc1,argc2,argc3,argc4,argc5,argc6,argc7,argc8,argc9,argca,argcb,argcc,argcd,argce,argcf, \
argd0,argd1,argd2,argd3,argd4,argd5,argd6,argd7,argd8,argd9,argda,argdb,argdc,argdd,argde,argdf, \
arge0,arge1,arge2,arge3,arge4,arge5,arge6,arge7,arge8,arge9,argea,argeb,argec,arged,argee,argef, \
argf0,argf1,argf2,argf3,argf4,argf5,argf6,argf7,argf8,argf9,argfa,argfb,argfc,argfd,argfe,argff, \
arg100, ...) arg100
#define m_args_idim(...) m_args_idim__get_arg100(, ##__VA_ARGS__, \
0xff,0xfe,0xfd,0xfc,0xfb,0xfa,0xf9,0xf8,0xf7,0xf6,0xf5,0xf4,0xf3,0xf2,0xf1,0xf0, \
0xef,0xee,0xed,0xec,0xeb,0xea,0xe9,0xe8,0xe7,0xe6,0xe5,0xe4,0xe3,0xe2,0xe1,0xe0, \
0xdf,0xde,0xdd,0xdc,0xdb,0xda,0xd9,0xd8,0xd7,0xd6,0xd5,0xd4,0xd3,0xd2,0xd1,0xd0, \
0xcf,0xce,0xcd,0xcc,0xcb,0xca,0xc9,0xc8,0xc7,0xc6,0xc5,0xc4,0xc3,0xc2,0xc1,0xc0, \
0xbf,0xbe,0xbd,0xbc,0xbb,0xba,0xb9,0xb8,0xb7,0xb6,0xb5,0xb4,0xb3,0xb2,0xb1,0xb0, \
0xaf,0xae,0xad,0xac,0xab,0xaa,0xa9,0xa8,0xa7,0xa6,0xa5,0xa4,0xa3,0xa2,0xa1,0xa0, \
0x9f,0x9e,0x9d,0x9c,0x9b,0x9a,0x99,0x98,0x97,0x96,0x95,0x94,0x93,0x92,0x91,0x90, \
0x8f,0x8e,0x8d,0x8c,0x8b,0x8a,0x89,0x88,0x87,0x86,0x85,0x84,0x83,0x82,0x81,0x80, \
0x7f,0x7e,0x7d,0x7c,0x7b,0x7a,0x79,0x78,0x77,0x76,0x75,0x74,0x73,0x72,0x71,0x70, \
0x6f,0x6e,0x6d,0x6c,0x6b,0x6a,0x69,0x68,0x67,0x66,0x65,0x64,0x63,0x62,0x61,0x60, \
0x5f,0x5e,0x5d,0x5c,0x5b,0x5a,0x59,0x58,0x57,0x56,0x55,0x54,0x53,0x52,0x51,0x50, \
0x4f,0x4e,0x4d,0x4c,0x4b,0x4a,0x49,0x48,0x47,0x46,0x45,0x44,0x43,0x42,0x41,0x40, \
0x3f,0x3e,0x3d,0x3c,0x3b,0x3a,0x39,0x38,0x37,0x36,0x35,0x34,0x33,0x32,0x31,0x30, \
0x2f,0x2e,0x2d,0x2c,0x2b,0x2a,0x29,0x28,0x27,0x26,0x25,0x24,0x23,0x22,0x21,0x20, \
0x1f,0x1e,0x1d,0x1c,0x1b,0x1a,0x19,0x18,0x17,0x16,0x15,0x14,0x13,0x12,0x11,0x10, \
0x0f,0x0e,0x0d,0x0c,0x0b,0x0a,0x09,0x08,0x07,0x06,0x05,0x04,0x03,0x02,0x01,0x00, \
)
typedef struct{
int32_t x0,x1;
}ivec2;
int32_t max0__ivec2(int32_t nelems, ...){ // The largest component 0 in a list of 2D integer vectors
int32_t max = ~(1ll<<31) + 1; // Assuming two's complement
va_list args;
va_start(args, nelems);
for(int i=0; i<nelems; ++i){
ivec2 a = va_arg(args, ivec2);
max = max > a.x0 ? max : a.x0;
}
va_end(args);
return max;
}
#define max0_ivec2(...) max0__ivec2(m_args_idim(__VA_ARGS__), __VA_ARGS__)
int main(){
int32_t max = max0_ivec2(((ivec2){0,1}), ((ivec2){2,3}), ((ivec2){4,5}), ((ivec2){6,7}));
printf("%d\n", max);
}
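If the 256-slot table above is hard to follow, here is a much smaller sketch of the same counting trick, limited to between 1 and 8 arguments and using only standard C99 variadic macros (no ##__VA_ARGS__ extension, so it does not handle zero arguments). The names ARGS_COUNT, ARGS_COUNT_PICK, sum_ints and sum_ints_backend are mine, chosen just for this example.

```c
#include <stdarg.h>

/* Picks the 9th argument; the reversed number list below shifts into place. */
#define ARGS_COUNT_PICK(_1, _2, _3, _4, _5, _6, _7, _8, N, ...) N
/* Evaluates to the number of arguments given, for 1 to 8 arguments. */
#define ARGS_COUNT(...) ARGS_COUNT_PICK(__VA_ARGS__, 8, 7, 6, 5, 4, 3, 2, 1, 0)

/* Backend: sums n ints passed as variadic arguments. */
static int sum_ints_backend(int n, ...)
{
    va_list args;
    int total = 0;

    va_start(args, n);
    for (int i = 0; i < n; ++i)
        total += va_arg(args, int);
    va_end(args);
    return total;
}

/* Frontend: the preprocessor supplies the count. */
#define sum_ints(...) sum_ints_backend(ARGS_COUNT(__VA_ARGS__), __VA_ARGS__)
```

Each real argument pushes the reversed list one slot to the right, so the value that lands in position N is exactly the argument count. Note that this counts the arguments of a macro call you write yourself; it cannot inspect how many parameters an existing function declares.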
I am able to get a list of exported function names and pointers from an executable in Windows by using the PIMAGE_DOS_HEADER API (example).
What is the equivalent API for Linux?
For context I am creating unit test executables and I am exporting functions starting with the name "test_" and I want the executable to just spin through and execute all of the test functions when run.
Example pseudocode:
int main(int argc, char** argv)
{
auto run = new_trun();
auto module = dlopen(NULL);
auto exports = get_exports(module); // <- how do I do this on unix?
for( auto i = 0; i < exports->length; i++)
{
auto export = exports[i];
if(strncmp("test_", export->name, strlen("test_")) == 0)
{
tcase_add(run, export->name, export->func);
}
}
return trun_run(run);
}
EDIT:
I was able to find what I was after using the top answer from this question:
List all the functions/symbols on the fly in C?
Additionally I had to use the gnu_hashtab_symbol_count function from Nominal Animal's answer below to handle the DT_GNU_HASH instead of the DT_HASH.
My final test main function looks like this:
int main(int argc, char** argv)
{
vector<string> symbols;
dl_iterate_phdr(retrieve_symbolnames, &symbols);
TRun run;
auto handle = dlopen(NULL, RTLD_LOCAL | RTLD_LAZY);
for(auto i = symbols.begin(); i != symbols.end(); i++)
{
auto name = *i;
auto func = (testfunc)dlsym(handle, name.c_str());
TCase tcase;
tcase.name = string(name);
tcase.func = func;
run.test_cases.push_back(tcase);
}
return trun_run(&run);
}
I then define the tests in the assembly like this:
// test.h
#define START_TEST(name) extern "C" EXPORT TResult test_##name () {
#define END_TEST return tresult_success(); }
// foo.cc
START_TEST(foo_bar)
{
assert_pending();
}
END_TEST
Which produces output that looks like this:
test_foo_bar: pending
1 pending
0 succeeded
1 total
I do get quite annoyed when I see questions asking how to do something in operating system X that you do in Y.
In most cases, it is not a useful approach, because each operating system (family) tends to have its own approach to issues, so trying to apply something that works in X in Y is like stuffing a cube into a round hole.
Please note: the text here is intended as harsh, not condescending; my command of the English language is not as good as I'd like. Harshness combined with actual help and pointers to known working solutions seems to work best in overcoming nontechnical limitations, in my experience.
In Linux, a test environment should use something like
LC_ALL=C LANG=C readelf -s FILE
to list all the symbols in FILE. readelf is part of the binutils package, and is installed if you intend to build new binaries on the system. This leads to portable, robust code. Do not forget that Linux encompasses multiple hardware architectures that do have real differences.
To build binaries in Linux, you normally use some of the tools provided in binutils. If binutils provided a library, or there was an ELF library based on the code used in binutils, it would be much better to use that, rather than parse the output of the human utilities. However, there is no such library (the libbfd library binutils uses internally is not ELF-specific). The libelf library (http://www.mr511.de/software/english.html) is good, but it is completely separate work by chiefly a single author. Bugs in it have been reported to binutils, which is unproductive, as the two are not related. Simply put, there are no guarantees that it handles the ELF files on a given architecture the same way binutils does. Therefore, for robustness and reliability, you'll definitely want to use binutils.
If you have a test application, it should use a script, say /usr/lib/yourapp/list-test-functions, to list the test-related functions:
#!/bin/bash
export LC_ALL=C LANG=C
for file in "$@" ; do
readelf -s "$file" | while read num value size type bind vix index name dummy ; do
[ "$type" = "FUNC" ] || continue
[ "$bind" = "GLOBAL" ] || continue
[ "$num" = "$[$num]" ] || continue
[ "$index" = "$[$index]" ] || continue
case "$name" in
test_*) printf '%s\n' "$name"
;;
esac
done
done
This way, if there is an architecture that has quirks (in the binutils' readelf output format in particular), you only need to modify the script. Modifying such a simple script is not difficult, and it is easy to verify the script works correctly -- just compare the raw readelf output to the script output; anybody can do that.
A subroutine that constructs a pipe, fork()s a child process, executes the script in the child process, and uses e.g. getline() in the parent process to read the list of names, is quite simple and extremely robust. Since this is also the one fragile spot, we've made it very easy to fix any quirks or problems here by using that external script (that is customizable/extensible to cover those quirks, and easy to debug).
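A minimal sketch of such a subroutine might look like the following. The function and handler names are mine, and error handling is kept to the bare minimum; argv would typically be the helper script path followed by the binaries to inspect.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs argv[0] with the given arguments and calls handle() once per output
   line (newline stripped). Returns 0 if the child exited with status 0. */
static int for_each_output_line(char *const argv[],
                                void (*handle)(const char *line, void *ctx),
                                void *ctx)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t child = fork();
    if (child == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (child == 0) {
        /* Child: send stdout into the pipe, then run the command. */
        close(fds[0]);
        if (dup2(fds[1], STDOUT_FILENO) == -1)
            _exit(127);
        close(fds[1]);
        execvp(argv[0], argv);
        _exit(127);               /* only reached if exec failed */
    }

    /* Parent: read the names line by line. */
    close(fds[1]);
    FILE *in = fdopen(fds[0], "r");
    if (!in) {
        close(fds[0]);
        waitpid(child, NULL, 0);
        return -1;
    }
    char *line = NULL;
    size_t cap = 0;
    ssize_t len;
    while ((len = getline(&line, &cap, in)) > 0) {
        if (line[len - 1] == '\n')
            line[len - 1] = '\0';
        handle(line, ctx);
    }
    free(line);
    fclose(in);

    int status;
    if (waitpid(child, &status, 0) == -1)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

/* Example handler: counts the lines it is given. */
static void count_line(const char *line, void *ctx)
{
    (void)line;
    ++*(int *)ctx;
}
```

A real test runner would replace count_line with a handler that looks each name up (for example with dlsym()) and registers it as a test case.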
Remember, if binutils itself has bugs (other than output formatting bugs), any binaries built will almost certainly exhibit those same bugs also.
Being a Microsoft-oriented person, you probably will have trouble grasping the benefits of such a modular approach. (It is not specific to Microsoft, but specific to a single-vendor controlled ecosystem where the vendor-pushed approach is via overarching frameworks, and black boxes with clean but very limited interfaces. I think of it as the framework limitation, or vendor-enforced walled garden, or prison garden. Looks good, but getting out is difficult. For description and history on the modular approach I'm trying to describe, see for example the Unix philosophy article at Wikipedia.)
The following shows that your approach is indeed possible in Linux, too -- although clunky and fragile; this stuff is intended to be done using the standard tools instead. It's just not the right approach in general.
The interface, symbols.h, is easiest to implement using a callback function that gets called for each symbol found:
#ifndef SYMBOLS_H
#ifndef _GNU_SOURCE
#error You must define _GNU_SOURCE!
#endif
#define SYMBOLS_H
#include <stdlib.h>
typedef enum {
LOCAL_SYMBOL = 1,
GLOBAL_SYMBOL = 2,
WEAK_SYMBOL = 3,
} symbol_bind;
typedef enum {
FUNC_SYMBOL = 4,
OBJECT_SYMBOL = 5,
COMMON_SYMBOL = 6,
THREAD_SYMBOL = 7,
} symbol_type;
int symbols(int (*callback)(const char *libpath, const char *libname, const char *objname,
const void *addr, const size_t size,
const symbol_bind binding, const symbol_type type,
void *custom),
void *custom);
#endif /* SYMBOLS_H */
The ELF symbol binding and type macros are word-size specific, so to avoid the hassle, I declared the enum types above. I omitted some uninteresting types (STT_NOTYPE, STT_SECTION, STT_FILE), however.
The implementation, symbols.c:
#define _GNU_SOURCE
#include <stdlib.h>
#include <limits.h>
#include <string.h>
#include <stdio.h>
#include <fnmatch.h>
#include <dlfcn.h>
#include <link.h>
#include <errno.h>
#include "symbols.h"
#define UINTS_PER_WORD (__WORDSIZE / (CHAR_BIT * sizeof (unsigned int)))
static ElfW(Word) gnu_hashtab_symbol_count(const unsigned int *const table)
{
const unsigned int *const bucket = table + 4 + table[2] * (unsigned int)(UINTS_PER_WORD);
unsigned int b = table[0];
unsigned int max = 0U;
while (b-->0U)
if (bucket[b] > max)
max = bucket[b];
return (ElfW(Word))max;
}
static symbol_bind elf_symbol_binding(const unsigned char st_info)
{
#if __WORDSIZE == 32
switch (ELF32_ST_BIND(st_info)) {
#elif __WORDSIZE == 64
switch (ELF64_ST_BIND(st_info)) {
#else
switch (ELF_ST_BIND(st_info)) {
#endif
case STB_LOCAL: return LOCAL_SYMBOL;
case STB_GLOBAL: return GLOBAL_SYMBOL;
case STB_WEAK: return WEAK_SYMBOL;
default: return 0;
}
}
static symbol_type elf_symbol_type(const unsigned char st_info)
{
#if __WORDSIZE == 32
switch (ELF32_ST_TYPE(st_info)) {
#elif __WORDSIZE == 64
switch (ELF64_ST_TYPE(st_info)) {
#else
switch (ELF_ST_TYPE(st_info)) {
#endif
case STT_OBJECT: return OBJECT_SYMBOL;
case STT_FUNC: return FUNC_SYMBOL;
case STT_COMMON: return COMMON_SYMBOL;
case STT_TLS: return THREAD_SYMBOL;
default: return 0;
}
}
static void *dynamic_pointer(const ElfW(Addr) addr,
const ElfW(Addr) base, const ElfW(Phdr) *const header, const ElfW(Half) headers)
{
if (addr) {
ElfW(Half) h;
for (h = 0; h < headers; h++)
if (header[h].p_type == PT_LOAD)
if (addr >= base + header[h].p_vaddr &&
addr < base + header[h].p_vaddr + header[h].p_memsz)
return (void *)addr;
}
return NULL;
}
struct phdr_iterator_data {
int (*callback)(const char *libpath, const char *libname,
const char *objname, const void *addr, const size_t size,
const symbol_bind binding, const symbol_type type,
void *custom);
void *custom;
};
static int iterate_phdr(struct dl_phdr_info *info, size_t size, void *dataref)
{
struct phdr_iterator_data *const data = dataref;
const ElfW(Addr) base = info->dlpi_addr;
const ElfW(Phdr) *const header = info->dlpi_phdr;
const ElfW(Half) headers = info->dlpi_phnum;
const char *libpath, *libname;
ElfW(Half) h;
if (!data->callback)
return 0;
if (info->dlpi_name && info->dlpi_name[0])
libpath = info->dlpi_name;
else
libpath = "";
libname = strrchr(libpath, '/');
if (libname && libname[0] == '/' && libname[1])
libname++;
else
libname = libpath;
for (h = 0; h < headers; h++)
if (header[h].p_type == PT_DYNAMIC) {
const ElfW(Dyn) *entry = (const ElfW(Dyn) *)(base + header[h].p_vaddr);
const ElfW(Word) *hashtab;
const ElfW(Sym) *symtab = NULL;
const char *strtab = NULL;
ElfW(Word) symbol_count = 0;
for (; entry->d_tag != DT_NULL; entry++)
switch (entry->d_tag) {
case DT_HASH:
hashtab = dynamic_pointer(entry->d_un.d_ptr, base, header, headers);
if (hashtab)
symbol_count = hashtab[1];
break;
case DT_GNU_HASH:
hashtab = dynamic_pointer(entry->d_un.d_ptr, base, header, headers);
if (hashtab) {
ElfW(Word) count = gnu_hashtab_symbol_count(hashtab);
if (count > symbol_count)
symbol_count = count;
}
break;
case DT_STRTAB:
strtab = dynamic_pointer(entry->d_un.d_ptr, base, header, headers);
break;
case DT_SYMTAB:
symtab = dynamic_pointer(entry->d_un.d_ptr, base, header, headers);
break;
}
if (symtab && strtab && symbol_count > 0) {
ElfW(Word) s;
for (s = 0; s < symbol_count; s++) {
const char *name;
void *const ptr = dynamic_pointer(base + symtab[s].st_value, base, header, headers);
symbol_bind bind;
symbol_type type;
int result;
if (!ptr)
continue;
type = elf_symbol_type(symtab[s].st_info);
bind = elf_symbol_binding(symtab[s].st_info);
if (symtab[s].st_name)
name = strtab + symtab[s].st_name;
else
name = "";
result = data->callback(libpath, libname, name, ptr, symtab[s].st_size, bind, type, data->custom);
if (result)
return result;
}
}
}
return 0;
}
int symbols(int (*callback)(const char *libpath, const char *libname, const char *objname,
const void *addr, const size_t size,
const symbol_bind binding, const symbol_type type,
void *custom),
void *custom)
{
struct phdr_iterator_data data;
if (!callback)
return errno = EINVAL;
data.callback = callback;
data.custom = custom;
return errno = dl_iterate_phdr(iterate_phdr, &data);
}
When compiling the above, remember to link against the dl library.
You may find the gnu_hashtab_symbol_count() function above interesting; the format of the table is not well documented anywhere that I can find. This is tested to work on both i386 and x86-64 architectures, but it should be vetted against the GNU sources before relying on it in production code. Again, the better option is to just use those tools directly via a helper script, as they will be installed on any development machine.
Technically, a DT_GNU_HASH table tells us the first dynamic symbol, and the highest index in any hash bucket tells us the last dynamic symbol, but since the entries in the DT_SYMTAB symbol table always begin at 0 (actually, the 0 entry is "none"), I only consider the upper limit.
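For anyone vetting this, here is a sketch of the counting method as it is commonly described for the GNU hash layout (header of four 32-bit words, then the Bloom filter, then the buckets, then the chain array). This is not the function used above and I have not checked it against the GNU sources: it additionally walks the chain of the highest bucket until it finds the entry whose lowest bit is set, which marks the end of a chain. The function name and the bloom_word_u32s parameter are mine.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * table            points at the DT_GNU_HASH data viewed as 32-bit words
 * bloom_word_u32s  size of one Bloom filter word in 32-bit units
 *                  (1 for 32-bit ELF, 2 for 64-bit ELF)
 * Returns the number of entries in the dynamic symbol table, assuming the
 * layout described in the paragraph above this block.
 */
static uint32_t gnu_hash_symbol_count(const uint32_t *table, size_t bloom_word_u32s)
{
    const uint32_t nbuckets   = table[0];
    const uint32_t symoffset  = table[1];   /* index of first hashed symbol */
    const uint32_t bloom_size = table[2];   /* table[3] is the bloom shift  */
    const uint32_t *buckets = table + 4 + (size_t)bloom_size * bloom_word_u32s;
    const uint32_t *chain   = buckets + nbuckets;

    /* The largest bucket value is the first symbol of the last chain. */
    uint32_t last = 0;
    for (uint32_t b = 0; b < nbuckets; b++)
        if (buckets[b] > last)
            last = buckets[b];

    if (last < symoffset)
        return symoffset;         /* no hashed symbols at all */

    /* Walk that chain; the final entry has its lowest bit set. */
    while ((chain[last - symoffset] & 1u) == 0)
        last++;
    return last + 1;              /* count = highest index + 1 */
}
```

On a typical build, bloom_word_u32s would be sizeof(ElfW(Addr)) / sizeof(uint32_t).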
To match library and function names, I recommend using strncmp() for a prefix match for libraries (match at the start of the library name, up to the first .). Of course, you can use fnmatch() if you prefer glob patterns, or regcomp()+regexec() if you prefer regular expressions (they are built-in to the GNU C library, no external libraries are needed).
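As a small illustration of the two matching styles just mentioned, the helpers below could be called from inside a symbols() callback. The helper names are mine; fnmatch() is the standard POSIX function from <fnmatch.h>.

```c
#include <fnmatch.h>
#include <string.h>

/* True if libname, up to its first '.', equals stem
   (e.g. "libc.so.6" matches "libc", "libcrypt.so.1" does not). */
static int library_stem_is(const char *libname, const char *stem)
{
    size_t n = strcspn(libname, ".");   /* length before the first '.' */
    return strlen(stem) == n && strncmp(libname, stem, n) == 0;
}

/* True if name matches the shell glob pattern, e.g. "test_*". */
static int name_matches_glob(const char *name, const char *pattern)
{
    return fnmatch(pattern, name, 0) == 0;
}
```

In the callback you would then return early unless, say, library_stem_is(libname, "libfoo") and name_matches_glob(objname, "test_*") both hold (libfoo being a placeholder for your own library).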
Here is an example program, example.c, that just prints out all the symbols:
#define _GNU_SOURCE
#include <stdlib.h>
#include <stdio.h>
#include <dlfcn.h>
#include <errno.h>
#include "symbols.h"
static int my_func(const char *libpath, const char *libname, const char *objname,
const void *addr, const size_t size,
const symbol_bind binding, const symbol_type type,
void *custom __attribute__((unused)))
{
printf("%s (%s):", libpath, libname);
if (*objname)
printf(" %s:", objname);
else
printf(" unnamed");
if (size > 0)
printf(" %zu-byte", size);
if (binding == LOCAL_SYMBOL)
printf(" local");
else
if (binding == GLOBAL_SYMBOL)
printf(" global");
else
if (binding == WEAK_SYMBOL)
printf(" weak");
if (type == FUNC_SYMBOL)
printf(" function");
else
if (type == OBJECT_SYMBOL || type == COMMON_SYMBOL)
printf(" variable");
else
if (type == THREAD_SYMBOL)
printf(" thread-local variable");
printf(" at %p\n", addr);
fflush(stdout);
return 0;
}
int main(int argc, char *argv[])
{
int arg;
for (arg = 1; arg < argc; arg++) {
void *handle = dlopen(argv[arg], RTLD_NOW);
if (!handle) {
fprintf(stderr, "%s: %s.\n", argv[arg], dlerror());
return EXIT_FAILURE;
}
fprintf(stderr, "%s: Loaded.\n", argv[arg]);
}
fflush(stderr);
if (symbols(my_func, NULL))
return EXIT_FAILURE;
return EXIT_SUCCESS;
}
To compile and run the above, use for example
gcc -Wall -O2 -c symbols.c
gcc -Wall -O2 -c example.c
gcc -Wall -O2 example.o symbols.o -ldl -o example
./example | less
To see the symbols in the program itself, use the -rdynamic flag at link time to add all symbols to the dynamic symbol table:
gcc -Wall -O2 -c symbols.c
gcc -Wall -O2 -c example.c
gcc -Wall -O2 -rdynamic example.o symbols.o -ldl -o example
./example | less
On my system, the latter prints out
(): stdout: 8-byte global variable at 0x602080
(): _edata: global at 0x602078
(): __data_start: global at 0x602068
(): data_start: weak at 0x602068
(): symbols: 70-byte global function at 0x401080
(): _IO_stdin_used: 4-byte global variable at 0x401150
(): __libc_csu_init: 101-byte global function at 0x4010d0
(): _start: global function at 0x400a57
(): __bss_start: global at 0x602078
(): main: 167-byte global function at 0x4009b0
(): _init: global function at 0x4008d8
(): stderr: 8-byte global variable at 0x602088
/lib/x86_64-linux-gnu/libdl.so.2 (libdl.so.2): unnamed local at 0x7fc652097000
/lib/x86_64-linux-gnu/libdl.so.2 (libdl.so.2): unnamed local at 0x7fc652097da0
/lib/x86_64-linux-gnu/libdl.so.2 (libdl.so.2): __asprintf: global function at 0x7fc652097000
/lib/x86_64-linux-gnu/libdl.so.2 (libdl.so.2): free: global function at 0x7fc652097000
...
/lib/x86_64-linux-gnu/libdl.so.2 (libdl.so.2): dlvsym: 118-byte weak function at 0x7fc6520981f0
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): unnamed local at 0x7fc651cd2000
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): unnamed local at 0x7fc651cf14a0
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): unnamed local at 0x7fc65208c740
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): _rtld_global: global variable at 0x7fc651cd2000
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): __libc_enable_secure: global variable at 0x7fc651cd2000
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): __tls_get_addr: global function at 0x7fc651cd2000
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): _rtld_global_ro: global variable at 0x7fc651cd2000
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): _dl_find_dso_for_object: global function at 0x7fc651cd2000
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): _dl_starting_up: weak at 0x7fc651cd2000
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): _dl_argv: global variable at 0x7fc651cd2000
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): putwchar: 292-byte global function at 0x7fc651d4a210
...
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): vwarn: 224-byte global function at 0x7fc651dc8ef0
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): wcpcpy: 39-byte weak function at 0x7fc651d75900
/lib64/ld-linux-x86-64.so.2 (ld-linux-x86-64.so.2): unnamed local at 0x7fc65229b000
/lib64/ld-linux-x86-64.so.2 (ld-linux-x86-64.so.2): unnamed local at 0x7fc65229bae0
/lib64/ld-linux-x86-64.so.2 (ld-linux-x86-64.so.2): _dl_get_tls_static_info: 21-byte global function at 0x7fc6522adaa0
/lib64/ld-linux-x86-64.so.2 (ld-linux-x86-64.so.2): GLIBC_PRIVATE: global variable at 0x7fc65229b000
/lib64/ld-linux-x86-64.so.2 (ld-linux-x86-64.so.2): GLIBC_2.3: global variable at 0x7fc65229b000
/lib64/ld-linux-x86-64.so.2 (ld-linux-x86-64.so.2): GLIBC_2.4: global variable at 0x7fc65229b000
/lib64/ld-linux-x86-64.so.2 (ld-linux-x86-64.so.2): free: 42-byte weak function at 0x7fc6522b2c40
...
/lib64/ld-linux-x86-64.so.2 (ld-linux-x86-64.so.2): malloc: 13-byte weak function at 0x7fc6522b2bf0
/lib64/ld-linux-x86-64.so.2 (ld-linux-x86-64.so.2): _dl_allocate_tls_init: 557-byte global function at 0x7fc6522adc00
/lib64/ld-linux-x86-64.so.2 (ld-linux-x86-64.so.2): _rtld_global_ro: 304-byte global variable at 0x7fc6524bdcc0
/lib64/ld-linux-x86-64.so.2 (ld-linux-x86-64.so.2): __libc_enable_secure: 4-byte global variable at 0x7fc6524bde68
/lib64/ld-linux-x86-64.so.2 (ld-linux-x86-64.so.2): _dl_rtld_di_serinfo: 1620-byte global function at 0x7fc6522a4710
I used ... to mark where I removed lots of lines.
Questions?
To get a list of exported symbols from a shared library (a .so) under Linux, there are two ways: the easy one and a slightly harder one.
The easy one is to use the console tools already available: objdump (included in GNU binutils):
$ objdump -T /usr/lib/libid3tag.so.0
00009c15 g DF .text 0000012e Base id3_tag_findframe
00003fac g DF .text 00000053 Base id3_ucs4_utf16duplicate
00008288 g DF .text 000001f2 Base id3_frame_new
00007b73 g DF .text 000003c5 Base id3_compat_fixup
...
The slightly harder way is to use libelf and write a C/C++ program to list the symbols yourself. Have a look at the elfutils package, which is also built from the libelf source. There is a program called eu-readelf (the elfutils version of readelf, not to be confused with the binutils readelf). eu-readelf -s $LIB lists exported symbols using libelf, so you should be able to use that as a starting point.