Related
I want to use an "initcall" system on an ARM CPU in C.
I define these macro :
typedef int (*payload_initcall_t)(payload_state_t *state);
extern payload_initcall_t __payload_initcall_start, __payload_initcall_end;
#define __payload_initcall(fn) \
static payload_initcall_t __payload_initcall_##fn __payload_init_call = fn
#define __payload_init_call __attribute__ ((unused,__section__ (".payload_init_ptrs")))
#define payload_init(x) __payload_initcall(x);
#define __init __attribute__ ((__section__ (".code_segment")))
An example of module :
static int __init inspec_init(payload_state_t *state) {
payload_log(LOG_INFO, "I AM THE PAYLOAD 1 !!");
return 0;
}
payload_init(inspec_init)
The function which calls the modules :
payload_initcall_t *call_p;
call_p = &__payload_initcall_start;
do {
payload_log(LOG_DEBUG, "payload init fct call: %p", (void *)call_p);
(*call_p)(state);
++call_p;
} while (call_p < &__payload_initcall_end);
In the linker file (just after .text section :
__payload_initcall_start = .;
payload_init_ptrs : { *(.payload_init_ptrs) }
__payload_initcall_end = .;
code_segment : { *(.code_segment) }
In my x86_64 computer it works.
I cross compile this code with a buildroot toolchain (GCC 7.4.0). The compilation terminates with no error but when I run the binary on the ARM target it segfault at the "(*call_p)(state)" line.
When I run objdump on the x86_64 binary I find "code_segment" and "payload_init_ptrs" but in the ARM binary I can't find these sections.
Is my GCC outdated ?
Thanks in advance.
I have a binary file (ELF) that I don't write, but I want to use 1 function from this binary (I know the address/offset of the function), that function not exported from the binary.
My goal is to call this function from my C code that I write and compile this function statically in my binary (I compile with gcc).
How can I do that please?
I am going to answer the
call to this function from my c code that I write
part.
The below works under certain assumptions, like dynamic linking and position independent code. I haven't thought for too long about what happens if they are broken (let's experiment/discuss, if there's interest).
$ cat lib.c
int data = 42;
static int foo () { return data; }
gcc -fpic -shared lib.c -o lib.so
$ nm lib.so | grep foo
00000000000010e9 t foo
The above reproduces having the address that you know. The address we know now is 0x10e9. It is the virtual address of foo before relocation. We'll model the relocation the dynamic loader does by hand by simply adding the base address at which lib.so gets loaded.
$ cat 1.c
#define _GNU_SOURCE
#include <stdio.h>
#include <link.h>
#include <string.h>
#include <elf.h>
#define FOO_VADDR 0x10e9
typedef int(*func_t)();
int callback(struct dl_phdr_info *info, size_t size, void *data)
{
if (!(strstr(info->dlpi_name, "lib.so")))
return 0;
Elf64_Addr addr = info->dlpi_addr + FOO_VADDR;
func_t f = (func_t)addr;
int res = f();
printf("res = %d\n", res);
return 0;
}
int main()
{
void *handle = dlopen("./lib.so", RTLD_LAZY);
if (!handle) {
puts("failed to load");
return 1;
}
dl_iterate_phdr(&callback, NULL);
dlclose(handle);
return 0;
}
And now...
$ gcc 1.c -ldl && ./a.out
res = 42
Voila -- it worked! That was fun.
Credit: this was helpful.
If you have questions, feel free to read the man and ask in the comments.
As for
compile this function statically in my binary
I don't know off the bat. This would be trickier. Why do you want that? Also, do you know whether the function depends on some data (or maybe it calls other functions) in the original ELF file, like in the example above?
I built a shell that tries to make the tab (\t) key do something custom using rl_bind_key(), but it didn't work in macOS Sierra, but it works on Ubuntu, Fedora, and CentOS. Here's the mcve:
#include <stdlib.h>
#include <stdio.h>
#include <readline/readline.h>
static int cmd_complete(int count, int key)
{
printf("\nCustom tab action goes here...\n");
rl_forced_update_display();
return 0;
}
char *interactive_input()
{
char *buffer = readline(" > ");
return buffer;
}
int main(int argc, char **argv)
{
rl_bind_key('\t', cmd_complete); // this doesn't seem to work in macOS
char *buffer = 0;
while (!buffer || strncmp(buffer, "exit", 4)) {
if (buffer) { free(buffer); buffer=0; }
// get command
buffer = interactive_input();
printf("awesome command: %s\n", buffer);
}
free(buffer);
return 0;
}
I compile using Clang like this:
$ cc -lreadline cli.c -o cli
What is the cause of this behavior and how do I fix it?
I was using the flag -lreadline, however unbeknownst to me, Clang appears to secretly use libedit (I've seen it called editline also). In libedit, for some reason (which merits another question), rl_bind_key appears to not work with anything except rl_insert.
So one solution that I found is to use Homebrew to install GNU Readline (brew install readline), and then to ensure I use that version, I compile thusly:
$ cc -lreadline cli.c -o cli -L/usr/local/opt/readline/lib -I/usr/local/opt/readline/include
In fact, when you install readline, it will tell you this at the end of the installation or if you do brew info readline:
gns-mac1:~ gns$ brew info readline
readline: stable 7.0.3 (bottled) [keg-only]
Library for command-line editing
https://tiswww.case.edu/php/chet/readline/rltop.html
/usr/local/Cellar/readline/7.0.3_1 (46 files, 1.5MB)
Poured from bottle on 2017-10-24 at 12:21:35
From: https://github.com/Homebrew/homebrew-core/blob/master/Formula/readline.rb
==> Caveats
This formula is keg-only, which means it was not symlinked into /usr/local,
because macOS provides the BSD libedit library, which shadows libreadline.
In order to prevent conflicts when programs look for libreadline we are
defaulting this GNU Readline installation to keg-only..
For compilers to find this software you may need to set:
LDFLAGS: -L/usr/local/opt/readline/lib
CPPFLAGS: -I/usr/local/opt/readline/include
Source of libedit rl_bind_key
So this is why it doesn't work in libedit. I downloaded the source and this is how the rl_bind_key function is defined:
/*
* bind key c to readline-type function func
*/
int
rl_bind_key(int c, rl_command_func_t *func)
{
int retval = -1;
if (h == NULL || e == NULL)
rl_initialize();
if (func == rl_insert) {
/* XXX notice there is no range checking of ``c'' */
e->el_map.key[c] = ED_INSERT;
retval = 0;
}
return retval;
}
So it seems designed to not work with anything except rl_insert. That seems like a bug, not a feature. I wish I knew how to become a contributor to libedit.
I am able to get a list of exported function names and pointers from an executable in windows by using using the PIMAGE_DOS_HEADER API (example).
What is the equivalent API for Linux?
For context I am creating unit test executables and I am exporting functions starting with the name "test_" and I want the executable to just spin through and execute all of the test functions when run.
Example psuedo code:
int main(int argc, char** argv)
{
auto run = new_trun();
auto module = dlopen(NULL);
auto exports = get_exports(module); // <- how do I do this on unix?
for( auto i = 0; i < exports->length; i++)
{
auto export = exports[i];
if(strncmp("test_", export->name, strlen("test_")) == 0)
{
tcase_add(run, export->name, export->func);
}
}
return trun_run(run);
}
EDIT:
I was able to find what I was after using the top answer from this question:
List all the functions/symbols on the fly in C?
Additionally I had to use the gnu_hashtab_symbol_count function from Nominal Animal's answer below to handle the DT_GNU_HASH instead of the DT_HASH.
My final test main function looks like this:
int main(int argc, char** argv)
{
vector<string> symbols;
dl_iterate_phdr(retrieve_symbolnames, &symbols);
TRun run;
auto handle = dlopen(NULL, RTLD_LOCAL | RTLD_LAZY);
for(auto i = symbols.begin(); i != symbols.end(); i++)
{
auto name = *i;
auto func = (testfunc)dlsym(handle, name.c_str());
TCase tcase;
tcase.name = string(name);
tcase.func = func;
run.test_cases.push_back(tcase);
}
return trun_run(&run);
}
Which I then define tests in the assembly like:
// test.h
#define START_TEST(name) extern "C" EXPORT TResult test_##name () {
#define END_TEST return tresult_success(); }
// foo.cc
START_TEST(foo_bar)
{
assert_pending();
}
END_TEST
Which produces output that looks like this:
test_foo_bar: pending
1 pending
0 succeeded
1 total
I do get quite annoyed when I see questions asking how to do something in operating system X that you do in Y.
In most cases, it is not an useful approach, because each operating system (family) tends to have their own approach to issues, so trying to apply something that works in X in Y is like stuffing a cube into a round hole.
Please note: the text here is intended as harsh, not condesceding; my command of the English language is not as good as I'd like. Harshness combined with actual help and pointers to known working solutions seems to work best in overcoming nontechnical limitations, in my experience.
In Linux, a test environment should use something like
LC_ALL=C LANG=C readelf -s FILE
to list all the symbols in FILE. readelf is part of the binutils package, and is installed if you intend to build new binaries on the system. This leads to portable, robust code. Do not forget that Linux encompasses multiple hardware architectures that do have real differences.
To build binaries in Linux, you normally use some of the tools provided in binutils. If binutils provided a library, or there was an ELF library based on the code used in binutils, it would be much better to use that, rather than parse the output of the human utilities. However, there is no such library (the libbfd library binutils uses internally is not ELF-specific). The [URL=http://www.mr511.de/software/english.html]libelf[/URL] library is good, but it is completely separate work by chiefly a single author. Bugs in it have been reported to binutils, which is unproductive, as the two are not related. Simply put, there are no guarantees that it handles the ELF files on a given architecture the same way binutils does. Therefore, for robustness and reliability, you'll definitely want to use binutils.
If you have a test application, it should use a script, say /usr/lib/yourapp/list-test-functions, to list the test-related functions:
#!/bin/bash
export LC_ALL=C LANG=C
for file in "$#" ; do
readelf -s "$file" | while read num value size type bind vix index name dummy ; do
[ "$type" = "FUNC" ] || continue
[ "$bind" = "GLOBAL" ] || continue
[ "$num" = "$[$num]" ] || continue
[ "$index" = "$[$index]" ] || continue
case "$name" in
test_*) printf '%s\n' "$name"
;;
esac
done
done
This way, if there is an architecture that has quirks (in the binutils' readelf output format in particular), you only need to modify the script. Modifying such a simple script is not difficult, and it is easy to verify the script works correctly -- just compare the raw readelf output to the script output; anybody can do that.
A subroutine that constructs a pipe, fork()s a child process, executes the script in the child process, and uses e.g. getline() in the parent process to read the list of names, is quite simple and extremely robust. Since this is also the one fragile spot, we've made it very easy to fix any quirks or problems here by using that external script (that is customizable/extensible to cover those quirks, and easy to debug).
Remember, if binutils itself has bugs (other than output formatting bugs), any binaries built will almost certainly exhibit those same bugs also.
Being a Microsoft-oriented person, you probably will have trouble grasping the benefits of such a modular approach. (It is not specific to Microsoft, but specific to a single-vendor controlled ecosystem where the vendor-pushed approach is via overarching frameworks, and black boxes with clean but very limited interfaces. I think it as the framework limitation, or vendor-enforced walled garden, or prison garden. Looks good, but getting out is difficult. For description and history on the modular approach I'm trying to describe, see for example the Unix philosophy article at Wikipedia.)
The following shows that your approach is indeed possible in Linux, too -- although clunky and fragile; this stuff is intended to be done using the standard tools instead. It's just not the right approach in general.
The interface, symbols.h, is easiest to implement using a callback function that gets called for each symbol found:
#ifndef SYMBOLS_H
#ifndef _GNU_SOURCE
#error You must define _GNU_SOURCE!
#endif
#define SYMBOLS_H
#include <stdlib.h>
typedef enum {
LOCAL_SYMBOL = 1,
GLOBAL_SYMBOL = 2,
WEAK_SYMBOL = 3,
} symbol_bind;
typedef enum {
FUNC_SYMBOL = 4,
OBJECT_SYMBOL = 5,
COMMON_SYMBOL = 6,
THREAD_SYMBOL = 7,
} symbol_type;
int symbols(int (*callback)(const char *libpath, const char *libname, const char *objname,
const void *addr, const size_t size,
const symbol_bind binding, const symbol_type type,
void *custom),
void *custom);
#endif /* SYMBOLS_H */
The ELF symbol binding and type macros are word-size specific, so to avoid the hassle, I declared the enum types above. I omitted some uninteresting types (STT_NOTYPE, STT_SECTION, STT_FILE), however.
The implementation, symbols.c:
#define _GNU_SOURCE
#include <stdlib.h>
#include <limits.h>
#include <string.h>
#include <stdio.h>
#include <fnmatch.h>
#include <dlfcn.h>
#include <link.h>
#include <errno.h>
#include "symbols.h"
#define UINTS_PER_WORD (__WORDSIZE / (CHAR_BIT * sizeof (unsigned int)))
static ElfW(Word) gnu_hashtab_symbol_count(const unsigned int *const table)
{
const unsigned int *const bucket = table + 4 + table[2] * (unsigned int)(UINTS_PER_WORD);
unsigned int b = table[0];
unsigned int max = 0U;
while (b-->0U)
if (bucket[b] > max)
max = bucket[b];
return (ElfW(Word))max;
}
static symbol_bind elf_symbol_binding(const unsigned char st_info)
{
#if __WORDSIZE == 32
switch (ELF32_ST_BIND(st_info)) {
#elif __WORDSIZE == 64
switch (ELF64_ST_BIND(st_info)) {
#else
switch (ELF_ST_BIND(st_info)) {
#endif
case STB_LOCAL: return LOCAL_SYMBOL;
case STB_GLOBAL: return GLOBAL_SYMBOL;
case STB_WEAK: return WEAK_SYMBOL;
default: return 0;
}
}
static symbol_type elf_symbol_type(const unsigned char st_info)
{
#if __WORDSIZE == 32
switch (ELF32_ST_TYPE(st_info)) {
#elif __WORDSIZE == 64
switch (ELF64_ST_TYPE(st_info)) {
#else
switch (ELF_ST_TYPE(st_info)) {
#endif
case STT_OBJECT: return OBJECT_SYMBOL;
case STT_FUNC: return FUNC_SYMBOL;
case STT_COMMON: return COMMON_SYMBOL;
case STT_TLS: return THREAD_SYMBOL;
default: return 0;
}
}
static void *dynamic_pointer(const ElfW(Addr) addr,
const ElfW(Addr) base, const ElfW(Phdr) *const header, const ElfW(Half) headers)
{
if (addr) {
ElfW(Half) h;
for (h = 0; h < headers; h++)
if (header[h].p_type == PT_LOAD)
if (addr >= base + header[h].p_vaddr &&
addr < base + header[h].p_vaddr + header[h].p_memsz)
return (void *)addr;
}
return NULL;
}
struct phdr_iterator_data {
int (*callback)(const char *libpath, const char *libname,
const char *objname, const void *addr, const size_t size,
const symbol_bind binding, const symbol_type type,
void *custom);
void *custom;
};
static int iterate_phdr(struct dl_phdr_info *info, size_t size, void *dataref)
{
struct phdr_iterator_data *const data = dataref;
const ElfW(Addr) base = info->dlpi_addr;
const ElfW(Phdr) *const header = info->dlpi_phdr;
const ElfW(Half) headers = info->dlpi_phnum;
const char *libpath, *libname;
ElfW(Half) h;
if (!data->callback)
return 0;
if (info->dlpi_name && info->dlpi_name[0])
libpath = info->dlpi_name;
else
libpath = "";
libname = strrchr(libpath, '/');
if (libname && libname[0] == '/' && libname[1])
libname++;
else
libname = libpath;
for (h = 0; h < headers; h++)
if (header[h].p_type == PT_DYNAMIC) {
const ElfW(Dyn) *entry = (const ElfW(Dyn) *)(base + header[h].p_vaddr);
const ElfW(Word) *hashtab;
const ElfW(Sym) *symtab = NULL;
const char *strtab = NULL;
ElfW(Word) symbol_count = 0;
for (; entry->d_tag != DT_NULL; entry++)
switch (entry->d_tag) {
case DT_HASH:
hashtab = dynamic_pointer(entry->d_un.d_ptr, base, header, headers);
if (hashtab)
symbol_count = hashtab[1];
break;
case DT_GNU_HASH:
hashtab = dynamic_pointer(entry->d_un.d_ptr, base, header, headers);
if (hashtab) {
ElfW(Word) count = gnu_hashtab_symbol_count(hashtab);
if (count > symbol_count)
symbol_count = count;
}
break;
case DT_STRTAB:
strtab = dynamic_pointer(entry->d_un.d_ptr, base, header, headers);
break;
case DT_SYMTAB:
symtab = dynamic_pointer(entry->d_un.d_ptr, base, header, headers);
break;
}
if (symtab && strtab && symbol_count > 0) {
ElfW(Word) s;
for (s = 0; s < symbol_count; s++) {
const char *name;
void *const ptr = dynamic_pointer(base + symtab[s].st_value, base, header, headers);
symbol_bind bind;
symbol_type type;
int result;
if (!ptr)
continue;
type = elf_symbol_type(symtab[s].st_info);
bind = elf_symbol_binding(symtab[s].st_info);
if (symtab[s].st_name)
name = strtab + symtab[s].st_name;
else
name = "";
result = data->callback(libpath, libname, name, ptr, symtab[s].st_size, bind, type, data->custom);
if (result)
return result;
}
}
}
return 0;
}
int symbols(int (*callback)(const char *libpath, const char *libname, const char *objname,
const void *addr, const size_t size,
const symbol_bind binding, const symbol_type type,
void *custom),
void *custom)
{
struct phdr_iterator_data data;
if (!callback)
return errno = EINVAL;
data.callback = callback;
data.custom = custom;
return errno = dl_iterate_phdr(iterate_phdr, &data);
}
When compiling the above, remember to link against the dl library.
You may find the gnu_hashtab_symbol_count() function above interesting; the format of the table is not well documented anywhere that I can find. This is tested to work on both i386 and x86-64 architectures, but it should be vetted against the GNU sources before relying on it in production code. Again, the better option is to just use those tools directly via a helper script, as they will be installed on any development machine.
Technically, a DT_GNU_HASH table tells us the first dynamic symbol, and the highest index in any hash bucket tells us the last dynamic symbol, but since the entries in the DT_SYMTAB symbol table always begin at 0 (actually, the 0 entry is "none"), I only consider the upper limit.
To match library and function names, I recommend using strncmp() for a prefix match for libraries (match at the start of the library name, up to the first .). Of course, you can use fnmatch() if you prefer glob patterns, or regcomp()+regexec() if you prefer regular expressions (they are built-in to the GNU C library, no external libraries are needed).
Here is an example program, example.c, that just prints out all the symbols:
#define _GNU_SOURCE
#include <stdlib.h>
#include <stdio.h>
#include <dlfcn.h>
#include <errno.h>
#include "symbols.h"
static int my_func(const char *libpath, const char *libname, const char *objname,
const void *addr, const size_t size,
const symbol_bind binding, const symbol_type type,
void *custom __attribute__((unused)))
{
printf("%s (%s):", libpath, libname);
if (*objname)
printf(" %s:", objname);
else
printf(" unnamed");
if (size > 0)
printf(" %zu-byte", size);
if (binding == LOCAL_SYMBOL)
printf(" local");
else
if (binding == GLOBAL_SYMBOL)
printf(" global");
else
if (binding == WEAK_SYMBOL)
printf(" weak");
if (type == FUNC_SYMBOL)
printf(" function");
else
if (type == OBJECT_SYMBOL || type == COMMON_SYMBOL)
printf(" variable");
else
if (type == THREAD_SYMBOL)
printf(" thread-local variable");
printf(" at %p\n", addr);
fflush(stdout);
return 0;
}
int main(int argc, char *argv[])
{
int arg;
for (arg = 1; arg < argc; arg++) {
void *handle = dlopen(argv[arg], RTLD_NOW);
if (!handle) {
fprintf(stderr, "%s: %s.\n", argv[arg], dlerror());
return EXIT_FAILURE;
}
fprintf(stderr, "%s: Loaded.\n", argv[arg]);
}
fflush(stderr);
if (symbols(my_func, NULL))
return EXIT_FAILURE;
return EXIT_SUCCESS;
}
To compile and run the above, use for example
gcc -Wall -O2 -c symbols.c
gcc -Wall -O2 -c example.c
gcc -Wall -O2 example.o symbols.o -ldl -o example
./example | less
To see the symbols in the program itself, use the -rdynamic flag at link time to add all symbols to the dynamic symbol table:
gcc -Wall -O2 -c symbols.c
gcc -Wall -O2 -c example.c
gcc -Wall -O2 -rdynamic example.o symbols.o -ldl -o example
./example | less
On my system, the latter prints out
(): stdout: 8-byte global variable at 0x602080
(): _edata: global at 0x602078
(): __data_start: global at 0x602068
(): data_start: weak at 0x602068
(): symbols: 70-byte global function at 0x401080
(): _IO_stdin_used: 4-byte global variable at 0x401150
(): __libc_csu_init: 101-byte global function at 0x4010d0
(): _start: global function at 0x400a57
(): __bss_start: global at 0x602078
(): main: 167-byte global function at 0x4009b0
(): _init: global function at 0x4008d8
(): stderr: 8-byte global variable at 0x602088
/lib/x86_64-linux-gnu/libdl.so.2 (libdl.so.2): unnamed local at 0x7fc652097000
/lib/x86_64-linux-gnu/libdl.so.2 (libdl.so.2): unnamed local at 0x7fc652097da0
/lib/x86_64-linux-gnu/libdl.so.2 (libdl.so.2): __asprintf: global function at 0x7fc652097000
/lib/x86_64-linux-gnu/libdl.so.2 (libdl.so.2): free: global function at 0x7fc652097000
...
/lib/x86_64-linux-gnu/libdl.so.2 (libdl.so.2): dlvsym: 118-byte weak function at 0x7fc6520981f0
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): unnamed local at 0x7fc651cd2000
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): unnamed local at 0x7fc651cf14a0
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): unnamed local at 0x7fc65208c740
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): _rtld_global: global variable at 0x7fc651cd2000
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): __libc_enable_secure: global variable at 0x7fc651cd2000
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): __tls_get_addr: global function at 0x7fc651cd2000
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): _rtld_global_ro: global variable at 0x7fc651cd2000
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): _dl_find_dso_for_object: global function at 0x7fc651cd2000
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): _dl_starting_up: weak at 0x7fc651cd2000
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): _dl_argv: global variable at 0x7fc651cd2000
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): putwchar: 292-byte global function at 0x7fc651d4a210
...
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): vwarn: 224-byte global function at 0x7fc651dc8ef0
/lib/x86_64-linux-gnu/libc.so.6 (libc.so.6): wcpcpy: 39-byte weak function at 0x7fc651d75900
/lib64/ld-linux-x86-64.so.2 (ld-linux-x86-64.so.2): unnamed local at 0x7fc65229b000
/lib64/ld-linux-x86-64.so.2 (ld-linux-x86-64.so.2): unnamed local at 0x7fc65229bae0
/lib64/ld-linux-x86-64.so.2 (ld-linux-x86-64.so.2): _dl_get_tls_static_info: 21-byte global function at 0x7fc6522adaa0
/lib64/ld-linux-x86-64.so.2 (ld-linux-x86-64.so.2): GLIBC_PRIVATE: global variable at 0x7fc65229b000
/lib64/ld-linux-x86-64.so.2 (ld-linux-x86-64.so.2): GLIBC_2.3: global variable at 0x7fc65229b000
/lib64/ld-linux-x86-64.so.2 (ld-linux-x86-64.so.2): GLIBC_2.4: global variable at 0x7fc65229b000
/lib64/ld-linux-x86-64.so.2 (ld-linux-x86-64.so.2): free: 42-byte weak function at 0x7fc6522b2c40
...
/lib64/ld-linux-x86-64.so.2 (ld-linux-x86-64.so.2): malloc: 13-byte weak function at 0x7fc6522b2bf0
/lib64/ld-linux-x86-64.so.2 (ld-linux-x86-64.so.2): _dl_allocate_tls_init: 557-byte global function at 0x7fc6522adc00
/lib64/ld-linux-x86-64.so.2 (ld-linux-x86-64.so.2): _rtld_global_ro: 304-byte global variable at 0x7fc6524bdcc0
/lib64/ld-linux-x86-64.so.2 (ld-linux-x86-64.so.2): __libc_enable_secure: 4-byte global variable at 0x7fc6524bde68
/lib64/ld-linux-x86-64.so.2 (ld-linux-x86-64.so.2): _dl_rtld_di_serinfo: 1620-byte global function at 0x7fc6522a4710
I used ... to mark where I removed lots of lines.
Questions?
To get a list of exported symbols from a shared library (a .so) under Linux, there are two ways: the easy one and a slightly harder one.
The easy one is to use the console tools already available: objdump (included in GNU binutils):
$ objdump -T /usr/lib/libid3tag.so.0
00009c15 g DF .text 0000012e Base id3_tag_findframe
00003fac g DF .text 00000053 Base id3_ucs4_utf16duplicate
00008288 g DF .text 000001f2 Base id3_frame_new
00007b73 g DF .text 000003c5 Base id3_compat_fixup
...
The slightly harder way is to use libelf and write a C/C++ program to list the symbols yourself. Have a look at the elfutils package, which is also built from the libelf source. There is a program called eu-readelf (the elfutils version of readelf, not to be confused with the binutils readelf). eu-readelf -s $LIB lists exported symbols using libelf, so you should be able to use that as a starting point.
From the documentation: http://luajit.org/running.html
luajit -b test.lua test.obj # Generate object file
# Link test.obj with your application and load it with require("test")
But doesn't explain how to do these things. I guess they're assuming anyone using Lua is also a C programmer, not the case with me! Can I get some help? GCC as an example.
I would also like to do the same thing except from the C byte array header. I can't find documentation on this either.
luajit -bt h -n test test.lua test.h
This creates the header file but I don't know how to run it from C. Thanks.
main.lua
print("Hello from main.lua")
app.c
#include <stdio.h>
#include "lua.h"
#include "lauxlib.h"
#include "lualib.h"
int main(int argc, char **argv)
{
int status;
lua_State *L = luaL_newstate();
luaL_openlibs(L);
lua_getglobal(L, "require");
lua_pushliteral(L, "main");
status = lua_pcall(L, 1, 0, 0);
if (status) {
fprintf(stderr, "Error: %s\n", lua_tostring(L, -1));
return 1;
}
return 0;
}
Shell commands:
luajit -b main.lua main.o
gcc -O2 -Wall -Wl,-E -o app app.c main.o -Ixx -Lxx -lluajit-5.1 -lm -ldl
Replace -Ixx and -Lxx by the LuaJIT include and library directories. If you've installed it in /usr/local (the default), then most GCC installations will find it without these two options.
The first command compiles the Lua source code to bytecode and embeds it into the object file main.o.
The second command compiles and links the minimal C application code. Note that it links in the embedded bytecode, too. The -Wl,-E is mandatory (on Linux) to export all symbols from the executable.
Now move the original main.lua away (to ensure it's really running the embedded bytecode and not the Lua source code file) and then run your app:
mv main.lua main.lua.orig
./app
# Output: Hello from main.lua
The basic usage is as follows:
Generate the header file using luajit
#include that header in the source file(s) that's going to be referencing its symbols
Compile the source into a runnable executable or shared binary module for lua depending on your use-case.
Here's a minimal example to illustrate:
test.lua
return
{
fooprint = function (s) return print("from foo: "..s) end,
barprint = function (s) return print("from bar: "..s) end
}
test.h
// luajit -b test.lua test.h
#define luaJIT_BC_test_SIZE 155
static const char luaJIT_BC_test[] = {
27,76,74,1,2,44,0,1,4,0,2,0,5,52,1,0,0,37,2,1,0,16,3,0,0,36,2,3,2,64,1,2,0,15,
102,114,111,109,32,102,111,111,58,32,10,112,114,105,110,116,44,0,1,4,0,2,0,5,
52,1,0,0,37,2,1,0,16,3,0,0,36,2,3,2,64,1,2,0,15,102,114,111,109,32,98,97,114,
58,32,10,112,114,105,110,116,58,3,0,2,0,5,0,7,51,0,1,0,49,1,0,0,58,1,2,0,49,1,
3,0,58,1,4,0,48,0,0,128,72,0,2,0,13,98,97,114,112,114,105,110,116,0,13,102,
111,111,112,114,105,110,116,1,0,0,0,0
};
runtest.cpp
// g++ -Wall -pedantic -g runtest.cpp -o runtest.exe -llua51
#include <stdio.h>
#include <assert.h>
#include "lua.hpp"
#include "test.h"
static const char *runtest =
"test = require 'test'\n"
"test.fooprint('it works!')\n"
"test.barprint('it works!')\n";
int main()
{
lua_State *L = luaL_newstate();
luaL_openlibs(L);
lua_getglobal(L, "package");
lua_getfield(L, -1, "preload");
// package, preload, luaJIT_BC_test
bool err = luaL_loadbuffer(L, luaJIT_BC_test, luaJIT_BC_test_SIZE, NULL);
assert(!err);
// package.preload.test = luaJIT_BC_test
lua_setfield(L, -2, "test");
// check that 'test' lib is now available; run the embedded test script
lua_settop(L, 0);
err = luaL_dostring(L, runtest);
assert(!err);
lua_close(L);
}
This is pretty straight-forward. This example takes the byte-code and places it into the package.preload table for this program's lua environment. Other lua scripts can then use this by doing require 'test'. The embedded lua source in runtest does exactly this and outputs:
from foo: it works!
from bar: it works!