Ubuntu C program loadable module and undefined symbols - c

Very new to linux in general and trying to build a loadable module for use in zabbix, which works, but trying to build a simple shell program for testing it. That means this module needs to be loaded dynamically.
Sharable module SNMPmath is built with this:
gcc -shared -o SNMPmath.so $(CFLAGS) -I../../../include -I/usr/include/libxml2 -fPIC SNMPmath.c
That works fine for zabbix.
The test program (TestSO.c) uses
lib_handle = dlopen("./SNMPmath.so", RTLD_NOW);
to load this image dynamically, and when it does, it is missing symbols including init_snmp which come from the net-snmp package which is referenced in the SNMPmath loadable module.
My question is both general and specific. What's the right approach -- is this library called by the loadable module supposed to be forced into the loadable module? Is it supposed to be forced into the test program (despite having no compile-time reference to it)? Or is it supposed to be dynamically loaded, itself, in the loadable module (which seems to contradict what I'm seeing in other examples)?
And in either case, how is the GCC command modified to include net-snmp? I've tried variations of whole-archive, no-as-needed, listing what I think is the library (/usr/lib/x86_64-linux-gnu/libsnmp.a) with various compiler options with no effect (or occasionally errors). Also tried linking to the .so version of that (no effect). So a hint as to the proper GCC command to include the library would be very helpful.
Here's one iteration of attempts at linking the main program:
gcc -rdynamic -o TestSO -I../../../include -I/usr/include/libxml2 TestSO.c -ldl -Wl,--no-as-needed /usr/lib/x86_64-linux-gnu/libsnmp.so
I've found numerous examples of loading modules, but they all load a simple routine that does not itself have undefined symbols that need satisfying.
Recap:
TestSO.c
==> Loads with dlopen SNMPmath.c
==> needs to refer to net-snmp routines like init_snmp
Pointers to examples or explanation welcome, I realize I'm missing something fairly obvious.
EDIT AFTER FIRST COMMENTS TO INCLUDE:
I have it sort of working now, but would appreciate a sanity check if this is correct. I pruned it down so as to show the whole code. Here is the code to produce the SO:
#include <net-snmp/net-snmp-config.h>
#include <net-snmp/net-snmp-includes.h>
#include <string.h>
int zbx_module_SNMPmath_avg(int i, int j)
{
init_snmp("snmpapp"); // initialize SNMP library
return 1;
}
Here is how it is compiled (note slight name change):
CFLAGS=-I. `net-snmp-config --cflags`
SNMPmath_small: SNMPmath_small.c
gcc -shared -o SNMPmath.so $(CFLAGS) -I../../../include -I/usr/include/libxml2 -fPIC SNMPmath_small.c -Wl,--no-as-needed,/usr/lib/x86_64-linux-gnu/libsnmp.so
Then here is the main program:
#include <stdio.h>
#include <dlfcn.h>
#include <dlfcn.h>
#include <string.h>
int main(int argc, char **argv)
{
void *lib_handle;
int(*fn)(int req, int ret);
int x;
char *error;
lib_handle = dlopen("./SNMPmath.so", RTLD_NOW);
if (!lib_handle)
{
fprintf(stderr, "Error on open - %s\n", dlerror());
exit(1);
}
else
fprintf(stderr,"Successfully loaded module\n");
fn = dlsym(lib_handle, "zbx_module_SNMPmath_avg");
if ((error = dlerror()) != NULL)
{
fprintf(stderr, "Error on dlsym %s\n", error);
exit(1);
}
else fprintf(stderr,"Successfully called dlsym\n");
// testing
int req, ret;
req=1;
ret=1;
x=(*fn)(req, ret);
printf("Valx=%i\n",x);
dlclose(lib_handle);
return 0;
}
And finally this is built with:
TestSO: TestSO.c
gcc -rdynamic -o TestSO -I../../../include -I/usr/include/libxml2 TestSO.c -ldl
This now will run as expected. I found that linking against the netsnmp library's so file when building my so seemed to work.
But is this the correct sequence? And/or the preferred sequence?
Ps. Off to read the paper in the first proposed answer.

Pointer to explanation: Drepper's paper: Howto write shared libraries
But we need more source code (and the commands used to compile it), and the real error messages (e.g. as given by dlerror() after dlopen call) to help more.
(See also this answer)
You might want to add some libraries like -lsomething (I don't know which, you probably do know!) to your command gcc -shared which is building SNMPmath.so .... You don't want to link a static library like libsnmp.a to it (you should link shared libraries to your SNMPmath.so shared object).

Related

Make a C static library that only exposes the API

I have an old static library that includes some functions produced by a C auto-coder (Mathworks/Simulink). I have just made a new static library (that does totally different things from the old library) that uses a newer version of the auto-coder. I need to make an executable that links both these libraries. Unfortunately many of the helper functions generated by the auto-coder have the same name in the old and new libraries but their functionality has changed. These functions are not part of the API. Obviously when I try to link I get complaints that the same symbols appear in both libraries.
The old library is untouchable, but I can mess with the new one.
The libraries are built with
gcc -c *.c
ar -crs libwhatever.a *.o
and linked with
gcc main.o -L. -lold -lnew
My current solution is to look at all the symbol clashes given by gcc and rename the symbols in the new library using
objcopy --redefine-sym sym=sym_new libnew.a
This seems to work. My program links and behaves as expected. But I'm wondering if it is the right/best thing to do.
Ideally I'd like to make a libnew.a that only exposes the API functions and nothing else. That way I guarantee no symbol clashes when I try to link. Is this possible I cannot figure out how.
Please note that I have little previous experience in these matters. Also, if it's relevant/not obvious, I'm using Linux.
Edit
Below is a rather contrived minimal working example.
old/printOld.c:
#include "printOld.h"
#include <stdio.h>
void printOld()
{
printStr();
}
void printStr()
{
printf("Old\n");
}
old/printOld.h:
void printStr();
old/buildOld:
gcc -c printOld.c
ar -crs libold.a printOld.o
new/printNew.c:
#include "printNew.h"
#include <stdio.h>
void printNew()
{
printStr();
}
void printStr()
{
printf("New\n");
}
new/printNew.h:
void printStr();
new/buildNew:
gcc -c printNew.c
ar -crs libnew.a printNew.o
#objcopy --redefine-sym printStr=printStrNew libnew.a
main/main.c:
void printOld();
void printNew();
int main()
{
printOld();
printNew();
return 0;
}
main/buildMain:
gcc main.c -L../old -L../new -lold -lnew
The linker error when I leave objcopy commented out is:
/usr/bin/ld: ../new/libnew.a(printNew.o): in function `printStr':
printNew.c:(.text+0x11): multiple definition of `printStr'; ../old/libold.a(printOld.o):printOld.c:(.text+0x11): first defined here
collect2: error: ld returned 1 exit status
The above example has the same kind of structure as the auto-generated code. I know that I can fix my problem by declaring the functions in the header files as static, but I'd rather not have to mess with the auto-generated code. It is very large and very unpleasant.

When to actually use dlopen()? Does dlopen() means dynamic loading?

I have gone through below link, through which I understood how to create and use shared library.
https://www.cprogramming.com/tutorial/shared-libraries-linux-gcc.html
Step 1: Compiling with Position Independent Code
$ gcc -c -Wall -Werror -fpic foo.c
Step 2: Creating a shared library from an object file
$ gcc -shared -o libfoo.so foo.o
Step 3: Linking with a shared library
$ gcc -L/home/username/foo -Wall -o test main.c -lfoo
Step 4: Making the library available at runtime
$ export LD_LIBRARY_PATH=/home/username/foo:$LD_LIBRARY_PATH
$ ./test
This is a shared library test...
Hello, I am a shared library
However, I have couple of questions:
In the given link, there is no usage of dlopen(), which is required to open shared library. How this code is working without dlopen() calls?
When to actually use dlopen()?
Can I compile program without having .so file available (program has function calls to that shared library)?
Does the dlopen() means dynamic loading and example in above link(step 3) means static loading? If Yes, then in case of dynamic loading, is there any difference in linking step(step 3)?
Thanks in advance.
Dynamic Library
When we link an application against a shared library, the linker leaves some stubs (unresolved symbols) to be filled at application loading time. These stubs need to be filled by a tool called, dynamic linker at run time or at application loading time.
Loading of a shared library is of two types,
Dynamically linked libraries –
Here a program is linked with the shared library and the kernel loads the library (in case it’s not in memory) upon execution.
This is explained in mentioned link.
Dynamically loaded libraries –
Useful for creating a "plug-in" architecture.
As the name indicates, dynamic loading is about loading of library on demand and linked during execution.
The program takes full control by calling functions with the library.
This is done using dlopen(), dlsym(), dlclose().
The dlopen() function opens a library and prepares it for use.
With this system call it is possible to open a shared library and use the functions from it, without having to link with it. Your program just starts, and when it finds out that it needs to use a function from a specific library, it calls dlopen() to open that library. If the library is not available on the system, the function returns NULL and it is up to you, the programmer, to handle that. You can let the program gracefully die.
DL Example:
This example loads the math library and prints the cosine of 2.0, and it checks for errors at every step (recommended):
#include <stdlib.h>
#include <stdio.h>
#include <dlfcn.h>
int main(int argc, char **argv) {
void *handle;
double (*cosine)(double);
char *error;
handle = dlopen ("/lib/libm.so.6", RTLD_LAZY);
if (!handle) {
fputs (dlerror(), stderr);
exit(1);
}
cosine = dlsym(handle, "cos");
if ((error = dlerror()) != NULL) {
fputs(error, stderr);
exit(1);
}
printf ("%f\n", (*cosine)(2.0));
dlclose(handle);
}
Use -rdynamic while compiling DL source code.
Ex. gcc -rdynamic -o progdl progdl.c -ldl

GCC link against .so file without souce code

I am trying to compile compile a simple "hello world" program for an Axis A210 (cris architecture). I managed to get download GCC from the vendor, but it came with glibc, and the camera is running uClibc-0.9.27. I pulled the file /lib/libuClibc-0.9.27.so from the device.
I managed to compile this program that segfaults:
#include <unistd.h>
int main(int argc, char** argv)
{
*((unsigned int*)0) = 0xDEAD;
}
and this program that just hangs:
#include <unistd.h>
int main(int argc, char** argv)
{
int a = 0;
}
with cris-gcc -g -static -nostdlib -o compiled main.c.
Now I'd like to use the functions in libuClibc, but I can't seem to get the linking to work: I've tried
cris-gcc -g -static -nostdlib -o compiled main.c -luClibc-0.9.27 -L.
but that just gives:
./libuClibc-0.9.27.so: could not read symbols: Invalid operation
collect2: ld returned 1 exit status
Is there a way to link to this .so file or to otherwise get some standard functions like exit working?
regarding:
cris-gcc -g -static -nostdlib -o compiled main.c -luClibc-0.9.27 -L.
The linker works with libraries in the order they are encountered. So they must be listed in the order needed.
The linker needs to know where the library is located before knowing which library to examine. Suggest:
cris-gcc -g -static -nostdlib -o compiled main.c -L. -luClibc-0.9.27
However, a *.so library is NOT a static library. It is a dynamic library, so the option: -static should be removed However, that requires that the dynamic library be available at 'run time' if the related *.a (a static library) is available then it should be used in the compile/link statement.
Note: the function: exit() has its' prototype exposed via the stdlib.h header file, not the unistd.h header file.
regarding:
#include <unistd.h>
int main(int argc, char** argv)
{
*((unsigned int*)0) = 0xDEAD;
}
the parameters: argc and argv are not used, so the compiler will output two warning statements about 'unused parameters'. Suggest using the function signature: int main( void )
this code is trying to write to address 0. However, the application does not 'own' address 0, (an usually, such an address will be 'marked' as 'readonly' so the application will exit with a 'seg fault event')
it is poor programming practice to include header files those contents are not used. Suggest removing the statement: #include <unistd.h>
this statement: int a = 0; will result in the compiler outputting a warning message about a variable that is 'set' but never 'used'
regarding:
cris-gcc -g -static -nostdlib -o compiled main.c -L. -luClibc-0.9.27
When compiling, should always enable the warnings, then fix those warnings. Suggest:
cris-gcc -Wall -Wextra -Wconversion -pedantic -std=c99 -g -static -nostdlib -o compiled main.c -luClibc-0.9.27 -L.
Apart of all the problems noticed by #user3629249 in his answer (all of them are to be followed), the message:
./libuClibc-0.9.27.so: could not read symbols: Invalid operation
collect2: ld returned 1 exit status
means that the libuClibc-0.9.27.so binary has been stripped its symbols or you have not privileges to read the file, and so, the symbol table. The linker is unable to use that binary and it can only be loaded into memory. Anyway, you need a nonstripped shared object, and as suggested by #user3629249, don't use -static (by the reason stated in his answer), put the parameters in order (library dir before the library to be linked, also stated by him). Even you can link the shared by specifying it as:
cris-gcc -nostdlib -o compiled main.c libluClibc-0.9.27.so
and another thing: You need not only the standard C library to link an executable... you normally use a crt0.o at the beginning of your program with the C runtime and the start code for your program. You have not included that, and probably the compiler is getting it from another place.
One question: If you got the compiler, why do you intend to supply your own version of the standard library? isn't provided by the compiler? If you change the libc, then you must change also the crt0.o file. It defaults to some compiler provided, and you haven't received the message no definition for start.
Try to compile with just a main function, as you did, but don't specify shared libraries or directories... just the main code:
cris-gcc -o compiled main.c
and see what happens.... this will be very illustrative of what you lack in your system.

Shared library and rpath

I cannot make rpath work properly and make my binary to search for the library in the specified folder:
I have 3 very simple files:
main.c
#include <stdio.h>
#include <func.h>
int main() {
testing();
return 1;
}
func.h
void testing();
func.c
#include "func.h"
void testing(){
printf(testing\n");
}
Then I proceed to create a shared library as it follows:
gcc -c -fpic func.c -o ../release/func.o
gcc -shared -o ../release/lib/lib_func.so ../release/func.o
And then compile the program:
gcc main.c ../release/lib/lib_time_mgmt.so -Wl,-rpath=/home/root/ -o ../release/main
I receive the next warning:
main.c:7:2: warning: implicit declaration of function ‘testing’ [-Wimplicit-function-declaration]
testing();
But besides it, the program works fine.
However, my problem is that if now I want to move the library to /home/root (as specified in rpath) it does not work and the library is still searched only in the path specified when I compiled the main.c file which is ../release/lib/lib_time_mgmt.so
What am I doing wrong?
EDIT: After accepting the answer, I leave here the exact line as I used it and made it work for whoever might find it useful:
gcc main.c -L/home/root -Wl,-rpath,'/home/root/' -l:libtime_mgmt -o ${OUT_FILE}
Note: the rpath was used with the path betwen simple '. Not sure if that was the reason why it was not working before, but it worked this way now.
rpath is not used at compile time, but rather at link/runtime... thus you probably need to use both of these:
-L /home/root - to link correctly at build time
-Wl,-rpath=/home/root - to link correctly at run-time
You should use the -l ${lib} flag to link with libraries, don't specify their path as an input.
In addition to this, convention states that the libraries are named libNAME.so - e.g:
-l func will try to link with libfunc.so
-l time_mgmt will try to link with libtime_mgmt.so
Once you've addressed the above points, try the following:
gcc main.c -L/home/root -Wl,-rpath=/home/root -lfunc -ltime_mgmt -o ${OUT_FILE}
As a final point, I'd advise that you try not to use rpath, and instead focus on installing libraries in the correct places.
Unrelated to your question, but worth noting. Your use of #include <...> vs #include "..." is questionable. See: What is the difference between #include <filename> and #include "filename"?

gdb not loading symbols when running standalone shared library

I have a PIC shared library which also has a main function
#include <dtest2.h>
#include <stdio.h>
extern const char elf_interpreter[] __attribute__((section(".interp"))) = "/lib/ld-linux.so.2";
int dtestfunc1(int x,int y) {
int i=0;
int sum = 0;
for(i=0;i<=x;i++) {
sum+=y;
sum+=dtestfunc2(x,y);
}
return sum;
}
int main (int argc, char const* argv[])
{
printf("starting sharedlib main\n");
int val = dtestfunc1(4,5);
printf("val = %d\n",val);
_exit(0);
}
This library is linked to another shared library libdtest2 which has the implementation of dtestfunc2 called from dtestfunc1.
If i run gdb directly on dtestfunc1 using
gdb libdtest1.so
the symbols for libdtest2.so are not loaded by gdb. I cannot step into dtestfunc1 because of this and if i press s, the function just executes and gets out.
If i create a driver program which calls dlopen on the shared library, gdb loads the symbols properly after dlopen is executed and everything works fine.
Why is gdb behaving differently in both these cases?
How can i manually point gdb to the shared library if i am running gdb directly on the shared library?
Note: This is a toy example that mirrors my problem for a much larger shared library. All my binaries and libraries are compiled with the -ggdb3 flag.
Edit: My shared library is runnable. I added the proper interpreter path using the extern definition in the source code. I compiled it with gcc -ggdb3 -O0 -shared -Wl,-soname,libdtest1.so.1 -ldtest2 -L/usr/lib -Wl,-e,main -o libdtest1.so.1.0 dtest1.o.
I can run it and it executes perfectly.Running the shared library is not the issue here.
Why is gdb behaving differently in both these cases?
Because you didn't build libdtest1.so correctly for GDB to work with it.
In particular, your libdtest1.so is lacking DT_DEBUG entry in its dynamic section, which you can confirm like so:
readelf -d libdtest1.so | grep DEBUG
You should see nothing. In a correctly built runnable libdtest1.so (which you can build using -pie flag), the output should look like this:
0x00000015 (DEBUG) 0x0
The runtime loader updates DT_DEBUG to point to its r_debug structure, which then allows GDB to find other loaded shared libraries. Without DT_DEBUG, GDB can't find them.
Update:
After adding the pie flag my build command is gcc -ggdb3 -O0 -pie -shared -Wl,-soname,libdtest1.so.1 -ldtest2 -L/usr/lib -Wl,-e,main -o libdtest1.so.1.0 dtest1.c -I. The DEBUG section is still missing
Terminology: it's not a DEBUG section. It's a DT_DEBUG entry in the .dynamic section.
It is still missing because -shared overrides -pie. Remove -shared from the link line.
You would also not need -Wl,-e,main, nor would you need to specify the .interp -- GCC will do that for you.
The correct link command:
gcc -ggdb3 -O0 -pie -rdynamic dtest1.c -I. -Wl,-soname,libdtest1.so.1 \
-L/usr/lib -ldtest2 -o libdtest1.so.1.0
(Order of sources and libraries on the link line matters, yours is wrong.)
Added bonus: your main will receive correct argc and argv[], instead of bogus values you are getting now.
the easy fix.... recompile the library with (for gcc) the parameter '-ggdb' Then it will have all the symbols available to gdb.

Resources