Why am I getting a segfault when loading objects from a shared library? - c

Given these files:
plusone.c
int op(int i){ return i+1; }
main.c
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>
int main(int argc, char **argv){
if (argc<3){
printf("usage %s <library> <number>\n",argv[0]);
exit(1);
}
char *lname = argv[1];
int num = atoi(argv[2]);
void *handle = dlopen(lname, RTLD_LAZY);
if(!handle)
perror("dlopen");
int (*opp)(int);
opp=dlsym(handle, "op");
if(!opp)
perror("dlsym");
printf("number before:%i\nnumber after:%i\n",num,opp(num));
dlclose(handle);
}
Compiled as:
$cc -fPIC -shared -o plusone.so -ldl plusone.c
$cc -o main.exe -ldl -Wpedantic main.c
warning: ISO C forbids assignment between function pointer and ‘void *’ [-Wpedantic]
$ls
main.c main.exe plusone.so main.exe
$main.exe
usage main.exe <library> <number>
$main plusone.so 1
dlopen: Success
dlsym: Success
Segmentation fault
Why is segfault?
As the shell output shows, both dlopen and dlsym report success (but they should not print anything at all: a message means the condition was true, so the value returned by the function was NULL). Despite the "Success" that perror reports, I cannot work out the cause of the segfault, since I do not know where the bug is.

Why is segfault?
Most likely because opp is NULL at the moment opp(num) is called.
You do not handle errors correctly for the calls to dlopen() and dlsym(): although the code tests the results, it does not take the correct actions when those two functions fail.
This code
void *handle = dlopen(lname, RTLD_LAZY);
if(!handle)
perror("dlopen");
correctly branches on dlopen() returning NULL, which indicates an error, but then the code takes the wrong action.
dlopen() does not set errno, so using perror() to log an error makes no sense, as perror() relies on errno indicating an error, which it does not. So on failure of dlopen() you see perror() printing
dlopen: Success
which is misleading and contradicts the fact that perror() was called at all, which only happens if dlopen() returned NULL, indicating a failure. Had dlopen() succeeded, perror() would not have been called and nothing would have been printed.
The same mistake appears with the call to dlsym().
To retrieve error info on failure of a member of the dl*() family of functions use dlerror().
For an example on how to correctly and completely implement error handling see below:
void *handle = dlopen(...);
if (!handle)
{
fprintf(stderr, "dlopen: %s\n", dlerror());
exit(EXIT_FAILURE); /* Or do whatever else avoids using the value of handle. */
}
#ifdef DEBUG
else
{
fputs("dlopen: Success\n", stderr);
}
#endif
The same approach should be taken to handle the outcome of dlsym().
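As a minimal sketch of that approach applied to both calls, the following assumes a hypothetical helper named open_and_find() (it is not part of the dl API). It reports failures through dlerror() and never hands back a symbol from a handle that failed to open. On some systems the program has to be linked with -ldl.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: opens the library at `path` and looks up `name`.
   On success it returns the symbol address and stores the handle in
   *handle_out, so the caller can dlclose() it later.
   On any failure it prints the dlerror() text and returns NULL with
   *handle_out set to NULL. */
static void *open_and_find(const char *path, const char *name, void **handle_out)
{
    *handle_out = NULL;

    void *handle = dlopen(path, RTLD_LAZY);
    if (!handle) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return NULL;
    }

    dlerror(); /* clear any stale error before calling dlsym() */
    void *sym = dlsym(handle, name);
    const char *err = dlerror();
    if (err) {
        fprintf(stderr, "dlsym: %s\n", err);
        dlclose(handle);
        return NULL;
    }

    *handle_out = handle;
    return sym;
}
```

A caller would test the returned pointer before calling through it, and call dlclose() on the stored handle once it no longer needs the symbol.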
Apart from all this, and unrelated to the observed behaviour, the code should call dlclose() once it is done with a valid handle, and only with a valid handle.

Related

How to handle error: expected expression before ‘do’ when there is no "do"?

I get the following compiler error, even though there is no "do" expression in my code.
gcc -Wall -g -c main.c -lasound
In file included from /usr/include/alsa/asoundlib.h:49:0,
from main.c:2:
main.c: In function ‘main’:
main.c:8:5: error: expected expression before ‘do’
if(snd_pcm_hw_params_alloca(&params) < 0) {
^
main.c:6:30: warning: unused variable ‘params’ [-Wunused-variable]
snd_pcm_hw_params_t *params;
^~~~~~
Makefile:15: recipe for target 'main.o' failed
make: *** [main.o] Error 1
From the following minimal reproducible example:
#include <stdio.h>
#include <alsa/asoundlib.h>
int main(int argc, char **argv)
{
snd_pcm_hw_params_t *params;
if(snd_pcm_hw_params_alloca(&params) < 0) {
return 0;
}
exit(0);
}
I'm aware this is not a valid ALSA program. I'm also aware that it appears snd_pcm_hw_params_alloca() doesn't even return anything worthwhile to check for errors against? That's not relevant though, this should be valid C code regardless, even if it abuses the API.
Where is the "do" expression? If I go to /usr/include/alsa/asoundlib.h and poke around there, I don't see anything obvious that would indicate a problem.
If I remove the if test, I get:
#include <stdio.h>
#include <alsa/asoundlib.h>
int main(int argc, char **argv)
{
snd_pcm_hw_params_t *params;
snd_pcm_hw_params_alloca(&params);
exit(0);
}
This will compile with no errors.
What is this?
If I look in pcm.h, I see:
#define snd_pcm_hw_params_alloca(ptr) __snd_alloca(ptr, snd_pcm_hw_params)
int snd_pcm_hw_params_malloc(snd_pcm_hw_params_t **ptr);
void snd_pcm_hw_params_free(snd_pcm_hw_params_t *obj);
void snd_pcm_hw_params_copy(snd_pcm_hw_params_t *dst, const snd_pcm_hw_params_t *src);
However, this doesn't tell me anything. Why does the compiler produce this error?
I'm also aware that it appears snd_pcm_hw_params_alloca() doesn't even return anything worthwhile to check for errors against? That's not relevant though, this should be valid C code regardless, even if it abuses the API.
No, if snd_pcm_hw_params_alloca() does not have a value you cannot compare it against 0. For example, the following is also invalid:
void func(void) { }
void other(void) {
if (func() < 0) { // Error
}
}
In reality, snd_pcm_hw_params_alloca() is a macro, and it’s a wrapper for another macro, __snd_alloca. The do is there to make it behave more like a statement. You can only call it as a statement on its own line, or anywhere else where a do loop is legal.
snd_pcm_hw_params_alloca(&params);
You cannot check for errors because alloca() does not check for errors. If alloca() fails, it will just stomp on your stack, and bad things will happen. You can’t do anything about it, except not use alloca() (this is why you might hear advice to avoid alloca).
For an explanation of why the do loop is used, see: C multi-line macro: do/while(0) vs scope block
For more information about how alloca() works, see: Why is the use of alloca() not considered good practice?
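To make the "statement, not expression" point concrete, here is a small sketch with a hypothetical macro, SWAP_INT, that has the same do { ... } while (0) shape. It is not the ALSA macro, only an illustration of why such a macro works as a statement but cannot appear inside an if condition.

```c
/* Hypothetical statement-like macro with the same shape as the ALSA one:
   it expands to a do { ... } while (0) statement and yields no value. */
#define SWAP_INT(a, b) do { int tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Puts the smaller value in *lo and the larger value in *hi. */
static void order_pair(int *lo, int *hi)
{
    if (*lo > *hi)
        SWAP_INT(*lo, *hi);  /* legal: the macro is used as a full statement */
    else
        (void)0;             /* the else still pairs with the if above */

    /* The following would not compile, because a do statement is not an
       expression ("expected expression before 'do'"):
       if (SWAP_INT(*lo, *hi) < 0) { }                                   */
}
```

The do/while wrapper is what lets the macro be followed by a semicolon inside an unbraced if/else without breaking the else.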

dlsym returns NULL, even though the symbol exists

I am using dlsym to look up symbols in my program, but it always returns NULL, which I am not expecting. According to the manpage, dlsym may return NULL if there was an error somehow, or if the symbol indeed is NULL. In my case, I am getting an error. I will show you the MCVE I have made this evening.
Here is the contents of instr.c:
#include <stdio.h>
void * testing(int i) {
printf("You called testing(%d)\n", i);
return 0;
}
A very simple thing containing only an unremarkable example function.
Here is the contents of test.c:
#include <dlfcn.h>
#include <stdlib.h>
#include <stdio.h>
typedef void * (*dltest)(int);
int main(int argc, char ** argv) {
/* Declare and set a pointer to a function in the executable */
void * handle = dlopen(NULL, RTLD_NOW | RTLD_GLOBAL);
dlerror();
dltest fn = dlsym(handle, "testing");
if(fn == NULL) {
printf("%s\n", dlerror());
dlclose(handle);
return 1;
}
dlclose(handle);
return 0;
}
As I step through the code with the debugger, I see the dlopen is returning a handle. According to the manpage, If filename is NULL, then the returned handle is for the main program. So if I link a symbol called testing into the main program, dlsym should find it, right?
Here is the way that I am compiling and linking the program:
all: test
instr.o: instr.c
gcc -ggdb -Wall -c instr.c
test.o: test.c
gcc -ggdb -Wall -c test.c
test: test.o instr.o
gcc -ldl -o test test.o instr.o
clean:
rm -f *.o test
And when I build this program, and then do objdump -t test | grep testing, I see that the symbol testing is indeed there:
08048632 g F .text 00000020 testing
Yet the output of my program is the error:
./test: undefined symbol: testing
I am not sure what I am doing wrong. I would appreciate if someone could shed some light on this problem.
I don't think you can do that, dlsym works on exported symbols. Because you're doing dlsym on NULL (current image), even though the symbols are present in the executable ELF image, they're not exported (since it's not a shared library).
Why not call it directly and let the linker take care of it? There's no point in using dlsym to get symbols from the same image as your dlsym call. If your testing symbol was in a shared library that you either linked against or loaded using dlopen then you would be able to retrieve it.
I believe there's also a way of exporting symbols when building executables (-Wl,--export-dynamic as mentioned in a comment by Brandon) but I'm not sure why you'd want to do that.
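As a rough sketch of the difference, the following assumes a hypothetical helper named find_in_process(). With the handle returned by dlopen(NULL, ...), symbols exported by loaded shared libraries (for example puts from the C library) are normally found, while a function defined in the executable itself is typically found only when the executable was linked with -Wl,--export-dynamic. On some systems the program has to be linked with -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: looks up `name` through the handle of the main
   program. Returns the symbol address, or NULL if the lookup failed. */
static void *find_in_process(const char *name)
{
    void *self = dlopen(NULL, RTLD_NOW);
    if (!self)
        return NULL;

    dlerror(); /* clear any stale error */
    void *sym = dlsym(self, name);
    if (dlerror() != NULL)
        sym = NULL; /* the lookup itself failed */

    dlclose(self);
    return sym;
}
```

Calling find_in_process("testing") from the program in the question would be expected to return NULL until the executable is linked with the export option.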
I faced a similar issue in my code.
I did the following to export symbols
#ifndef EXPORT_API
#define EXPORT_API __attribute__ ((visibility("default")))
#endif
I then applied the above attribute to each function definition.
For example, the earlier code was
int func() { printf(" I am a func %s ", __FUNCTION__ ); return 0; }
I changed it to
EXPORT_API int func() { printf(" I am a func %s ", __FUNCTION__ ); return 0; }
Now it works.
dlsym gives no issues after this.
Hope this works for you as well.

How to compile a program using static library libdl.a

I am trying to compile the example code which is using APIs from libdl library:
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>
int
main(int argc, char **argv)
{
void *handle;
double (*cosine)(double);
char *error;
handle = dlopen("libm.so", RTLD_LAZY);
if (!handle) {
fprintf(stderr, "%s\n", dlerror());
exit(EXIT_FAILURE);
}
dlerror(); /* Clear any existing error */
/* Writing: cosine = (double (*)(double)) dlsym(handle, "cos");
would seem more natural, but the C99 standard leaves
casting from "void *" to a function pointer undefined.
The assignment used below is the POSIX.1-2003 (Technical
Corrigendum 1) workaround; see the Rationale for the
POSIX specification of dlsym(). */
*(void **) (&cosine) = dlsym(handle, "cos");
if ((error = dlerror()) != NULL) {
fprintf(stderr, "%s\n", error);
exit(EXIT_FAILURE);
}
printf("%f\n", (*cosine)(2.0));
dlclose(handle);
exit(EXIT_SUCCESS);
}
I used the following command to compile:
--> gcc -static -o foo foo.c -ldl
I got the following error:
foo.c:(.text+0x1a): warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
After some searching, and since I am trying to link statically, I found libdl.a in the lib directory. I get the same issue with the gethostbyname API as well.
What other libraries do I need to add to link dlopen statically?
One possible problem: dlsym() should be declared or implemented before it is referenced in main().
dlopen() works only with shared libs. That means that you can't link it statically. Have you tried without -static?

How are library functions linked in this case?

I just came across this code, and the blog says it works fine on a 32-bit architecture. I didn't test it; however, I have a doubt about the linkage of libraries in this case. How will the compiler link the string library to main, since it's not aware which library to link?
So basically, if I include <string.h> then it should work fine; however, if I don't include <string.h> then, as per the blog, it runs on a 32-bit architecture and fails to run on a 64-bit architecture.
#include <errno.h>
#include <stdio.h>
int main(int argc, char *argv[])
{
FILE *fp;
fp = fopen(argv[1], "r");
if (fp == NULL) {
fprintf(stderr, "%s\n", strerror(errno));
return errno;
}
printf("file exist\n");
fclose(fp);
return 0;
}
The code shown will only compile if you allow the compiler to infer that functions that are not declared always return an int. This was valid in C89/C90 but marked obsolescent; C99 and C11 require functions to be declared before they are used. GCC prior to version 5.1.0 assumes C90 mode by default; you had to turn the 'reject this code' warnings on. GCC 5.1.0 and onwards assumes C11 by default. You will at least get warnings from the code even without any compilation options to turn them on.
The code will link fine because the function name is strerror() regardless of whether it was declared or not, and the linker can find the function in the standard C library. In general, all the functions that are in the Standard C library are automatically made available for linking — and, indeed, there are usually a lot of not so standard functions also available. C does not have type-safe linkage as C++ does (but C++ also insists on having every function declared before it is used, so the code would not compile as C++ without the header.)
For historical reasons, the maths library was separate and you needed to specify -lm in order to link it. This was in large part because hardware floating point was not universal, so some machines needed a library using the hardware, and other machines needed software emulation of the floating point arithmetic. Some platforms (Linux, for example) still require a separate -lm option if you use functions declared in <math.h> (and probably <tgmath.h>); other platforms (Mac OS X, for example) do not — there is a -lm to satisfy build systems that link it, but the maths functions are in the main C library.
If the code is compiled on a fairly standard 32-bit platform with ILP32 (int, long, pointer all 32-bit), then for many architectures, assuming that strerror() returns an int assumes that it returns the same amount of data as if it returns a char * (which is what strerror() actually returns). So, when the code pushes the return value from strerror() onto the stack for fprintf(), the correct amount of data is pushed.
Note that some architectures (notably the Motorola M680x0 series) would return addresses in an address register (A0) and numbers in a general register (D0), so there would be problems even on those machines with a 32-bit compilation: the compiler would try to get the returned value from the data register instead of the address register, and that was not set by strerror() — leading to chaos.
With a 64-bit architecture (LP64), assuming strerror() returns a 32-bit int means that the compiler will only collect 32-bits of the 64-bit address returned by strerror() and push that on the stack for fprintf() to work with. When it tried to treat the truncated address as valid, things would go awry, often leading to a crash.
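The truncation can be modeled with plain integers. This is only a sketch under stated assumptions: the helper name as_seen_through_int() and the sample addresses are made up for illustration, and the upper bits are modeled as zero, whereas what a real program sees there depends on the compiler and calling convention.

```c
#include <stdint.h>

/* Hypothetical model of the bug: keeps only the low 32 bits of a 64-bit
   address, as happens when a pointer return value is read as a 32-bit int.
   The discarded upper bits are modeled as zero here. */
static uint64_t as_seen_through_int(uint64_t address)
{
    return (uint64_t)(uint32_t)address;
}
```

An address that fits in 32 bits survives unchanged, which is why the code appears to work on ILP32; a typical 64-bit address loses its upper half and no longer points at the message string.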
When the missing <string.h> header is added, the compiler knows that the strerror() function returns a char * and all is happiness and delight once more, even when the file the program is told to look for doesn't exist.
If you are wise, you will ensure your compiler is always compiling in fussy mode, rejecting anything which is plausibly erroneous. When I use my default compilation on your code, I get:
$ gcc -std=c11 -O3 -g -Wall -Wextra -Werror -Wmissing-prototypes \
> -Wstrict-prototypes -Wold-style-definition bogus.c -o bogus
bogus.c: In function ‘main’:
bogus.c:10:33: error: implicit declaration of function ‘strerror’ [-Werror=implicit-function-declaration]
fprintf(stderr, "%s\n", strerror(errno));
^
bogus.c:10:25: error: format ‘%s’ expects argument of type ‘char *’, but argument 3 has type ‘int’ [-Werror=format=]
fprintf(stderr, "%s\n", strerror(errno));
^
bogus.c:10:25: error: format ‘%s’ expects argument of type ‘char *’, but argument 3 has type ‘int’ [-Werror=format=]
bogus.c:4:14: error: unused parameter ‘argc’ [-Werror=unused-parameter]
int main(int argc, char *argv[])
^
cc1: all warnings being treated as errors
$
The 'unused argument' error reminds you that you should be checking that there is an argument to pass to fopen() before you try to open the file.
Fixed code:
#include <string.h>
#include <errno.h>
#include <stdio.h>
int main(int argc, char *argv[])
{
FILE *fp;
if (argc != 2)
{
fprintf(stderr, "Usage: %s file\n", argv[0]);
return 1;
}
fp = fopen(argv[1], "r");
if (fp == NULL)
{
fprintf(stderr, "%s: file %s could not be opened for reading: %s\n",
argv[0], argv[1], strerror(errno));
return errno;
}
printf("file %s exists\n", argv[1]);
fclose(fp);
return 0;
}
Build:
$ gcc -std=c11 -O3 -g -Wall -Wextra -Werror -Wmissing-prototypes \
> -Wstrict-prototypes -Wold-style-definition bogus.c -o bogus
$
Run:
$ ./bogus bogus
file bogus exists
$ ./bogus bogus2
./bogus: file bogus2 could not be opened for reading: No such file or directory
$ ./bogus
Usage: ./bogus file
$
Note that the error messages include the program name and report to standard error. When the file is known, the error message includes the file name; it is much easier to debug that error if the program is in a shell script than if the message is just:
No such file or directory
with no indication of which program or which file encountered the problem.
When I remove the #include <string.h> line from the fixed code shown, then I can compile it and run it like this:
$ gcc -o bogus90 bogus.c
bogus.c: In function ‘main’:
bogus.c:18:35: warning: implicit declaration of function ‘strerror’ [-Wimplicit-function-declaration]
argv[0], argv[1], strerror(errno));
^
$ gcc -std=c90 -o bogus90 bogus.c
$ ./bogus90 bogus11
Segmentation fault: 11
$
This was tested with GCC 5.1.0 on Mac OS X 10.10.5 — which is, of course, a 64-bit platform.
I solved it by including the string.h header:
#include <string.h>
I don't think the functionality of this code would be affected by whether it's a 32-bit or 64-bit architecture: it doesn't matter if pointers are 32- or 64-bit, or if long int is 32 or 64 bit. Inclusion of headers, in this case string.h, should not affect linking to libraries, either. Header inclusion matters to the compiler, not the linker. The compiler might warn about the function being implicitly declared, but as long as the linker can find the function in one of the libraries it searches, it will successfully link the binary, and it should run just fine.
I just built and ran this code successfully on a 64-bit CentOS box, using clang 3.6.2. I did get this compiler warning:
junk.c:10:33: warning: implicitly declaring library function 'strerror' with type 'char *(int)'
fprintf(stderr, "%s\n", strerror(errno));
^
junk.c:10:33: note: include the header <string.h> or explicitly provide a declaration for 'strerror'
1 warning generated.
The program was given a non-existent file name, and the error message, "No such file or directory," was meaningful. However, this is because the strerror() function is a well-known standard library function, and its declaration was correctly guessed by the compiler. If it is a user-defined function, the compiler may not be so "lucky" at guessing, and then the architecture can matter, as suggested by other answers.
So, the lesson learned: make sure function declarations are available to the compiler and heed the warnings!

Compile Attempt Gives crt1.o/'start'/undefined reference to 'main'/exit status message

I am working from a book: TCP/IP Sockets in C and its website code.
I am trying to build a client and server based on those files. My make gives lots of errors related to not being able to find functions from DieWithMessage.c
Here it is:
#include <stdio.h>
#include <stdlib.h>
#include "Practical.h"
void DieWithUserMessage(const char *msg, const char *detail) {
fputs(msg, stderr);
fputs(": ", stderr);
fputs(detail, stderr);
fputc('\n', stderr);
exit(1);
}
void DieWithSystemMessage(const char *msg) {
perror(msg);
exit(1);
}
When I do gcc DieWithMessage.c, I get the following error:
/usr/lib/i386-linux-gnu/gcc/i686-linux-gnu/4.5.2/../../../crt1.o: In function `_start':
(.text+0x18): undefined reference to `main'
collect2: ld returned 1 exit status
How do I compile this by itself so that the errors will stop happening when using the makefile?
Thanks for any help.
Your C code needs a main function if you're going to try to link and run it. This is a requirement for hosted C applications under the standard.
That error message indicates what's wrong. The C runtime/startup code (CRT) has an entry point of _start, which sets up the environment and then calls your main. Since you haven't provided a main, the linker complains.
If you only want to generate an object file for later linking with a main, use something like:
gcc -c -o DieWithMessage.o DieWithMessage.c
(-c is the "compile but don't link" flag). You can then link it later with your main program with something like (although there are other options):
gcc -o myProg myProg.c DieWithMessage.o
If you want a placeholder main to update later with a real one, you can add the following to your code:
int main (void) { return 0; }

Resources