Finding the root cause of `undefined reference` error - c

I'm trying to understand why I'm getting an undefined reference error during linking:
/home/amirgon/projects/esp8266/esp-open-sdk/xtensa-lx106-elf/bin/xtensa-lx106-elf-gcc -L/home/amirgon/projects/esp8266/esp-open-sdk/sdk/lib -T/home/amirgon/projects/esp8266/esp-open-sdk/sdk/ld/eagle.app.v6.cpp.ld -nostdlib -Wl,--no-check-sections -u call_user_start -Wl,-static -Wl,--start-group -lc -lgcc -lhal -lphy -lpp -lnet80211 -llwip -lwpa -lmain build/app_app.a -Wl,--end-group -o build/app.out
build/app_app.a(routines.o):(.text+0x4): undefined reference to `pvPortMalloc(unsigned int, char const*, int)'
gcc complains it could not find the function pvPortMalloc.
However, I can confirm this function exists in libmain.a!
In the command line above, libmain is referenced by -lmain and library path is set to -L/home/amirgon/projects/esp8266/esp-open-sdk/sdk/lib.
When I dump symbols from libmain.a on that path I can find pvPortMalloc marked as T, which means that the symbol is in the text (code) section:
/home/amirgon/projects/esp8266/esp-open-sdk/xtensa-lx106-elf/bin/xtensa-lx106-elf-nm -g /home/amirgon/projects/esp8266/esp-open-sdk/sdk/lib/libmain.a | grep pvPortMalloc
U pvPortMalloc
0000014c T pvPortMalloc
U pvPortMalloc
So, did I miss something?
What could be the reason that gcc does not find the function although it exists in libmain.a?
How can I further debug this error?

Mixing of C++ and C code causes your issue.
This error:
undefined reference to `pvPortMalloc(unsigned int, char const*, int)'
Does not say that the symbol pvPortMalloc cannot be found. It says that the symbol pvPortMalloc(unsigned int, char const*, int) cannot be found,
and that is a C++ symbol.
This means that somewhere you are compiling C++ code which thinks there is a C++ pvPortMalloc function, whose symbol also includes its signature, but you only have a pvPortMalloc C function.
Likely your C++ code is including a header file that is not C++ clean, and you will need to do something like this:
extern "C" {
#include "some_header.h"
}
Where some_header.h is the header file declaring the pvPortMalloc function.
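As a minimal sketch of what a "C++ clean" header looks like, the usual pattern is to wrap the declarations in an `extern "C"` guard that only C++ compilers see. The header name `port_alloc.h` and the function `port_malloc` below are hypothetical stand-ins, not the real SDK declarations:

```c
#include <stddef.h>
#include <stdlib.h>

/* --- contents of a hypothetical C-clean header, e.g. "port_alloc.h" --- */
#ifdef __cplusplus
extern "C" {            /* C++ compilers: use the unmangled C symbol name */
#endif

void *port_malloc(size_t size, const char *file, int line);

#ifdef __cplusplus
}
#endif
/* --- end of header --- */

/* A C definition: its symbol is plain `port_malloc`, with no signature encoded. */
void *port_malloc(size_t size, const char *file, int line)
{
    (void)file;   /* unused in this sketch */
    (void)line;   /* unused in this sketch */
    return malloc(size);
}
```

With a guard like this inside the header itself, C++ code can include it directly and the linker looks for `port_malloc` rather than `port_malloc(unsigned int, char const*, int)`.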

Not only is the order of object files and libraries on the command line important, but also the order of object files within a library.
Anything that resolves a reference must come after the symbol is used, otherwise you might get strange linking problems.
The effect you see is a typical problem of a library that has been built with ar and the wrong object file order (some .o file using an external function that is defined in some .o file before the one that uses this symbol in the lib).
ranlib <libfile> is the tool that fixes this by creating an index for all objects in the library and should get rid of this problem.

Related

Different behavior of undefined reference error on linux gcc during linking with object file vs static library

I have following two source codes and want to link them.
// test.c
#include <stdio.h>
void lib2();
void lib1(){
lib2();
}
// main.c
#include <stdio.h>
int main() {
return 0;
}
I've used gcc -c main.c and gcc -c test.c to generate objects files
$ ls *.o
main.o test.o
and I've used ar rcs test.a test.o command to generate static library(test.a) from object file test.o
Then I tried to build the executable by linking main.o with either test.a or test.o. As far as I know, a static library file (.a extension) is just a simple collection of object files (.o), so I expected both to give the same result: error or success. But they didn't.
Linking with the object file gives undefined reference error.
$ gcc -o main main.o test.o
/usr/bin/ld: test.o: in function `lib1':
test.c:(.text+0xe): undefined reference to `lib2'
collect2: error: ld returned 1 exit status
$
But linking with the static library doesn't give any error, and the build succeeds.
$ gcc -o main main.o test.a
$
Why is this happening? and how can I get undefined reference errors even when linking with static libraries?
If your code contains a function call expression then the language standard requires that a function definition exists (see C11 6.9/3). If you don't provide a definition then it is undefined behaviour with no diagnostic required.
The rule was written this way so that implementation vendors aren't forced to perform analysis to determine if a function is ever called or not; for example in your library scenario the compiler isn't forced to dig around in the library if none of the rest of the code contains anything that references that library.
It's totally up to the implementation what to do, and in your case it decides to give an error in one case and not the other. To avoid this, you can provide definitions for all the functions you call.
You might be able to modify the behaviour in the first case by using linker options such as elimination of unused code sections. Another thing you can do is call lib1() from main() -- this is still not guaranteed to produce an error but is more likely to.
Force the linker to do some work: use the -flto option and the error will go away.
ld does not pull objects out of a library unless they are used; it only searches the library for symbols that are still undefined in the object files it has already loaded. Imagine a library in which some functions require user-defined callbacks. If the linker pulled in every object, you would have to define those callbacks in every program you link against the library, even if you never use those functions.
I expected both would give same result: error or success. but it didn't.
Your expectation is incorrect. A good explanation of the difference between .o and .a with respect to linking is here.
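To make the difference concrete, here is a toy model of the rule described above. It is only a sketch, not how ld is implemented: the `struct object` and `struct link_state` types and all function names are made up for illustration. An object given directly on the command line is always added to the link, while an archive member is added only if it defines a symbol that is currently undefined.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A toy object file: the symbols it defines and the symbols it references. */
struct object {
    const char *name;
    const char *defines[4];   /* NULL-terminated unless all 4 slots are used */
    const char *needs[4];     /* NULL-terminated unless all 4 slots are used */
};

#define MAX_SYMS 16

struct link_state {
    const char *defined[MAX_SYMS];
    size_t n_defined;
    const char *undefined[MAX_SYMS];
    size_t n_undefined;
};

static bool contains(const char *const *set, size_t n, const char *sym)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(set[i], sym) == 0)
            return true;
    return false;
}

/* Add an object to the link: record its definitions, then its unresolved references. */
static void add_object(struct link_state *st, const struct object *obj)
{
    for (size_t i = 0; i < 4 && obj->defines[i]; i++) {
        st->defined[st->n_defined++] = obj->defines[i];
        for (size_t j = 0; j < st->n_undefined; j++) {
            if (strcmp(st->undefined[j], obj->defines[i]) == 0) {
                /* This definition resolves a pending reference. */
                st->undefined[j] = st->undefined[--st->n_undefined];
                break;
            }
        }
    }
    for (size_t i = 0; i < 4 && obj->needs[i]; i++)
        if (!contains(st->defined, st->n_defined, obj->needs[i]) &&
            !contains(st->undefined, st->n_undefined, obj->needs[i]))
            st->undefined[st->n_undefined++] = obj->needs[i];
}

/* An archive member is pulled in only if it defines a currently undefined symbol. */
static bool member_is_needed(const struct link_state *st, const struct object *obj)
{
    for (size_t i = 0; i < 4 && obj->defines[i]; i++)
        if (contains(st->undefined, st->n_undefined, obj->defines[i]))
            return true;
    return false;
}

/* Number of unresolved symbols after linking main_obj with `other`, where
   `other` is either a plain object (.o) or a member of an archive (.a). */
static size_t unresolved_after_link(const struct object *main_obj,
                                    const struct object *other,
                                    bool other_is_archive_member)
{
    struct link_state st = {0};
    add_object(&st, main_obj);
    if (!other_is_archive_member || member_is_needed(&st, other))
        add_object(&st, other);
    return st.n_undefined;
}

/* The two files from the question, plus a variant of main.c that calls lib1(). */
static const struct object main_o = { "main.o", { "main" }, { NULL } };
static const struct object test_o = { "test.o", { "lib1" }, { "lib2" } };
static const struct object main_calls_lib1_o = { "main.o", { "main" }, { "lib1" } };
```

In this model, `unresolved_after_link(&main_o, &test_o, false)` leaves `lib2` unresolved, while passing `true` leaves nothing unresolved because test.o is never pulled in. Once main references `lib1` (the `main_calls_lib1_o` variant), the archive member is pulled in and `lib2` is reported again. With GNU ld, wrapping the library in `-Wl,--whole-archive ... -Wl,--no-whole-archive` forces every member into the link, which is one way to get the error even for unused members.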

Why is clang removing an underscore from a function declared as 'extern "C"'?

I'm watching a video in an attempt to better understand object files. The presenter uses the following as an example of a program that produces a very simple object file:
extern "C" void _start() {
asm("mov $60, %eax\n"
"mov $24567837, %edi\n"
"syscall\n");
}
The program is compiled via
clang++ -c step0.cpp -O1 -o step0.o
and linked via
ld -static step0.o -o step0
I get this error message when trying to link:
Undefined symbols for architecture x86_64:
"start", referenced from:
-u command line option
(maybe you meant: __start)
ld: symbol(s) not found for inferred architecture x86_64
I don't pass the -u command line option, so I'm not sure why I'm getting that error message.
clang isn't removing an underscore, it's adding an underscore. Your program is actually exporting a __start symbol, but ld expects you to have a start symbol for your entry point, i.e. ld runs with -u start by default for your architecture.
You could disable this check in ld with -U start (which suppresses the error from the start symbol being undefined) or via -undefined suppress (which suppresses all undefined symbol errors). However, you will end up with an executable that does not have an entry point for your architecture, so the program won't actually work.
Instead of suppressing the error, I suggest controlling the symbol that clang chooses directly. You can tell clang what symbol to generate by using a standalone asm declaration:
void _start() asm ("start");
Make sure this standalone declaration is separate from the function definition.
You can read more about controlling the symbols generated by gcc here: https://stackoverflow.com/a/1035937/12928775
Also, as was pointed out in a comment to a similar answer, you will most likely want to use __attribute__((naked)) on the function definition to prevent clang from generating a stack frame on entry. See: https://stackoverflow.com/a/60311490/12928775
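A small sketch of the asm-label technique on an ordinary function, so it can be tried without replacing the entry point. The names `answer` and `answer_sym` are hypothetical, and `__asm__` is the GCC/Clang extension spelling that also works in strict ISO modes:

```c
/* Declaration with an asm label: the C name is `answer`,
   but the symbol emitted into the object file is `answer_sym`. */
int answer(void) __asm__("answer_sym");

/* The definition is kept separate from the labelled declaration. */
int answer(void)
{
    return 42;
}
```

Callers still write `answer()` in C; running nm on the resulting object file should list `answer_sym` instead of the platform's default name for `answer`.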

error while creating shared object (so ) in gcc [duplicate]

Getting this error while compiling C++ code:
undefined reference to `__stack_chk_fail'
Options already tried:
added -fno-stack-protector while compiling - did not work, error persists
added a dummy implementation of void __stack_chk_fail(void) in my code. Still getting the same error.
Detailed Error:
/u/ac/alanger/gurobi/gurobi400/linux64/lib/libgurobi_c++.a(Env.o)(.text+0x1034): In function `GRBEnv::getParamInfo(GRB_StringParam, std::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::basic_string<char, std::char_traits<char>, std::allocator<char> >&)':
: undefined reference to `__stack_chk_fail'
Earlier, I was getting 10's of such errors. Found out that there was a version mismatch between the gcc of the pre-compiled libraries I am using and the gcc version I was using to compile the code. Updated gcc and now I am getting only 2 of these errors.
Any help, please?
libgurobi_c++.a was compiled with -fstack-protector (obviously), which is why its objects reference __stack_chk_fail.
A few things come to mind:
add -fstack-protector when linking. This will make sure that libssp gets linked.
Manually link -lssp
Put your dummy version of __stack_chk_fail(void) in its own object file and add this .o file to your linker command AFTER libgurobi_c++.a. GCC/G++ resolves symbols from left to right during linking, so even though your code defines the function, an object containing the __stack_chk_fail symbol needs to be on the linker line to the right of libgurobi_c++.a.
On Gentoo I had the same problem and resolved it by creating two files. The first contains the options to be parsed by emerge and passed to gcc:
/etc/portage/env/nostackprotector.conf
CFLAGS="-fno-stack-protector -O2"
And the second tells which packages should use these settings:
/etc/portage/package.env/nostackprotector
x11-libs/vte nostackprotector.conf
sys-libs/glibc nostackprotector.conf
www-client/chromium nostackprotector.conf
app-admin/sudo nostackprotector.conf
Just had the same issue: c++ code with an implementation of void __stack_chk_fail(void) showing several undefined reference to __stack_chk_fail errors when compiling.
My solution was to define __stack_chk_fail(void) as extern "C":
extern "C" {
void __stack_chk_fail(void)
{
...
}
}
This suppressed the compilation error :)
Hope it helps!
https://wiki.ubuntu.com/ToolChain/CompilerFlags
says:
"Usually this is a result of calling ld instead of gcc during a build to perform linking"
This is what I encountered when I modified the Makefile of libjpeg manually. Using gcc instead of ld solved the problem.

Using LD_PRELOAD to overload call to a C function of a shared library

I'm following this answer to override a call to a C function of a C library.
I think I did everything correctly, but it doesn't work:
I want to override the "DibOpen"-function. This is my code of the library which I pass to LD_PRELOAD environment-variable when running my application:
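#define _GNU_SOURCE /* needed for RTLD_NEXT; must precede all includes */
#include <dlfcn.h>
#include <stdio.h>
DIBSTATUS DibOpen(void **ctx, enum Board b)
{
/* Pointer to the real DibOpen, looked up once on the first call. */
static DIBSTATUS (*func)(void **, enum Board) = NULL;
printf("look at me, I wrapped\n");
if(!func)
func = dlsym(RTLD_NEXT, "DibOpen");
printf("Overridden!\n");
/* Forward the wrapper's own parameters to the real function. */
return func(ctx, b);
}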
The output of nm lib.so | grep DibOpen shows
000000000001d711 T DibOpen
I link my program with -ldl, but ldd program does not show a link to libdl.so.
When I run my program like this:
LD_PRELOAD=libPreload.so ./program
it stops with:
symbol lookup error: libPreload.so: undefined symbol: dlsym
What can I do to debug this further? Where is my mistake?
When you create a shared library (whether or not it will be used in LD_PRELOAD), you need to name all of the libraries that it needs to resolve its dependencies. (Under some circumstances, a dlopened shared object can rely on the executable to provide symbols for it, but it is best practice not to rely on this.) In this case, you need to link libPreload.so against libdl. In Makefile-ese:
libPreload.so: x.o y.o z.o
$(CC) -shared -Wl,-z,defs -Wl,--as-needed -o $@ $^ -ldl
The option -Wl,-z,defs tells the linker that it should issue an error if a shared library has unresolved undefined symbols, so future problems of this type will be caught earlier. The option -Wl,--as-needed tells the linker not to record a dependency on libraries that don't actually satisfy any undefined symbols. Both of these should be on by default, but for historical reasons, they aren't.

Undefined Symbol _memset although _memset nowhere used? [duplicate]

I asked a similar question, but I have some update which is really confusing me. Essentially, I want to link a number of object files with the linker as follows:
/usr/ccs/bin/ld -o q -e start_master -dn -z defs -M ../../../mapfile.q {list of object files}
I get the following error:
Undefined first referenced
symbol in file
_memset reconf.o
The interesting thing is that memset is not referenced in reconf.c. I also grep'ed the whole directory, and there is no reference to _memset in any of the other files either. So I am wondering why I get this error message from the linker although _memset is not used anywhere in my source code. Does anyone have an idea what could be going on here?
Thanks so much, this error is driving us mental!
EDIT:
I tried to add the path to the library of memset and linked it with -lc and run it in verbose mode:
/usr/ccs/bin/ld -o q -e start_master -dn -z defs -z verbose -L/usr/lib -M ../../../mapfile.q {list of object files} -lc
Then I get the following error:
ld: fatal: library -lc: not found
ld: fatal: File processing errors. No output written to q
And this although libc.so is clearly in /usr/lib ...
Confusing
EDIT II:
After some more research, it seems that static linking disappeared on Solaris 10, as you can read here:
http://blogs.oracle.com/rie/entry/static_linking_where_did_it
Probably this is my problem. Has anyone an idea how I could rewrite my linker command for a workaround to this problem?
Many thanks!
Probably you did:
struct S v = { 0 };
or
struct S v;
v = (some const-variable);
or
uint8_t b[100] = { 0 };
Some compilers implicitly insert a call to the built-in memset (or memcpy) for such things. The built-in memset is then called _memset (in your case). If your libc (or whatever provides the standard functions in your case) does not provide it at link time, you get this link error.
Assuming you're on Solaris, you'll find memset in the libc.so library :
/usr/lib-> nm libc.so | grep memset
[7122] | 201876| 104|FUNC |GLOB |0 |9 |_memset
Simply add -lc to the command line
memset is a library function from the standard C library. If you don't use gcc for linking (which links your files with the standard libraries by default), you should explicitly link your program with libc.
Alternatively, you may not be using libc at all. In that case the memset call could have been generated by gcc.
From man gcc:
-nodefaultlibs
Do not use the standard system libraries when linking. Only the libraries you specify will be passed to the linker, options specifying linkage of the system libraries, such as -static-libgcc or -shared-libgcc, will be ignored. The standard startup files are used normally, unless -nostartfiles is used. The compiler may generate calls to memcmp, memset, memcpy and memmove. These entries are usually resolved by entries in libc. These entry points should be supplied through some other mechanism when this option is specified.
In this case simply write memset yourself (it's a trivial procedure) and supply it to the linker.
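A minimal sketch of such a hand-written replacement is shown below. The name `my_memset` is hypothetical and is used here only so the snippet does not clash with the libc version; in the freestanding case you would give it whatever symbol name the linker reports as missing.

```c
#include <stddef.h>

/* Byte-by-byte fill with the same contract as the standard memset:
   writes `count` copies of (unsigned char)value and returns `dest`. */
void *my_memset(void *dest, int value, size_t count)
{
    unsigned char *p = dest;

    while (count-- > 0)
        *p++ = (unsigned char)value;

    return dest;
}
```

One caveat, stated as a general observation rather than something from this thread: at higher optimization levels GCC can recognize a fill loop like this and replace it with a call to memset, which would recurse if the function itself is named memset. Compiling that one file with `-fno-tree-loop-distribute-patterns` is a common way to avoid this.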
