Can GCC not complain about undefined references? - c

Under what situation is it possible for GCC to not throw an "undefined reference" link error message when trying to call made-up functions?
For example, a situation in which this C code is compiled and linked by GCC:
void function()
{
made_up_function_name();
return;
}
...even though made_up_function_name is not present anywhere in the code (not headers, source files, declarations, nor any third party library).
Can that kind of code be accepted and compiled by GCC under certain conditions, without touching the actual code? If so, which?
Thanks.
EDIT: no previous declarations or mentions to made_up_function_name are present anywhere else. Meaning that a grep -R of the whole filesystem will only show that exact single line of code.

Yes, it is possible to avoid reporting undefined references - using --unresolved-symbols linker option.
g++ mm.cpp -Wl,--unresolved-symbols=ignore-in-object-files
From man ld
--unresolved-symbols=method
Determine how to handle unresolved symbols. There are four
possible values for method:
ignore-all
Do not report any unresolved symbols.
report-all
Report all unresolved symbols. This is the default.
ignore-in-object-files
Report unresolved symbols that are contained in shared
libraries, but ignore them if they come from regular object
files.
ignore-in-shared-libs
Report unresolved symbols that come from regular object
files, but ignore them if they come from shared libraries. This
can be useful when creating a dynamic binary and it is known
that all the shared libraries that it should be referencing
are included on the linker's command line.
The behaviour for shared libraries on their own can also be
controlled by the --[no-]allow-shlib-undefined option.
Normally the linker will generate an error message for each
reported unresolved symbol but the option --warn-unresolved-symbols can
change this to a warning.

TL;DR It can not complain, but you don't want that. Your code will crash if you force the linker to ignore the problem. It'd be counterproductive.
Your code relies on the ancient C (pre-C99) allowing functions to be implicitly declared at their point of use. Your code is semantically equivalent to the following code:
void function()
{
int made_up_function_name(...); // The implicit declaration
made_up_function_name(); // Call the function
return;
}
The linker rightfully complains that the object file that contains the compiled function() refers to a symbol that wasn't found anywhere else. You have to fix it by providing the implementation for made_up_function_name() or by removing the nonsensical call. That's all there's to it. No linker-fiddling involved.

If you declare the prototype of the function before using it , it shold compile. Anyway the error while linking will remain.
void made_up_function_name();
void function()
{
made_up_function_name();
return;
}

When you build with the linker flag -r or --relocatable it will also not produce any "undefined reference" link error messages.
This is because -r will link different objects in a new object file to be linked at a later stage.

And then there is this nastiness with the -D flag passed to GCC.
$cat undefined.c
void function()
{
made_up_function_name();
return;
}
int main(){
}
$gcc undefined.c -Dmade_up_function_name=atexit
$
Just imagine looking for the definition of made_up_function_name- it appears nowhere yet "does things" in the code.
I can't think of a nice reason to do this exact thing in code.
The -D flag is a powerful tool for changing code at compile time.

If function() is never called, it might not be included in the executable, and the function called from it is not searched for either.

The "standard" algorithm according to which POSIX linkers operate leaves open the possibility that the code will compile and link without any errors. See here for details: https://stackoverflow.com/a/11894098/187690
In order to exploit that possibility the object file that contains your function (let's call it f.o) should be placed into a library. That library should be mentioned in the command line of the compiler (and/or linker), but by that moment no other object file (mentioned earlier in the command line) should have made any calls to function or any other function present in f.o. Under such circumstances linker will see no reason to retrieve f.o from the library. Linker will completely ignore f.o, completely ignore function and, therefore, remain completely oblivious of the call to made_up_function_name. The code will compile even though made_up_function_name is not defined anywhere.

Related

Resolve undefined reference by stripping unused code

Assume we have the following C code:
void undefined_reference(void);
void bad(void) {
undefined_reference();
}
int main(void) {}
In function bad we fall into the linker error undefined reference to 'undefined_reference', as expected. This function is not actually used anywhere in the code, though, and as such, for the execution of the program, this undefined reference doesn't matter.
Is it possible to compile this code successfully, such that bad simply gets removed as it is never called (similar to tree-shaking in JavaScript)?
This function is not actually used anywhere in the code!
You know that, I know that, but the compiler doesn't. It deals with one translation unit at a time. It cannot divine out that there are no other translation units.
But main doesn't call anything, so there cannot be other translation units!
There can be code that runs before and after main (in an implementation-defined manner).
OK what about the linker? It sees the whole program!
Not really. Code can be loaded dynamically at run time (also by code that the linker cannot see).
So neither the compiler nor linker even try to find unused function by default.
On some systems it is possible to instruct the compiler and the linker to try and garbage-collect unused code (and assume a whole-program view when doing so), but this is not usually the default mode of operation.
With gcc and gnu ld, you can use these options:
gcc -ffunction-sections -Wl,--gc-sections main.c -o main
Other systems may have different ways of doing this.
Many compilers (for example gcc) will compile and link it correctly if you
Enable optimizations
make function bad static. Otherwise, it will have external linkage.
https://godbolt.org/z/KrvfrYYdn
Another way is to add the stump version of this function (and pragma displaying warning)

How to get unresolved externals in lib error?

yes I actually want to get that error. I am using MSVC (the command prompt). I would like to have .lib that would require definition of external symbol that gets linked to it. I must understand something wrong about static linking, because to me my approach seems legit:
I have a file that looks roughly like this:
extern INFO_BLOCK user_setup;
int
crtInit()
{
SetInfoBlock(&user_setup);
return 0;
}
when I try to use the .obj of this file in compilation with the main module
cl main.c file.obj it says unresolved externals. That is desired behaviour.
Nevertheless, once I pack the file.obj with lib file.obj even using /include:user_data (which frankly I don't trust as being of any use in this case)
using the .lib with cl main.c /link file.lib does not generate missing externals and that is the problem. I need the programmer to define that symbol. Does extern get stripped away once you put your .obj in .lib? Where am I wrong?
If main.c does not contain any reference to crtInit there is no reason for the linker to pull that function into the generated binary - Thus it will not "see" the unresolved reference to user_setup at all.
When you mention an object file to the linker, you force it to include the object file into the binary, regardless of whether it is needed by the program or not.
Opposed to that, when you mention a library to the linker, it will only use the library to resolve unresolved references you already have at that point with object files that it pulls in from this library. In case nothing is unresolved up to this point (or not satisfied by any symbol in the library), the linker will not use anything from the library at all.
The above is also the reason why many linkers are a bit picky about the order of libraries when linking (typically from specific to generic - or "user" to "system"), because linkers are normally single pass and will only pull in what they "see" needed at this specific point in the linking process.

Undefined reference to symbol on a static library

I'm trying to compile a binary linking it with a static library libfoo.a:
gcc -L. -o myapp myapp.o -lfoo
But I'm getting the following error from the linker:
libfoo.c:101: undefined reference to `_TRACE'
The problem is I don't have the source code for libfoo.a library.
I tried to get the reference for _TRACE symbol in the library and I got this:
nm libfoo.a | grep TRACE
U _TRACE
Assuming that _TRACE will not affect the inner workings in libfoo.a, is it possible to get the linker to define some placeholder value for this symbol so that I can compile my code?
Assuming that _TRACE will not affect the inner workings in libfoo.a
That seems an unreasonably hopeful assumption.
is it possible to get the linker to define some placeholder value for this symbol so that I can compile my code?
The first thing to do is to check libfoo's documentation. It is unusual for a static library to depend on a symbol that the user is expected to define; in fact, such an arrangement does not work cleanly with traditional linkers. I see several plausible explanations:
You need to link some other (specific) library after libfoo to provide a definition for that symbol.
Code that uses libfoo is expected to #include an associated header, and that header provides a tentative definition of _TRACE.
Programs that use libfoo are required to be built with a specific toolchain and maybe specific options.
It's just broken.
Only in case (4) is it appropriate to try to manually provide a definition for the symbol in question, and in that case your best bet is to proceed more or less as in case (1), by building an object that provides the definition and linking it after the library. Of course, that leaves you trying to guess what the definition should be.
If _TRACE is a global variable then defining it as an intmax_t with initial value 0 might work even if the library expects a different-size integer or one with different signedness. If it is supposed to be a function, however, then you're probably toast. There are too many signatures it could have, too many possible expectations for behavior. There is no reason to think that you could provide a suitable place-holder.
As I suspected, the _TRACE function is a sort of debugging function. And I was right assuming it would not affect the inner workings of libfoo.a.
I solved the problem defining the _TRACE function as:
int _TRACE(char*, ...) { return 0; }
Of course, this solution is only temporary and cannot be used in production, but it fits my purposes of compiling the code.
If you're using GCC 5.1 or later and/or C++11, there was an ABI change.
You can spot this issue using nm -C: if the symbol is defined (not U) but has a [abi:cxx11] appended to it, then it was compiled with the new ABI. From the link:
If you get linker errors about undefined references to symbols that involve types in the std::__cxx11 namespace or the tag [abi:cxx11] then it probably indicates that you are trying to link together object files that were compiled with different values for the _GLIBCXX_USE_CXX11_ABI macro.
If you have access to the source code (not your case specifically), you can use -Wabi-tag -D_GLIBCXX_USE_CXX11_ABI=0, the latter forcing the compiler not to use the new ABI.
All your code, including libraries, should be consistent.

How the compiler knows where my main function is?

I am working on a project that contains multiple modules (source files, header files, libraries). One of the files in all that soup contains my main function.
My questions are:
How the compiler knows which modules to compile and which not?
How does the compiler recognize the module with the main() inside?
The compiler itself doesn't care about what file contains which functions; main() is not special. However, in the linking stage, all these symbols from different files (and compilation units, possibly) are matched. The linker has a hidden "template" which has code at a fixed address that the OS will always call when you run a program. That code will call your main; hence, the linker looks for a main in all files. If it isn't there, you get an unresolved symbol error, exactly like if you used a function that you forgot to implement.
The same as for any other function applies to main: You can only have one implementation; having two main in two files that get linked together, you get a linker error, because the linker can't decide which of these to use.
How the compiler knows which modules to compile and which not?
It does not. You tell him which ones you want to compile, typically though the compilation statement(s) present in a makefile.
How does the compiler recognize the module with the main() inside?
Altogether it's a big process, already answered in this related question.
To summarize, while compiling a program with standard C library, the entry point of your program is set to _start. Now that has a reference to main() function internally. So, at compilation time, there is no (need for) checking the presence of main(). At linking time, linker should be able to locate one instance of main() which it can link to. That way, main() will serve as the entry point to your program.
So, to answer
How the compiler knows where my main function is?
It does (and need) not. It's the job of a linker, specifically.
The assembly code (often referred as startup code by embedded people) that starts up the program specifically calls main().
The prototype for main() is included in compiler documentation.
When you compile a program, an object file is produced. The object file from your source code is then linked with a startup runtime component (usually called crt0.o[bj]) and the C library components, etc.
If main() is changed to an unrecognizable signature, the compilation unit will complain about an unresolved external reference to _main or __main.

How to get all symbol conflict from 2 static libs in VC8

Say I have 2 static libs
ex1.a
ex2.a
In both libs I will define 10 same functions
When Compiling a sample test code say "test.c" , I link with both static libs ex1.a and ex2.a
In "test.c" I will call only 3 functions, then I will get the
linker error "same symbols deifned in both ex1.a and ex2.a libraries" This is Ok.
My Question here is :
1. Why this error only display 3 functions as multiple defined.. Why not it list all 10 functions
In VC8 How can I list all multiple defined symbols without actualy calling that function in test code ...
Thanks,
Thats because, linker tries to resovle a symbol name, when it compiles and links a code which has the function call. Only when the code has some function calls, linker would try to resolve it in either the test code or the libraries linked along and thats when it would find multiple definitions. If no function called, then I guess no problem.
What you experience is the optimizing part of the linker: By default it won't include code that isn't referenced. The compiler will create multiple object files with most likely unresolved dependencies (calls that couldn't be satisfied by the code included). So the linker takes all object files passed and tries to find solutions for the unresolved dependencies. If it fails, it will check the available library files. If there are multiple options with the same exact name/signature it will start complaining cause it won't be able to decide which one to pick (for identical code this won't matter but imagine different implementations using different "behind the scenes" work on memory, such as debug and release stuff).
The only (and possibly easiest way) I could think of to detect all these multiple definitions would be creating another static library project including all source files used in both static libs. When creating a library the linker will include everything called or exported - you won't need specific code calling the stuff for the linker to see/include everything as long as it's exported.
However I still don't understand what you're actually trying to accomplish as a whole. Trying to find code shared between two libraries?

Resources