Detection of unused function and code in C - c

Iam writing a program in C. Is there any way(gcc options) to identify the unused code and functions during compilation time.

If you use -Wunused-function, you will get warnings about unused functions. (Note that this is also enabled when you use -Wall).
See http://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html for more details.

gcc -Wall will warn you about unused static functions and some other types of unreachable code. It will not warn about unused functions with external linkage, though, since that would make it impossible to write a library.

No, there is no way to do this at compile time. All the compiler does is create object code - it does not know about external code that may or may not call functions you write. Have you ever written a program that calls main? It is the linker that determines if a function (specifically, a symbol) is used in the application. And I think GCC will remove unused symbols by default.

Related

Resolve undefined reference by stripping unused code

Assume we have the following C code:
void undefined_reference(void);
void bad(void) {
undefined_reference();
}
int main(void) {}
In function bad we fall into the linker error undefined reference to 'undefined_reference', as expected. This function is not actually used anywhere in the code, though, and as such, for the execution of the program, this undefined reference doesn't matter.
Is it possible to compile this code successfully, such that bad simply gets removed as it is never called (similar to tree-shaking in JavaScript)?
This function is not actually used anywhere in the code!
You know that, I know that, but the compiler doesn't. It deals with one translation unit at a time. It cannot divine out that there are no other translation units.
But main doesn't call anything, so there cannot be other translation units!
There can be code that runs before and after main (in an implementation-defined manner).
OK what about the linker? It sees the whole program!
Not really. Code can be loaded dynamically at run time (also by code that the linker cannot see).
So neither the compiler nor linker even try to find unused function by default.
On some systems it is possible to instruct the compiler and the linker to try and garbage-collect unused code (and assume a whole-program view when doing so), but this is not usually the default mode of operation.
With gcc and gnu ld, you can use these options:
gcc -ffunction-sections -Wl,--gc-sections main.c -o main
Other systems may have different ways of doing this.
Many compilers (for example gcc) will compile and link it correctly if you
Enable optimizations
make function bad static. Otherwise, it will have external linkage.
https://godbolt.org/z/KrvfrYYdn
Another way is to add the stump version of this function (and pragma displaying warning)

How can gcc link to fortran IO libraries when c code calls fortran subroutine. Typical error from gcc is Undefined symbol "__gfortran_st_close" [duplicate]

I am trying to build a Fortran program, but I get errors about an undefined reference or an unresolved external symbol. I've seen another question about these errors, but the answers there are mostly specific to C++.
What are common causes of these errors when writing in Fortran, and how do I fix/prevent them?
This is a canonical question for a whole class of errors when building Fortran programs. If you've been referred here or had your question closed as a duplicate of this one, you may need to read one or more of several answers. Start with this answer which acts as a table of contents for solutions provided.
A link-time error like these messages can be for many of the same reasons as for more general uses of the linker, rather than just having compiled a Fortran program. Some of these are covered in the linked question about C++ linking and in another answer here: failing to specify the library, or providing them in the wrong order.
However, there are common mistakes in writing a Fortran program that can lead to link errors.
Unsupported intrinsics
If a subroutine reference is intended to refer to an intrinsic subroutine then this can lead to a link-time error if that subroutine intrinsic isn't offered by the compiler: it is taken to be an external subroutine.
implicit none
call unsupported_intrinsic
end
With unsupported_intrinsic not provided by the compiler we may see a linking error message like
undefined reference to `unsupported_intrinsic_'
If we are using a non-standard, or not commonly implemented, intrinsic we can help our compiler report this in a couple of ways:
implicit none
intrinsic :: my_intrinsic
call my_intrinsic
end program
If my_intrinsic isn't a supported intrinsic, then the compiler will complain with a helpful message:
Error: ‘my_intrinsic’ declared INTRINSIC at (1) does not exist
We don't have this problem with intrinsic functions because we are using implicit none:
implicit none
print *, my_intrinsic()
end
Error: Function ‘my_intrinsic’ at (1) has no IMPLICIT type
With some compilers we can use the Fortran 2018 implicit statement to do the same for subroutines
implicit none (external)
call my_intrinsic
end
Error: Procedure ‘my_intrinsic’ called at (1) is not explicitly declared
Note that it may be necessary to specify a compiler option when compiling to request the compiler support non-standard intrinsics (such as gfortran's -fdec-math). Equally, if you are requesting conformance to a particular language revision but using an intrinsic introduced in a later revision it may be necessary to change the conformance request. For example, compiling
intrinsic move_alloc
end
with gfortran and -std=f95:
intrinsic move_alloc
1
Error: The intrinsic ‘move_alloc’ declared INTRINSIC at (1) is not available in the current standard settings but new in Fortran 2003. Use an appropriate ‘-std=*’ option or enable ‘-fall-intrinsics’ in order to use it.
External procedure instead of module procedure
Just as we can try to use a module procedure in a program, but forget to give the object defining it to the linker, we can accidentally tell the compiler to use an external procedure (with a different link symbol name) instead of the module procedure:
module mod
implicit none
contains
integer function sub()
sub = 1
end function
end module
use mod, only :
implicit none
integer :: sub
print *, sub()
end
Or we could forget to use the module at all. Equally, we often see this when mistakenly referring to external procedures instead of sibling module procedures.
Using implicit none (external) can help us when we forget to use a module but this won't capture the case here where we explicitly declare the function to be an external one. We have to be careful, but if we see a link error like
undefined reference to `sub_'
then we should think we've referred to an external procedure sub instead of a module procedure: there's the absence of any name mangling for "module namespaces". That's a strong hint where we should be looking.
Mis-specified binding label
If we are interoperating with C then we can specify the link names of symbols incorrectly quite easily. It's so easy when not using the standard interoperability facility that I won't bother pointing this out. If you see link errors relating to what should be C functions, check carefully.
If using the standard facility there are still ways to trip up. Case sensitivity is one way: link symbol names are case sensitive, but your Fortran compiler has to be told the case if it's not all lower:
interface
function F() bind(c)
use, intrinsic :: iso_c_binding, only : c_int
integer(c_int) :: f
end function f
end interface
print *, F()
end
tells the Fortran compiler to ask the linker about a symbol f, even though we've called it F here. If the symbol really is called F, we need to say that explicitly:
interface
function F() bind(c, name='F')
use, intrinsic :: iso_c_binding, only : c_int
integer(c_int) :: f
end function f
end interface
print *, F()
end
If you see link errors which differ by case, check your binding labels.
The same holds for data objects with binding labels, and also make sure that any data object with linkage association has matching name in any C definition and link object.
Equally, forgetting to specify C interoperability with bind(c) means the linker may look for a mangled name with a trailing underscore or two (depending on compiler and its options). If you're trying to link against a C function cfunc but the linker complains about cfunc_, check you've said bind(c).
Not providing a main program
A compiler will often assume, unless told otherwise, that it's compiling a main program in order to generate (with the linker) an executable. If we aren't compiling a main program that's not what we want. That is, if we're compiling a module or external subprogram, for later use:
module mod
implicit none
contains
integer function f()
f = 1
end function f
end module
subroutine s()
end subroutine s
we may get a message like
undefined reference to `main'
This means that we need to tell the compiler that we aren't providing a Fortran main program. This will often be with the -c flag, but there will be a different option if trying to build a library object. The compiler documentation will give the appropriate options in this case.
There are many possible ways you can see an error like this. You may see it when trying to build your program (link error) or when running it (load error). Unfortunately, there's rarely a simple way to see which cause of your error you have.
This answer provides a summary of and links to the other answers to help you navigate. You may need to read all answers to solve your problem.
The most common cause of getting a link error like this is that you haven't correctly specified external dependencies or do not put all parts of your code together correctly.
When trying to run your program you may have a missing or incompatible runtime library.
If building fails and you have specified external dependencies, you may have a programming error which means that the compiler is looking for the wrong thing.
Not linking the library (properly)
The most common reason for the undefined reference/unresolved external symbol error is the failure to link the library that provides the symbol (most often a function or subroutine).
For example, when a subroutine from the BLAS library, like DGEMM is used, the library that provides this subroutine must be used in the linking step.
In the most simple use cases, the linking is combined with compilation:
gfortran my_source.f90 -lblas
The -lblas tells the linker (here invoked by the compiler) to link the libblas library. It can be a dynamic library (.so, .dll) or a static library (.a, .lib).
In many cases, it will be necessary to provide the library object defining the subroutine after the object requesting it. So, the linking above may succeed where switching the command line options (gfortran -lblas my_source.f90) may fail.
Note that the name of the library can be different as there are multiple implementations of BLAS (MKL, OpenBLAS, GotoBLAS,...).
But it will always be shortened from lib... to l... as in liopenblas.so and -lopenblas.
If the library is in a location where the linker does not see it, you can use the -L flag to explicitly add the directory for the linker to consider, e.g.:
gfortran -L/usr/local/lib -lopenblas
You can also try to add the path into some environment variable the linker searches, such as LIBRARY_PATH, e.g.:
export LIBRARY_PATH=$LIBRARY_PATH:/usr/local/lib
When linking and compilation are separated, the library is linked in the linking step:
gfortran -c my_source.f90 -o my_source.o
gfortran my_source.o -lblas
Not providing the module object file when linking
We have a module in a separate file module.f90 and the main program program.f90.
If we do
gfortran -c module.f90
gfortran program.f90 -o program
we receive an undefined reference error for the procedures contained in the module.
If we want to keep separate compilation steps, we need to link the compiled module object file
gfortran -c module.f90
gfortran module.o program.f90 -o program
or, when separating the linking step completely
gfortran -c module.f90
gfortran -c program.f90
gfortran module.o program.o -o program
Problems with the compiler's own libraries
Most Fortran compilers need to link your code against their own libraries. This should happen automatically without you needing to intervene, but this can fail for a number of reasons.
If you are compiling with gfortran, this problem will manifest as undefined references to symbols in libgfortran, which are all named _gfortran_.... These error messages will look like
undefined reference to '_gfortran_...'
The solution to this problem depends on its cause:
The compiler library is not installed
The compiler library should have been installed automatically when you installed the compiler. If the compiler did not install correctly, this may not have happened.
This can be solved by correctly installing the library, by correctly installing the compiler. It may be worth uninstalling the incorrectly installed compiler to avoid conflicts.
N.B. proceed with caution when uninstalling a compiler: if you uninstall the system compiler it may uninstall other necessary programs, and may render other programs unusable.
The compiler cannot find the compiler library
If the compiler library is installed in a non-standard location, the compiler may be unable to find it. You can tell the compiler where the library is using LD_LIBRARY_PATH, e.g. as
export LD_LIBRARY_PATH="/path/to/library:$LD_LIBRARY_PATH"
If you can't find the compiler library yourself, you may need to install a new copy.
The compiler and the compiler library are incompatible
If you have multiple versions of the compiler installed, you probably also have multiple versions of the compiler library installed. These may not be compatible, and the compiler might find the wrong library version.
This can be solved by pointing the compiler to the correct library version, e.g. by using LD_LIBRARY_PATH as above.
The Fortran compiler is not used for linking
If you are linking invoking the linker directly, or indirectly through a C (or other) compiler, then you may need to tell this compiler/linker to include the Fortran compiler's runtime library. For example, if using GCC's C frontend:
gcc -o program fortran_object.o c_object.o -lgfortran

GCC how to stop false positive warning implicit-function-declaration for functions in ROM?

I want to get rid of all implicit-function-declaration warnings in my codebase. But there is a problem because some functions are
programmed into the microcontroller ROM at the factory and during linking a linker script provides only the function address. These functions are called by code in the SDK.
During compilation gcc of course emits the warning implicit-function-declaration. How can I get rid of this warning?
To be clear I understand why the warning is there and what does it mean. But in this particular case the developers of SDK guarantee that the code will work with implicit rules (i.e. implicit function takes only ints and returns an int). So this warning is a false positive.
This is gnu-C-99 only, no c++.
Ideas:
Guess the argument types, write a prototype in a header and include that?
Tell gcc to treat such functions as false positive with some gcc attribute?
You can either create a prototype function in a header, or suppress the warnings with the following:
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wimplicit-function-declaration"
/* line where GCC complains about implicit function declaration */
#pragma GCC diagnostic pop
Write a small program that generates a header file romfunctions.h from the linker script, with a line like this
int rom_function();
for each symbol defined by the ROM. Run this program from your Makefiles. Change all of the files that use these functions to include romfunctions.h. This way, if the linker script changes, you don't have to update the header file by hand.
Because most of my programming expertise was acquired by self-study, I intentionally have become somewhat anal about resolving non-fatal warnings, specifically to avoid picking up bad coding habits. But, this has revealed to me that such bad coding habits are quite common, even from formally trained programmers. In particular, for someone like me who is also anal about NOT using MS Windows, my self-study of so-called platform-independent code such as OpenGL and Vulkan has revealed a WORLD of bad coding habits, particularly as I examine code written assuming the student was using Visual Studio and a Windows C/C++ compiler.
Recently, I encountered NUMEROUS non-fatal warnings as I designed an Ubuntu Qt Console implementation of an online example of how to use SPIR-V shaders with OpenGL. I finally threw in the towel and added the following lines to my qmake .PRO file to get rid of the non-fatal-warnings (after, first, studying each one and convincing myself it could be safely ignored) :
QMAKE_CFLAGS += -Wno-implicit-function-declaration
-Wno-address-of-packed-member
[Completely written due to commends]
You are compiling the vendor SDK with your own code. This is not typically what you want to do.
What you do is you build their SDK files with gcc -c -Wno-implicit-function-declaration and and your own files with gcc -c or possibly gcc -o output all-your-c-files all-their-o-files.
C does not require that declarations be prototypes, so you can get rid of the problem (which should be a hard error, not a warning, since implicit declarations are not valid C) by using a non-prototype declaration, which requires only knowing the return type. For example:
int foo();
Since "implicit declarations" were historically treated as returning int, you can simply use int for all of them.
If you are using C program, use
#include <stdio.h>

C compiles with an undefined symbol

I am using an older version of the Diab C compiler.
In my code I have taken a function name and redefined it as a function pointer with the same signature. Before making this change the code worked. After the change it made it caused the embedded system to lock up.
The function pointer was declared extern in a header, defined in one .c file, and used in another .c file. When it was called from the second .c file it would cause the system to lock up. When I attempted to add debug information using sprintf it finally told me that it was an undefined symbol. I realized that the header file was was not included in the second .c file. When I #included it everything compiled and worked correctly.
My question is, is there some C rule that allowed the compiler to deduce the function signature even though the symbol was undefined at the call location? To my understanding there should have been an error long before I made any changes.
If no declaration is available, the compiler uses a default declaration of a function taking an unknown number of arguments and returning an int. If you turn up compiler warnings (eg -Wall -Wextra -Werror with gcc, check the documentation for your compiler), you should get a compile time warning.
Most likely, the code at first worked because it was compiled in the C89 or similar mode. The C standard from 1989 allows calling functions without first declaring them.
When you changed the code to use a pointer but didn't include the declaration of the pointer, the compiler assumed that your pointer was in fact a function and generated code to call into the pointer, as if the pointer had executable code inside. As the result, the program understandably stopped working.
What you should do is enable all possible warnings (for gcc: -Wall, -Wextra and make sure optimization is enabled (-O2 is good) because it enables code analysis), especially for calling functions without prototypes. A better thing might be to switch the compiler into the C99 mode (-std=c99 in gcc) or switch to a C99 compiler. The C standard from 1999 prohibits calling functions without prototypes and comes with some useful features absent in C89.

Generating link-time error for deprecated functions

Is there a way with gcc and GNU binutils to mark some functions such that they will generate an error at link-time if used? My situation is that I have some library functions which I am not removing for the sake of compatibility with existing binaries, but I want to ensure that no newly-compiled binary tries to make use of the functions. I can't just use compile-time gcc attributes because the offending code is ignoring my headers and detecting the presence of the functions with a configure script and prototyping them itself. My goal is to generate a link-time error for the bad configure scripts so that they stop detecting the existence of the functions.
Edit: An idea.. would using assembly to specify the wrong .type for the entry points be compatible with the dynamic linker but generate link errors when trying to link new programs?
FreeBSD 9.x does something very close to what you want with the ttyslot() function. This function is meaningless with utmpx. The trick is that there are only non-default versions of this symbol. Therefore, ld will not find it, but rtld will find the versioned definition when an old binary is run. I don't know what happens if an old binary has an unversioned reference, but it is probably sensible if there is only one definition.
For example,
__asm__(".symver hidden_badfunc, badfunc#MYLIB_1.0");
Normally, there would also be a default version, like
__asm__(".symver new_badfunc, badfunc##MYLIB_1.1");
or via a Solaris-compatible version script, but the trick is not to add one.
Typically, the asm directive is wrapped into a macro.
The trick depends on the GNU extensions to define symbol versions with the .symver assembler directive, so it will probably only work on Linux and FreeBSD. The Solaris-compatible version scripts can only express one definition per symbol.
More information: .symver directive in info gas, Ulrich Drepper's "How to write shared libraries", the commit that deprecated ttyslot() at http://gitorious.org/freebsd/freebsd/commit/3f59ed0d571ac62355fc2bde3edbfe9a4e722845
One idea could be to generate a stub library that has these symbols but with unexpected properties.
perhaps create objects that have the name of the functions, so the linker in the configuration phase might complain that the symbols are not compatible
create functions that have a dependency "dont_use_symbol_XXX" that is never resolved
or fake a .a file with a global index that would have your functions but where the .o members in the archive have the wrong format
The best way to generate a link-time error for deprecated functions that you do not want people to use is to make sure the deprecated functions are not present in the libraries - which makes them one stage beyond 'deprecated'.
Maybe you can provide an auxilliary library with the deprecated function in it; the reprobates who won't pay attention can link with the auxilliary library, but people in the mainstream won't use the auxilliary library and therefore won't use the functions. However, it is still taking it beyond the 'deprecated' stage.
Getting a link-time warning is tricky. Clearly, GCC does that for some function (mktemp() et al), and Apple has GCC warn if you run a program that uses gets(). I don't know what they do to make that happen.
In the light of the comments, I think you need to head the problem off at compile time, rather than waiting until link time or run time.
The GCC attributes include (from the GCC 4.4.1 manual):
error ("message")
If this attribute is used on a function declaration and a call to such a function is
not eliminated through dead code elimination or other optimizations, an error
which will include message will be diagnosed. This is useful for compile time
checking, especially together with __builtin_constant_p and inline functions
where checking the inline function arguments is not possible through extern
char [(condition) ? 1 : -1]; tricks. While it is possible to leave the function
undefined and thus invoke a link failure, when using this attribute the problem
will be diagnosed earlier and with exact location of the call even in presence of
inline functions or when not emitting debugging information.
warning ("message")
If this attribute is used on a function declaration and a call to such a function is
not eliminated through dead code elimination or other optimizations, a warning
which will include message will be diagnosed. This is useful for compile time
checking, especially together with __builtin_constant_p and inline functions.
While it is possible to define the function with a message in .gnu.warning*
section, when using this attribute the problem will be diagnosed earlier and
with exact location of the call even in presence of inline functions or when not
emitting debugging information.
If the configuration programs ignore the errors, they're simply broken. This means that new code could not be compiled using the functions, but the existing code can continue to use the deprecated functions in the libraries (up until it needs to be recompiled).

Resources