Is the C compiler able to optimize across object files?

I'm deciding between a header-only and a header-and-source design. I'm not sure whether the header-and-source approach allows the compiler to optimize across object files and across linkage, such as inlining optimization.

Header files and source files are typically compiled as a single translation unit (since headers are included in the source files). So, that won't be an issue (unless you have a peculiar environment where headers are compiled separately).
GCC does support optimizations across different translation units. See Link Time Optimization.
See the -flto option's documentation for details:
-flto[=n]
This option runs the standard link-time optimizer. When invoked with
source code, it generates GIMPLE (one of GCC's internal
representations) and writes it to special ELF sections in the object
file. When the object files are linked together, all the function
bodies are read from these ELF sections and instantiated as if they
had been part of the same translation unit. To use the link-time
optimizer, -flto and optimization options should be specified at
compile time and during the final link. It is recommended that you
compile all the files participating in the same link with the same
options and also specify those options at link time.
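In practice that means passing -flto (plus your optimization level) both when compiling each file and at the final link. A minimal sketch, assuming two hypothetical sources foo.c and main.c:
$ gcc -O2 -flto -c foo.c               # GIMPLE is written into foo.o alongside the object code
$ gcc -O2 -flto -c main.c
$ gcc -O2 -flto foo.o main.o -o prog   # the link step re-runs the optimizer across both units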

Related

Does an inline function result in the creation of weak symbols?

Programming language: C
At work, we have a project with a header file, say header1.h. This file contains some functions that are declared with external scope (via extern) and also defined as inline in the same header file (header1.h).
Now this file is included in several places across different C files.
Based on my past experience with GCC, my understanding is that this should produce multiple-definition errors, and that is what I expect. But at work we do not get these errors; the only difference is that we are using a different compiler driver.
My best guess is that the symbols are generated as weak symbols at compile time and the linker uses that information to choose one of them.
Can functions defined as inline result in weak symbols? Is that possible, or might there be some other reason?
Also, if inline can result in the creation of weak symbols, is there a feature to turn that on or off?
If a function is inlined, the entire function body is copied in every time the function is used (instead of the normal assembler call/return sequence).
(Modern compilers treat inline as a hint, and the actual result might just be a static function, with a unique copy in every compiled file that uses it.)
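One way to test the weak-symbol guess is to inspect the object files with nm. The header and file names below are hypothetical, and what you see depends on the compiler driver and inline dialect in use:
/* header1.h -- stand-in for the header described in the question:
   the function is declared extern and defined inline in the header itself */
#ifndef HEADER1_H
#define HEADER1_H

extern int add_one(int x);

inline int add_one(int x)
{
    return x + 1;
}

#endif
# compile two .c files that both #include "header1.h" with your usual driver
$ cc -c file1.c file2.c
$ nm file1.o file2.o | grep add_one
A 'T' flag in both objects marks a strong definition (with GCC and GNU ld that is what produces the multiple-definition error); a 'W' flag means the definition was emitted as a weak symbol, and the linker silently keeps one copy, which would explain the behavior you are seeing.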

Why doesn't the compiler inline functions written in different source files?

I've been doing some tests with Valgrind to understand how functions are translated by the compiler and have found that, sometimes, functions written in different files perform poorly compared to functions written in the same source file because they are not inlined.
Considering I have different files, each containing functions related to a particular area, and all files share a common header declaring all functions, is this expected?
Why doesn't the compiler inline them when they are written in different files, but does when they are in the same file?
If this behavior starts to cause performance issues, what is the recommended course of action: manually put all of the functions in the same file before compiling?
example:
//source 1
void foo(char *str1, char *str2)
{
    //here goes the code
}

//source 2
void foo(char *str1, char *str2);   //declared in the common header shared by all files

void *bar(int something, char *somethingElse)
{
    //bar code
    static char variableInsideBar[32];
    static char anotherVariableCreatedInsideBar[32];
    foo(variableInsideBar, anotherVariableCreatedInsideBar);
    return variableInsideBar;
}
Sample performance cost:
On different files: 29920
Both on the same file: 8704
For bigger functions it is not as pronounced, but still happens.
If you are using gcc you should try the options -combine and -fwhole-program and pass all the source files to the compiler in one invocation. Traditionally, different C files are compiled separately, but it is becoming more common to optimize across compilation units (files).
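For example (a sketch only, with hypothetical file names; note that -combine applied to C only and was removed in later GCC releases, where -flto takes its place):
$ gcc -O2 -combine -fwhole-program foo.c bar.c main.c -o prog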
The compiler proper cannot inline functions defined in different translation units simply because it cannot see the definitions of those functions, i.e. it cannot see their source code. Historically, C compilers (and the language itself) were built around the principle of independent translation. Each translation unit is compiled from source code into object code completely independently from other translation units. Only at the very last stage of translation are all these disjoint pieces of object code assembled into the final program by the so-called linker. But in a traditional compiler implementation, at that point it is already too late to inline anything.
As you probably know, the language-level support for function inlining says that in order for a function to be "inlinable" in some translation unit it has to be defined in that translation unit, i.e. the source code for its body should be visible to the compiler in that translation unit. This requirement stems directly from the aforementioned principle of independent translation.
Many modern compilers are gradually introducing features that overcome the limitations of the classic pure independent translation. They implement features like global optimizations which allow various optimizations that cross the boundaries of translation units. That potentially includes the ability to inline functions defined in other translation units. Consult your compiler documentation in order to see whether it can inline functions across translation units and how to enable this sort of optimizations.
The reason such global optimizations are usually disabled by default is that they can significantly increase the translation time.
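If you would rather not depend on such global optimizations, the portable workaround is to move the definition of a small, hot function into the shared header so that every translation unit can see its body. A minimal sketch, with hypothetical names:
/* common.h -- hypothetical shared header */
#ifndef COMMON_H
#define COMMON_H

/* Defined in the header, so every .c file that includes it can inline the call.
   'static' gives each translation unit its own private copy, which avoids
   multiple-definition errors at link time. */
static inline void foo(char *str1, char *str2)
{
    /* here goes the code */
}

#endif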
Nice that you noticed that. I think that's because, when you compile something, the compiler first turns each C file into an object file without looking at any other files. After the object files are made, no further optimisation is applied across them.
I don't think it costs much performance.

What is responsible for ensuring all symbols are known/defined?

Is it the C preprocessor, compiler, or linkage editor?
To tell you the truth, it is the programmer.
The answer you are looking for is... it depends. Sometimes it's the compiler, sometimes it's the linker, and sometimes it doesn't happen until the program is loaded.
The preprocessor:
handles directives for source file inclusion (#include), macro definitions (#define), and conditional inclusion (#if).
...
The language of preprocessor directives is agnostic to the grammar of C, so the C preprocessor can also be used independently to process other kinds of text files.
The linker:
takes one or more objects generated by a compiler and combines them into a single executable program.
...
Computer programs typically comprise several parts or modules; all
these parts/modules need not be contained within a single object file,
and in such case refer to each other by means of symbols. Typically,
an object file can contain three kinds of symbols:
defined symbols, which allow it to be called by other modules,
undefined symbols, which call the other modules where these symbols are defined, and
local symbols, used internally within the object file to facilitate relocation.
When a program comprises multiple object files, the linker combines
these files into a unified executable program, resolving the
symbols as it goes along.
In environments which allow dynamic linking, it is possible that
executable code still contains undefined symbols, plus a list of objects or libraries that will provide definitions for these.
The programmer must make sure everything is defined somewhere. The programmer is RESPONSIBLE for doing so.
Various tools will complain along the way if they notice anything missing:
The compiler will notice certain things missing, and will error out if it can realize that something's not there.
The linker will error out if it can't fix up a reference that's not in a library somewhere.
At run time there is a loader that pulls the relevant shared libraries into the process's memory space. The loader is the last thing that gets a crack at fixing up symbols before the program gets to run any code, and it will throw errors if it can't find a shared library/dll, or if the interface for the library that was used at link-time doesn't match up correctly with the available library.
None of these tools is RESPONSIBLE for making sure everything is defined. They are just the things that will notice if things are NOT defined, and will be the ones throwing the error message.
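A tiny example makes the division of labor visible (helper is a hypothetical name):
/* main.c -- the declaration satisfies the compiler; the missing definition
   is only reported by the linker */
extern void helper(void);   /* declared, but never defined in any object or library */

int main(void)
{
    helper();               /* compiles cleanly */
    return 0;
}
$ gcc -c main.c     # succeeds
$ gcc main.o        # fails: undefined reference to `helper'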
For symbols with internal linkage or no linkage: the compiler.
For symbols with external linkage: the linker, either the "traditional" one, or the runtime linker.
Note that the dynamic/runtime linker may choose to do its job lazily, resolving symbols only when they are used (e.g: when a function is called for the first time).
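With glibc's runtime linker, for example, that lazy behavior can be turned off so every symbol is resolved up front when the program starts (prog is a hypothetical binary):
$ LD_BIND_NOW=1 ./prog             # resolve everything at startup instead of at first call
# or bake the same behavior in at link time:
$ gcc main.o -Wl,-z,now -o prog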

Single Source Code vs Multiple Files + Libraries

How much effect does having multiple files or compiled libraries, versus throwing everything (>10,000 LOC) into one source file, have on the final binary? For example, instead of linking a Boost library separately, I paste its code, along with my original source, into one giant file for compilation. And along the same lines, instead of feeding several files to gcc, I paste them all together and give it only that one file.
I'm interested in the optimization differences, rather than the problems (horror) that would come with maintaining a single source file of gargantuan proportions.
Granted, there can only be link-time optimization (I may be wrong), but is there a lot of difference between optimization possibilities?
If the compiler can see all the source code, it can optimize better, provided it has some kind of interprocedural optimization (IPO) option turned on. IPO differs from other compiler optimizations because it analyzes the entire program; other optimizations look at only a single function, or even a single block of code.
Here are some of the interprocedural optimizations that can be done; see here for more:
Inlining
Constant propagation
mod/ref analysis
Alias analysis
Forward substitution
Routine key-attribute propagation
Partial dead call elimination
Symbol table data promotion
Dead function elimination
Whole program analysis
GCC supports this kind of optimization.
This kind of interprocedural optimization can analyze and optimize the functions being called.
If the compiler cannot see the source code of a library function, it cannot perform such optimization.
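As a sketch of what that buys you (hypothetical files, assuming GCC with -O2 -flto on both the compile and link steps), a constant returned from one translation unit can be propagated into a caller in another:
/* config.c */
int block_size(void) { return 4096; }

/* main.c */
int block_size(void);          /* normally declared in a shared header */

int bytes_needed(int blocks)
{
    /* With LTO the call can be inlined and folded to blocks * 4096;
       without it, the compiler must emit a real call, because it
       cannot see the body of block_size() from this file. */
    return blocks * block_size();
}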
Note that some modern compilers (clang/LLVM, icc and recently even gcc) now support link-time optimization (LTO) to minimize the effect of separate compilation. Thus you gain the benefits of separate compilation (maintenance, faster compilation, etc.) and those of whole-program analysis.
By the way, it seems gcc has supported -fwhole-program and -combine since version 4.1. You have to pass all the source files together, though.
Finally, since Boost is mostly header files (templates) that are #included, you cannot gain anything from pasting them into your source code.

How do linkers decide what parts of libraries to include?

Assume library A has a() and b(). If I link my program B with A and call a(), does b() get included in the binary? Does the compiler check whether any function in the program calls b() (perhaps a() calls b(), or another library calls b())? If so, how does the compiler get this information? If not, isn't this a big waste of final binary size if I'm linking against a big library but only using a minor feature?
Take a look at link-time optimization. This is necessarily vendor dependent. It will also depend how you build your binaries. MS compilers (2005 onwards at least) provide something called Function Level Linking -- which is another way of stripping symbols you don't need. This post explains how the same can be achieved with GCC (this is old, GCC must've moved on but the content is relevant to your question).
Also take a look at the LLVM implementation (and the examples section).
I suggest you also take a look at Linkers and Loaders by John Levine -- an excellent read.
It depends.
If the library is a shared object or DLL, then everything in the library is loaded, but at run time. The cost in extra memory is (hopefully) offset by sharing the library (really, the code pages) between all the processes in memory that use that library. This is a big win for something like libc.so, less so for myreallyobscurelibrary.so. But you probably aren't asking about shared objects, really.
Static libraries are simply a collection of individual object files, each the result of a separate compilation (or assembly), and possibly not even written in the same source language. Each object file has a number of exported symbols, and almost always a number of imported symbols.
The linker's job is to create a finished executable that has no remaining undefined imported symbols. (I'm lying, of course, if dynamic linking is allowed, but bear with me.) To do that, it starts with the modules named explicitly on the link command line (and possibly implicitly in its configuration) and assumes that any module named explicitly must be part of the finished executable. It then attempts to find definitions for all of the undefined symbols.
Usually, the named object modules expect to get symbols from some library such as libc.a.
In your example, you have a single module that calls the function a(), which will result in the linker looking for a module that exports a().
You say that the library named A (on unix, probably libA.a) offers a() and b(), but you don't specify how. You implied that a() and b() do not call each other, which I will assume.
If libA.a was built from a.o and b.o where each defines the corresponding single function, then the linker will include a.o and ignore b.o.
However, if libA.a included ab.o that defined both a() and b() then it will include ab.o in the link, satisfying the need for a(), and including the unused function b().
As others have mentioned, there are linkers that are capable of splitting individual functions out of modules, and including only those that are actually used. In many cases, that is a safe thing to do. But it is usually safest to assume that your linker does not do that unless you have specific documentation.
Something else to be aware of is that most linkers make as few passes as they can through the files and libraries that are named on the command line, and build up their symbol table as they go. As a practical matter, this means that it is good practice to always specify libraries after all of the object modules on the link command line.
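You can watch this selection happen with a small experiment (all names hypothetical, assuming the GNU toolchain):
$ gcc -c a.c b.c                  # a.c defines a(), b.c defines b()
$ ar rcs libA.a a.o b.o           # the static library is just the two object files
$ gcc main.c -L. -lA -o prog      # main() calls only a()
$ nm prog | grep ' b$'            # prints nothing: b.o was never pulled in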
It depends on the linker.
e.g. Microsoft Visual C++ has an option "Enable function level linking", so you can enable it manually.
(I assume they have a reason for not just enabling it all the time...maybe linking is slower or something)
Usually, (static) libraries are composed of objects created from source files. What linkers usually do is include an object if a function provided by that object is referenced. If your source file contains only one function, then only that function will be brought in by the linker. There are more sophisticated linkers out there, but most C-based linkers still work as outlined. There are tools available that split C source files containing multiple functions into artificially smaller source files to make static linking more fine-grained.
If you are using shared libraries, then you don't impact your compiled size by using more or less of them. However, your runtime size will include them.
This lecture at Academic Earth gives a pretty good overview; linking is covered in the later half of the talk, IIRC.
Without any optimization, yes, it'll be included. The linker, however, might be able to optimize it out by statically analyzing the code and removing unreachable code.
It depends on the linker, but in general only functions that are actually called get included in the final executable. The linker works by looking up the function name in the library and then using the code associated with the name.
There are very few books on linkers, which is strange when you think how important they are. The text for a good one can be found here.
It depends on the options passed to the linker, but typically the linker will leave out the object files in a library that are not referenced anywhere.
$ cat foo.c
int main(){}
$ gcc -static foo.c
$ size
text data bss dec hex filename
452659 1928 6880 461467 70a9b a.out
# force linking of libz.a even though it isn't used
$ gcc -static foo.c -Wl,-whole-archive -lz -Wl,-no-whole-archive
$ size
text data bss dec hex filename
517951 2180 6844 526975 80a7f a.out
It depends on the linker and how the library was built. Usually libraries are a combination of object files (import libraries are a major exception to this). Older linkers would pull things into the output file image at a granularity of the object files that were put into the library. So if function a() and function b() were both in the same object file, they would both be in the output file - even if only one of the 2 functions were actually referenced.
This is a reason why you'll often see library-oriented projects with a policy of a single C function per source file. That way each function is packaged in its own object file and linkers have no problem pulling in only what is referenced.
Note however that newer linkers (certainly newer Microsoft linkers) have the ability to pull in only parts of object files that are referenced, so there's less of a need today to enforce a one-function-per-source-file policy - though there are reasonable arguments that that should be done anyway for maintainability.
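With the GNU toolchain, the rough equivalent of function-level linking is section-based garbage collection, which gives per-function granularity even when several functions live in one object file. A sketch, assuming gcc and GNU ld with hypothetical file names:
$ gcc -O2 -ffunction-sections -fdata-sections -c a.c b.c main.c
$ gcc main.o a.o b.o -Wl,--gc-sections -o prog   # the linker discards unreferenced sections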
