Strange compiler speed optimization results - IAR compiler - c

I'm experiencing a strange issue when I try to compile two source files that contain some important computing algorithms that need to be highly optimized for speed.
Initially, I have two source files, let's call them A.c and B.c, each containing multiple functions that call each other (functions from a file may call functions from the other file). I compile both files with full speed optimizations and then when I run the main algorithm in an application, it takes 900 ms to run.
Then I notice the functions from the two files are mixed up from a logical point of view, so I move some functions from A.c to B.c; let's call the new files A2.c and B2.c. I also update the two headers A.h and B.h by moving the corresponding declarations.
Moving function definitions from one file to the other is the only modification I make!
The strange result is that after I compile the two files again with the same optimizations, the algorithm now takes 1000 ms to run.
What is going on here?
What I suspect happens: when function f calls function g, being in the same file allows the compiler to replace the actual function call with inline code as an optimization. This is no longer possible when the definitions are not compiled at the same time.
Am I correct in my assumption?
Aside from regrouping the function definitions as it was before, is there anything I can do to obtain the same optimization as before? I researched and it seems it's not possible to compile two source files simultaneously into a single object file. Could the order of compilation matter?

As to whether your assumption is correct, the best way to tell is to examine the assembler output, such as by using gcc -S or gcc -save-temps. That will be the definitive way to see what your compiler has done.
As to compiling two C source files into a single object file, that's certainly doable. Just create an AB.c as follows:
#include "A.c"
#include "B.c"
and compile that.
Barring things that should be kept separate (such as static items which may exist in both C files), that should work (or at least work with a little modification).
However, remember the optimisation mantra: Measure, don't guess! You're giving up a fair bit of encapsulation by combining them so make sure the benefits well outweigh the costs.
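To make the suspicion in the question concrete, here is a minimal sketch. The names f and g follow the question; the bodies are invented purely for illustration. When both definitions sit in one translation unit, the optimizer is free to replace the call with the body of g. If g lived in another file, the compiler would see only a declaration and would normally have to emit a real call, unless some form of link-time optimization is available. Whether your compiler actually inlines here is something only the generated assembly can confirm.

```c
/* Hypothetical stand-ins for the question's f and g. */

/* Definition visible in this translation unit: a candidate for inlining. */
static int g(int x)
{
    return x * x + 1;
}

/* With g's body in view, an optimizing compiler may expand the call in
 * place and keep the loop free of call overhead. */
int f(const int *values, int count)
{
    int sum = 0;
    for (int i = 0; i < count; i++) {
        sum += g(values[i]);
    }
    return sum;
}
```

Moving g to a different .c file changes nothing about the result of f, only about what the optimizer is allowed to see while compiling it.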

Related

How to compile a normal .h-.c object build and get the same level of optimization as with a static "unity" build, in gcc?

I have been told that "unity builds" have a greater chance to inline everything if you make all the functions static, and thus make the binary more optimized and faster.
Personally I don't like them because the classic way is much more intuitive and modular, and you don't have to keep track of headers between branching .c files and main.c, and you don't have to have a master declaration header (basically emulating the normal way).
I don't care about compilation time, but I do care about efficiency of the program. So in my mind, the question is why wouldn't a compiler be able to do all these optimizations regardless of objects and whatnot, even if it had to compile twice or several times?
So how do I do that?

Why not concatenate C source files before compilation? [duplicate]

I come from a scripting background and the preprocessor in C has always seemed ugly to me. None the less I have embraced it as I learn to write small C programs. I am only really using the preprocessor for including the standard libraries and header files I have written for my own functions.
My question is why don't C programmers just skip all the includes and simply concatenate their C source files and then compile it? If you put all of your includes in one place you would only have to define what you need once, rather than in all your source files.
Here's an example of what I'm describing. Here I have three files:
// includes.c
#include <stdio.h>
// main.c
int main() {
foo();
printf("world\n");
return 0;
}
// foo.c
void foo() {
printf("Hello ");
}
By doing something like cat *.c > to_compile.c && gcc -o myprogram to_compile.c in my Makefile I can reduce the amount of code I write.
This means that I don't have to write a header file for each function I create (because they're already in the main source file) and it also means I don't have to include the standard libraries in each file I create. This seems like a great idea to me!
However I realise that C is a very mature programming language and I'm imagining that someone else a lot smarter than me has already had this idea and decided not to use it. Why not?
Some software is built that way.
A typical example is SQLite. It is sometimes compiled as an amalgamation (done at build time from many source files).
But that approach has pros and cons.
Obviously, the compile time will increase by quite a lot. So it is practical only if you compile that stuff rarely.
Perhaps, the compiler might optimize a bit more. But with link time optimizations (e.g. if using a recent GCC, compile and link with gcc -flto -O2) you can get the same effect (of course, at the expense of increased build time).
I don't have to write a header file for each function
That is a wrong approach (of having one header file per function). For a single-person project (of less than a hundred thousand lines of code, a.k.a. KLOC = kilo line of code), it is quite reasonable -at least for small projects- to have a single common header file (which you could pre-compile if using GCC), which will contain declarations of all public functions and types, and perhaps definitions of static inline functions (those small enough and called frequently enough to profit from inlining). For example, the sash shell is organized that way (and so is the lout formatter, with 52 KLOC).
You might also have a few header files, and perhaps have some single "grouping" header which #include-s all of them (and which you could pre-compile). See for example jansson (which actually has a single public header file) and GTK (which has lots of internal headers, but most applications using it have just one #include <gtk/gtk.h> which in turn include all the internal headers). On the opposite side, POSIX has a big lot of header files, and it documents which ones should be included and in which order.
Some people prefer to have a lot of header files (and some even favor putting a single function declaration in its own header). I don't (for personal projects, or small projects on which only two or three persons would commit code), but it is a matter of taste. BTW, when a project grows a lot, it happens quite often that the set of header files (and of translation units) changes significantly. Look also into REDIS (it has 139 .h header files and 214 .c files i.e. translation units totaling 126 KLOC).
Having one or several translation units is also a matter of taste (and of convenience and habits and conventions). My preference is to have source files (that is translation units) which are not too small, typically several thousand lines each, and often have (for a small project of less than 60 KLOC) a common single header file. Don't forget to use some build automation tool like GNU make (often with a parallel build through make -j; then you'll have several compilation processes running concurrently). The advantage of having such a source file organization is that compilation is reasonably quick. BTW, in some cases a metaprogramming approach is worthwhile: some of your (internal header, or translation units) C "source" files could be generated by something else (e.g. some script in AWK, some specialized C program like bison or your own thing).
Remember that C was designed in the 1970s, for computers much smaller and slower than your favorite laptop today (typically, memory was at that time a megabyte at most, or even a few hundred kilobytes, and the computer was at least a thousand times slower than your mobile phone today).
I strongly suggest studying the source code of, and building, some existing free software projects (e.g. those on GitHub or SourceForge or your favorite Linux distribution). You'll learn that there are different approaches. Remember that in C conventions and habits matter a lot in practice, so there are different ways to organize your project in .c and .h files. Read about the C preprocessor.
It also means I don't have to include the standard libraries in each file I create
You include header files, not libraries (but you should link libraries). But you could include them in each .c file (and many projects are doing that), or you could include them in one single header and pre-compile that header, or you could have a dozen headers and include them after system headers in each compilation unit. YMMV. Notice that preprocessing time is quick on today's computers (at least, when you ask the compiler to optimize, since optimizations take more time than parsing & preprocessing).
Notice that what goes into some #include-d file is conventional (and is not defined by the C specification). Some programs have some of their code in some such file (which should then not be called a "header", just some "included file"; and which then should not have a .h suffix, but something else like .inc). Look for example into XPM files. At the other extreme, you might in principle not have any of your own header files (you still need header files from the implementation, like <stdio.h> or <dlfcn.h> from your POSIX system) and copy and paste duplicated code in your .c files -e.g. have the line int foo(void); in every .c file, but that is very bad practice and is frowned upon. However, some programs are generating C files sharing some common content.
BTW, C or C++14 do not have modules (like OCaml has). In other words, in C a module is mostly a convention.
(notice that having many thousands of very small .h and .c files of only a few dozen lines each may slow down your build time dramatically; having hundreds of files of a few hundred lines each is more reasonable, in terms of build time.)
If you begin to work on a single-person project in C, I would suggest first having one header file (and pre-compiling it) and several .c translation units. In practice, you'll change .c files much more often than .h ones. Once you have more than 10 KLOC you might refactor that into several header files. Such a refactoring is tricky to design, but easy to do (just a lot of copying and pasting chunks of code). Other people would have different suggestions and hints (and that is ok!). But don't forget to enable all warnings and debug information when compiling (so compile with gcc -Wall -g, perhaps setting CFLAGS= -Wall -g in your Makefile). Use the gdb debugger (and valgrind...). Ask for optimizations (-O2) when you benchmark an already-debugged program. Also use a version control system like Git.
On the contrary, if you are designing a larger project on which several persons would work, it could be better to have several files -even several header files- (intuitively, each file has a single person mainly responsible for it, with others making minor contributions to that file).
In a comment, you add:
I'm talking about writing my code in lots of different files but using a Makefile to concatenate them
I don't see why that would be useful (except in very weird cases). It is much better (and very usual and common practice) to compile each translation unit (e.g. each .c file) into its object file (a .o ELF file on Linux) and link them later. This is easy with make (in practice, when you'll change only one .c file e.g. to fix a bug, only that file gets compiled and the incremental build is really quick), and you can ask it to compile object files in parallel using make -j (and then your build goes really fast on your multi-core processor).
You could do that, but we like to separate C programs into separate translation units, chiefly because:
It speeds up builds. You only need to rebuild the files that have changed, and those can be linked with other compiled files to form the final program.
The C standard library consists of pre-compiled components. Would you really want to have to recompile all that?
It's easier to collaborate with other programmers if the code base is split up into different files.
Your approach of concatenating .c files is completely broken:
Even though the command cat *.c > to_compile.c will put all functions into a single file, order matters: You must have each function declared before its first use.
That is, you have dependencies between your .c files which force a certain order. If your concatenation command fails to honor this order, you won't be able to compile the result.
Also, if you have two functions that recursively use each other, there is absolutely no way around writing a forward declaration for at least one of the two. You may as well put those forward declarations into a header file where people expect to find them.
When you concatenate everything into a single file, you force a full rebuild whenever a single line in your project changes.
With the classic .c/.h split compilation approach, a change in the implementation of a function necessitates recompilation of exactly one file, while a change in a header necessitates recompilation of the files that actually include this header. This can easily speed up the rebuild after a small change by a factor of 100 or more (depending on the count of .c files).
You lose all ability to compile in parallel when you concatenate everything into a single file.
Have a big fat 12 core processor with hyper-threading enabled? Pity, your concatenated source file is compiled by a single thread. You just lost a speedup of a factor greater than 20... Ok, this is an extreme example, but I have built software with make -j16 already, and I tell you, it can make a huge difference.
Compilation times are generally not linear.
Usually compilers contain at least some algorithms that have a quadratic runtime behavior. Consequently, there is usually some threshold from which on aggregated compilation is actually slower than compilation of the independent parts.
Obviously, the precise location of this threshold depends on the compiler and the optimization flags you pass to it, but I have seen a compiler take over half an hour on a single huge source file. You don't want to have such an obstacle in your change-compile-test loop.
Make no mistake: Even though it comes with all these problems, there are people who use .c file concatenation in practice, and some C++ programmers get pretty much to the same point by moving everything into templates (so that the implementation is found in the .hpp file and there is no associated .cpp file), letting the preprocessor do the concatenation. I fail to see how they can ignore these problems, but they do.
Also note that many of these problems only become apparent with larger project sizes. If your project is less than 5000 lines of code, it's still relatively irrelevant how you compile it. But when you have more than 50000 lines of code, you definitely want a build system that supports incremental and parallel builds. Otherwise, you are wasting your working time.
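The point about mutually recursive functions can be sketched as follows. The function names are made up for illustration. No ordering of the two definitions avoids the forward declaration, so concatenating files in "the right order" cannot replace declarations entirely.

```c
#include <stdbool.h>

/* Forward declaration: is_even calls is_odd before is_odd is defined.
 * In a multi-file project this line would normally live in a header. */
static bool is_odd(unsigned n);

static bool is_even(unsigned n)
{
    return n == 0 ? true : is_odd(n - 1);
}

static bool is_odd(unsigned n)
{
    return n == 0 ? false : is_even(n - 1);
}
```

Swapping the two definitions only moves the problem: then is_even would need the forward declaration instead.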
With modularity, you can share your library without sharing the code.
For large projects, if you change a single file, you would end up compiling the complete project.
You may run out of memory more easily when you attempt to compile large projects.
You may have circular dependencies between modules; modularity helps in managing those.
There may be some gains in your approach, but for languages like C, compiling each module makes more sense.
Because splitting things up is good program design. Good program design is all about modularity, autonomous code modules, and code re-usability. As it turns out, common sense will get you very far when doing program design: Things that don't belong together shouldn't be placed together.
Placing non-related code in different translation units means that you can localize the scope of variables and functions as much as possible.
Merging things together creates tight coupling, meaning awkward dependencies between code files that really shouldn't even have to know about each other's existence. This is why a "global.h" which contains all the includes in a project is a bad thing, because it creates a tight coupling between every non-related file in your whole project.
Suppose you are writing firmware to control a car. One module in the program controls the car FM radio. Then you re-use the radio code in another project, to control the FM radio in a smart phone. And then your radio code won't compile because it can't find brakes, wheels, gears, etc. Things that don't make the slightest sense for the FM radio, let alone the smart phone, to know about.
What's even worse is that if you have tight coupling, bugs escalate throughout the whole program, instead of staying local to the module where the bug is located. This makes the bug consequences far more severe. You write a bug in your FM radio code and then suddenly the brakes of the car stop working. Even though you haven't touched the brake code with your update that contained the bug.
If a bug in one module breaks completely non-related things, it is almost certainly because of poor program design. And a certain way to achieve poor program design is to merge everything in your project together into one big blob.
Header files should define interfaces - that's a desirable convention to follow. They aren't meant to declare everything that's in a corresponding .c file, or a group of .c files. Instead, they declare all functionality in the .c file(s) that is available to their users. A well designed .h file comprises a basic document of the interface exposed by the code in the .c file even if there isn't a single comment in it. One way to approach the design of a C module is to write the header file first, and then implement it in one or more .c files.
Corollary: functions and data structures internal to the implementation of a .c file don't normally belong in the header file. You might need forward declarations, but those should be local and all variables and functions thus declared and defined should be static: if they are not a part of the interface, the linker shouldn't see them.
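A minimal sketch of that split, shown as one listing for brevity. The module name counter and all identifiers are hypothetical. The header part carries only what users of the module may call; the state and the helper are static, so the linker never sees them.

```c
/* ---- counter.h (hypothetical): the interface, nothing else ---- */
#ifndef COUNTER_H
#define COUNTER_H
void counter_add(int amount);
int  counter_total(void);
#endif

/* ---- counter.c (hypothetical): implementation details stay static ---- */
/* #include "counter.h" would go here in a real project. */

static int total;                      /* internal state, not exported */

static int clamp_non_negative(int v)   /* internal helper, not in the header */
{
    return v < 0 ? 0 : v;
}

void counter_add(int amount)
{
    total = clamp_non_negative(total + amount);
}

int counter_total(void)
{
    return total;
}
```

Another file that tries to call clamp_non_negative or touch total directly fails at link time at the latest, which is exactly the enforcement the header convention relies on.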
While you can still write your program in a modular way and build it as a single translation unit, you will miss all the mechanisms C provides to enforce that modularity. With multiple translation units you have fine control on your modules' interfaces by using e.g. extern and static keywords.
By merging your code into a single translation unit, you will miss any modularity issues you might have because the compiler won't warn you about them. In a big project this will eventually result in unintended dependencies spreading around. In the end, you will have trouble changing any module without creating global side-effects in other modules.
The main reason is compilation time. Compiling one small file when you change it may take a short amount of time. If, however, you compiled the whole project whenever you changed a single line, then you would compile - for example - 10,000 files each time, which could take a lot longer.
If you have - as in the example above - 10,000 source files and compiling one takes 10 ms, then after changing a single file the project rebuilds either in (10 ms + linking time) if you compile just this changed file, or in (10 ms * 10000 + short linking time) if you compile everything as a single concatenated blob.
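Plugging in the numbers from that estimate (the symbols T and the subscripts are introduced here only for the calculation, and the per-file cost is assumed to stay at 10 ms inside the blob):

```latex
T_{\text{incremental}} \approx 10\,\text{ms} + T_{\text{link}}, \qquad
T_{\text{blob}} \approx 10\,\text{ms} \times 10\,000 + T_{\text{link}} = 100\,\text{s} + T_{\text{link}}
```

So every one-line change costs on the order of a hundred seconds instead of a few milliseconds plus linking.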
If you put all of your includes in one place you would only have to define what you need once, rather than in all your source files.
That's the purpose of .h files, so you can define what you need once and include it everywhere. Some projects even have an everything.h header that includes every individual .h file. So, your pro can be achieved with separate .c files as well.
This means that I don't have to write a header file for each function I create [...]
You're not supposed to write one header file for every function anyway. You're supposed to have one header file for a set of related functions. So your con is not valid either.
This means that I don't have to write a header file for each function I create (because they're already in the main source file) and it also means I don't have to include the standard libraries in each file I create. This seems like a great idea to me!
The pros you noticed are actually a reason why this is sometimes done in a smaller scale.
For large programs, it's impractical. Like other good answers mentioned, this can increase build times substantially.
However, it can be used to break up a translation unit into smaller bits, which share access to functions in a way reminiscent of Java's package accessibility.
The way the above is achieved involves some discipline and help from the preprocessor.
For example, you can break your translation unit into two files:
// a.c
static void utility() {
}
static void a_func() {
utility();
}
// b.c
static void b_func() {
utility();
}
Now you add a file for your translation unit:
// ab.c
static void utility();
#include "a.c"
#include "b.c"
And your build system doesn't build either a.c or b.c, but instead builds only ab.o out of ab.c.
What does ab.c accomplish?
It includes both files to generate a single translation unit, and provides a prototype for the utility. So that the code in both a.c and b.c could see it, regardless of the order in which they are included, and without requiring the function to be extern.

Splitting code into files and O flags

When writing programs with code that can be executed in parallel in C, we definitely use the O flags to optimize the code.
gcc -Olevel [options] [source files] [object files] [-o output file]
In large projects, we usually split the code into several files. My question, for which I've found no answer, is this:
Does the program's performance drop at all, due to the fact that we split the code into files and the O flags don't have enough information to optimize any further? Is there such a possibility?
When you break code into separate files, it could potentially split it into more than one translation unit, which the compiler generally can't optimize across.
Take for example a constant defined in one translation unit but referenced in a number of others. All of the calculations that reference the constant have to be performed at run-time since the constant can't be folded into them at compile time.
Link-time optimization (-flto) is one way around the limitation.
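The constant example can be sketched like this. All names are hypothetical, and the extern definition is placed in the same listing only so that the sketch links on its own; in the situation described above it would sit in a different .c file.

```c
/* ---- value visible to every file that includes the header ---- */
enum { BLOCK_SIZE = 64 };

/* ---- the cross-unit alternative ---- */
/* Other files would see only this declaration... */
extern const int block_size;
/* ...while the definition lives in exactly one .c file. It appears here
 * only to keep the sketch self-contained. */
const int block_size = 64;

/* The multiplier is a compile-time constant: easy to fold or turn into a shift. */
int bytes_from_enum(int blocks)
{
    return blocks * BLOCK_SIZE;
}

/* Compiled separately from the definition, the multiplier would have to be
 * loaded from memory at run time unless LTO makes the value visible. */
int bytes_from_extern(int blocks)
{
    return blocks * block_size;
}
```

Both functions return the same values; the difference is only in what the compiler knows while generating code for each translation unit.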
Single Unit Optimization
Just to complement @Jason's answer, I'd like to post another technique to avoid the limitation that arises when splitting files.
It's called Single Unit Optimization:
The Single Compilation Unit technique uses pre-processor directives to "glue" different translation units together at compile time rather than at link time. This reduces the overall build time, due to eliminating the duplication, but increases the incremental build time (the time required after making a change to any single source file that is included in the Single Compilation Unit), due to requiring a full rebuild of the entire unit if any single input file changes.
The whole project, even when split into files, can be optimized as if all parts of the program were visible to the compiler at once, without requiring the user to merge the files back again.
How to apply it?
Usually, the project would contain a file with a main and will include all header files of each split file:
main.c
#include "sub-program-1.h"
#include "sub-program-2.h"
...
#include "sub-program-n.h"
//rest of code
where each of those .h files correspond to its respective .c which is compiled on its own (possibly through a makefile).
In order to apply SCU, we remove the include I've mentioned above and instead create a new file (let's call it SCU.c). This would be the following.
SCU.c
#include "sub-program-1.c"
#include "sub-program-2.c"
...
#include "sub-program-n.c"
#include "main.c"
//no more code in this file
And to compile the whole project, we just compile SCU.c

Why use object files in C?

When I compile a C program, for ease I've been including the source file for a certain header at the end. So, if main.c includes util.h, util.h will have all the headers util.c will use, outlines types or structs, etc, then at the very end it includes util.c. Then, when I compile I only have to use gcc main.c -o main, and the rest is all taken care of.
I've been looking up C coding standards, trying to figure out what the best way to do things is, and there are just so many, and so many conflicting opinions I don't know what to think. Why do so many places recommend compiling object files individually instead of including all of them in a web? util never touches anything but util.c, so the two are perfectly independent, and in theory (my theory) it would be fine, but I'm probably wrong since this is computer science and people are wrong even when they're right, so if I'm already wrong I'm probably wrong.
Some people say header files should ONLY be prototypes, and the source file be the one that includes it, and its necessary system headers. From a purely aesthetic point of view I much prefer having all the info (types, system headers used, prototypes) in the header (in this case util.h) and having ONLY function code in util.c (excluding one #include "util.h" at the very top).
I guess the point I'm getting at is, with all this stuff that works, selecting a method sounds arbitrary to someone who doesn't understand the background (me). Please tell me why and what.
While your program is small, this will work. At some point, however, your program will get large enough that recompiling the whole program every time you change one line is a pain in the rear.
This -- even more than avoiding editing huge files -- is the reason to split up your program. If main.c and util.c are separately compiled into object files, changing one line in a function in main.c will no longer require you to recompile all the code in util.c.
By the time your program is made up of a few dozen files, this will be a big win.
I think the point is that you want to include only what is needed for that file to be independent. This reduces overall compilation times by allowing the compiler to read only the headers that are necessary rather than repeatedly reading every header when it might not need to. For example, if your util.c uses functions and/or types from <stdio.h> but your util.h doesn't, then you would want to include <stdio.h> only in util.c so that the compiler pulls it in only when compiling util.c. If you include <stdio.h> in your util.h instead, then every source file that includes util.h is also including <stdio.h> whether it needs it or not.
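As a rough sketch of that rule, shown as one listing: the function util_format_id is invented for illustration. The header includes only <stddef.h>, because size_t appears in the prototype, while <stdio.h> stays in the .c part where snprintf is actually used.

```c
/* ---- util.h (hypothetical): only what callers need ---- */
#ifndef UTIL_H
#define UTIL_H
#include <stddef.h>   /* size_t appears in the prototype below */
int util_format_id(char *buf, size_t size, int id);
#endif

/* ---- util.c (hypothetical) ---- */
/* #include "util.h" would go here in a real project. */
#include <stdio.h>    /* snprintf is an implementation detail of util.c */

int util_format_id(char *buf, size_t size, int id)
{
    /* Returns the number of characters the full string needs. */
    return snprintf(buf, size, "id-%04d", id);
}
```

A file that includes util.h to call util_format_id does not drag in <stdio.h> unless it asks for it itself.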
This is very negligible for small projects with only a handful of files, but proper header inclusion can affect compilation times for larger projects.
With regards to the question about "object files": when you compile a source file into an object file, you create a shortcut that allows a build system to only recompile the source files that have outdated object files. This is an effective way to significantly reduce compilation times especially for large projects.
First, including a .c file from a .h file is completely bass-ackwards.
The "standard" way of doing it follows a line of thought roughly like this:
You have a library, containing dozens of functions. Keeping everything in one big source file means that anyone using your library would have to link the whole library, even if he uses only a single function of it. (Imagine linking the whole C standard library for a puts( "Hello" ).)
So you split things across multiple source files, which are compiled individually. Whenever you make changes to one of your functions, you have to re-translate only one small source file and update the library archive (or executable) - instead of re-translating the whole thing every time. (This is still an issue, because code sizes have somewhat kept up with CPU improvements. Compiling something like the Boost lib can still take several minutes on not-too-fancy hardware...)
Now you are in a pinch, however. The function is defined inside the .c file, and the corresponding .o file can conveniently be linked (via a .a archive if need be). However, to actually address the function (provided by the .o file) properly from another source file (a.k.a. "translation unit"), your compiler needs to know the function name, its parameter list, and its return type. This is why the declaration of the function (i.e., the function head without its body) is put in a separate header (.h) file.
Other source files can now #include the header file, address the function properly (without the compiler being aware of what the function actually does), and when all parts of your library / program are compiled into .o files, then everything is linked together.
The source file includes its own header basically to make sure the two files agree on the function declaration. ;-)
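A small sketch of that agreement check, again as one listing with hypothetical names. Because the declaration from the header and the definition are seen in the same translation unit, a mismatch between them is diagnosed by the compiler instead of surfacing later as a mysterious run-time bug.

```c
/* ---- stats.h (hypothetical) ---- */
#ifndef STATS_H
#define STATS_H
#include <stddef.h>
double average(const int *values, size_t count);
#endif

/* ---- stats.c (hypothetical) ---- */
/* #include "stats.h" would go here; the prototype above plays that role. */

/* Changing this return type to int while leaving the header alone would
 * make the compiler reject the file with a conflicting-types diagnostic. */
double average(const int *values, size_t count)
{
    if (count == 0) {
        return 0.0;
    }
    long sum = 0;
    for (size_t i = 0; i < count; i++) {
        sum += values[i];
    }
    return (double)sum / (double)count;
}
```

Without the include, stats.c would still compile after such a change, and callers using the stale header would silently misinterpret the return value.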
That's about it, as far as I can be bothered to write it up right now. Putting everything into one monolithic source file is barely acceptable (actually, no, it isn't, not for anything beyond about 200 lines), but including the .c file at the end of the .h file either means you learned your C coding by looking at god-awful code instead of a good book, or whoever tutored you should never tutor another person on C coding in his life. No offense intended. ;-)
PS: Header files also provide a good summary / oversight of a piece of code. Languages that don't provide headers - Java, for example - need IDE's or documentation tools to extract this kind of information. Personally, I found header files to be a benefit, not a liability.
Please use *.h and *.c files as customary: *.h files are #included in *.c files; *.h contain only macro definitions, data type declarations, function declarations, and extern data declarations. All definitions are in *.c files. That is how everybody else organizes C programs, do your fellow humans (who some day might need to understand your program) a favor. If something in file.c is used outside, you'd write file.h containing the declarations of whatever in that file is to be used outside, and include that in file.c (to check that declarations and definitions agree) and in all using *.c files. If a bunch of *.h are always included together, it might mean that the splitup into *.c isn't right (or at least that of the *.h; perhaps you should make one .h including all those declarations, and creating *.h for internal use where needed among the group of related *.c files).
[If a program written as you outline crosses my path, I can assure you I'll avoid it like the plague. The extra obfuscation might be welcome in IOCCC, but not by me. It is a sure sign of somebody who doesn't know how to organize a program cleanly, and so the program probably isn't worth trying out.]
Re: Separate compilation: You break up a C program so the pieces are easier to understand, you can hide details of how things work in the C files (think static), this provides support for Parnas' modularity. It also means that if you change a file, you don't have to recompile everything.
Re: Differing C programming standards: Yes, there are lots of them around. Pick one you feel comfortable with, and stick to that. If you work on a project, adhere to their standards.
The "include in a single translation unit" approach becomes very inefficient for any significantly sized project, and it is impractical for projects that are distributed amongst multiple developers.
Moreover, when creating static libraries, if everything in the library were from a single translation unit, any code linked to it would get all the library code regardless of whether it is referenced or not.
A project using a build manager such as make or the features available in most IDEs uses header file dependencies to allow an incremental build; only compiling those sources that are modified or dependent on modified files. The dependencies are determined by the file inclusions, so minimising redundant dependencies speeds build time.
A typical commercial project can comprise hundreds of thousands of lines of code and a few hundred source files; full rebuild times can vary from minutes to hours. If in your development cycle you have to wait that long between code changes and test, productivity would be very low!

Single Source Code vs Multiple Files + Libraries

How much effect does having multiple files or compiled libraries vs. throwing everything (>10,000 LOC) into one source have on the final binary? For example, instead of linking a Boost library separately, I paste its code, along with my original source, into one giant file for compilation. And along the same line, instead of feeding several files into gcc, pasting them all together, and giving only that one file.
I'm interested in the optimization differences, instead of problems (horror) that would come with maintaining a single source file of gargantuan proportions.
Granted, there can only be link-time optimization (I may be wrong), but is there a lot of difference between optimization possibilities?
If the compiler can see all source code, it can optimize better if your compiler has some kind of Interprocedural Optimization (IPO) option turned on. IPO differs from other compiler optimization because it analyzes the entire program; other optimizations look at only a single function, or even a single block of code.
Here are some interprocedural optimizations that can be done:
Inlining
Constant propagation
mod/ref analysis
Alias analysis
Forward substitution
Routine key-attribute propagation
Partial dead call elimination
Symbol table data promotion
Dead function elimination
Whole program analysis
GCC supports this kind of optimization.
This interprocedural optimization can be used to analyze and optimize the function being called.
If the compiler cannot see the source code of the library function, it cannot do such optimization.
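As an illustration of constant propagation across a call (a sketch with invented names, not taken from any particular compiler's documentation): when the callee's body is visible, the constant arguments at the call site let the optimizer specialize it, here by dropping the saturation branch and reducing the multiplication to a fixed factor.

```c
/* Hypothetical example: the callee is visible to the optimizer. */
static int scale(int value, int factor, int saturate)
{
    int result = value * factor;
    if (saturate && result > 255) {
        result = 255;
    }
    return result;
}

/* Both extra arguments are constants at this call site, so after inlining
 * the saturate test is known to be false and can be removed entirely. */
int scale_by_four(int value)
{
    return scale(value, 4, 0);
}
```

If scale were only declared here and defined in a separately compiled library, the call would have to pass all three arguments at run time and execute the branch.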
Note that some modern compilers (clang/LLVM, icc and recently even gcc) now support link-time optimization (LTO) to minimize the effect of separate compilation. Thus you gain the benefits of separate compilation (maintenance, faster compilation, etc.) and those of whole program analysis.
By the way, it seems like gcc has supported -fwhole-program and --combine since version 4.1. You have to pass all source files together, though.
Finally, since BOOST is mostly header files (templates) that are #included, you cannot gain anything from adding these to your source code.
