Can all compile-time optimizations be done with link-time optimization?

Or are there some optimizations that can only be done at compile time (and therefore only work within compilation units)? I ask because in C, the compilation unit is a source file, and I'm trying to understand if there is any reason not to split source code into separate files in some circumstances (e.g. an optimization that could have been done if all the source were in one file was not done).

A typical (simplified) compile might look like
1) Pre-process
2) Parse code to internal representation
3) Optimize code
4) Emit assembly language
5) Assemble to .o file
6) Link .o file to a.out
LTO is typically achieved by dumping the internal compiler representation to disk between steps 2 and 3, then, during the final link (step 6), going back and performing steps 3-5. This can vary with the compiler and version, but if it follows this pattern, then LTO is equivalent to compile-time optimization.
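For example, with GCC this is requested with the -flto flag: the intermediate representation is written into the object files, and optimization and code generation are repeated across all of them at the final link (a sketch; exact behaviour depends on the compiler and version):
gcc -flto -O2 -c a.c
gcc -flto -O2 -c b.c
gcc -flto -O2 a.o b.o -o a.out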
However ...
Having very large source files can be annoying -- Emacs starts to choke on source files >10MB.
If you are in a multi-user development environment, depending on your SCM you may have a lot of trouble if multiple engineers are working on the same file.
If you use a distributed build system, you perform compiles in parallel. So if it takes 1 second each to compile and optimize a file, and you have 1000 files and 1000 build agents, your total compile time is 1 second. But if you are doing all the optimization for all 1000 files during the final link, you will have 999 agents sitting idle and 1 agent spending an eternity doing all your optimization.

Academic example:
#define MAX 100                 /* assumed bound, just for the example */

void doSomething(void);         /* defined elsewhere */
void fun(int i);

int main(void)
{
    int i;
    for (i = 0; i < MAX; i++) {
        fun(i);
    }
    return 0;
}

void fun(int i)
{
    if (i == 0) {
        doSomething();
    }
}
If fun is in the same compilation unit and data-flow analysis is enabled, the for loop could be optimized down to a single function call.
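Conceptually (a hedged sketch, not the guaranteed output of any particular compiler, and assuming MAX > 0), after inlining fun the whole loop could collapse to something like:
int main(void)
{
    doSomething();   /* fun(i) only does anything when i == 0 */
    return 0;
}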
BUT: I would stay with MooseBoys' comment.

Related

Can I edit lines of code using gdb, and is it also possible to save to the actual source file and header file while in the same debug session?

I have this program called parser which I compiled with the -g flag; this is my makefile:
parser: header.h parser.c
gcc -g header.h parser.c -o parser
clean:
rm -f parser a.out
code for one function in parser.c is
int _find(char *html , struct html_tag **obj)
{
    char temp[strlen("<end")+1];
    memcpy(temp,"<end",strlen("<end")+1);
    ...
    ...
    .
    return 0;
}
What I would like: when I debug the parser, can I also have the capability to change lines of code after hitting a breakpoint, while stepping (n) through the code of the above function? If that is not the job of gdb, is there any open-source solution for actually changing the code, and possibly saving it, so that the changed statement (possibly a different index of an array) will execute when I run the next statement with n? Is there any open-source tool, or can it be done in gdb, and do I need some compiling options?
I know I can assign values to variables at runtime in gdb, but is this it? Is there anything like actually being able to change the source as well?
Most C implementations are compiled. The source code is analyzed and translated to processor instructions. This translation would be difficult to do on a piecewise basis. That is, given some small change in the source code, it would be practically impossible to update the executable file to represent those changes. As part of the translation, the compiler transforms and intertwines statements, assigns processor registers to be used for computing parts of expressions, designates places in memory to hold data, and more. When source code is changed slightly, this may result in a new compilation happening to use a different register in one place or needing more or less memory in a particular function, which results in data moving back or forth. Merging these changes into the running program would require figuring out all the differences, moving things in memory, rearranging what is in what processor register, and so on. For practical purposes, these changes are impossible.
GDB does not support this.
(Apple's developer tools may have some feature like this. I saw it demonstrated for the Swift programming language but have not used it.)

Optimization: Faster compilation

Separating a program into header and source files might well result in faster compilation if the build is given to a smart compilation manager, which is what I am working on.
In theory this should work (sketched below): create a thread for each source file and compile each source file into an object file at once, then link those object files together.
The link still has to wait for the slowest source file to finish compiling.
That shouldn't be a problem, as a simple n != nSources counter can be implemented that increments for each .o generated.
I don't think GCC does that by default. When it is given multiple files, it appears to parse and assemble them one by one.
Is this a valid approach and how could I optimize compilation time even further?
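A minimal sketch of the threaded approach described above, assuming POSIX threads and shelling out to gcc (the file names and the compiler command are placeholders):
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Compile one source file into an object file by invoking the compiler.
   (Build this manager itself with: gcc manager.c -o manager -pthread) */
static void *compile_one(void *arg)
{
    const char *src = arg;
    char cmd[256];
    snprintf(cmd, sizeof cmd, "gcc -c %s -o %.*s.o",
             src, (int)(strlen(src) - 2), src);    /* strip the ".c" suffix */
    if (system(cmd) != 0)
        fprintf(stderr, "failed to compile %s\n", src);
    return NULL;
}

int main(void)
{
    const char *sources[] = { "main.c", "parser.c", "util.c" };  /* placeholder names */
    enum { N = sizeof sources / sizeof sources[0] };
    pthread_t tid[N];

    for (int i = 0; i < N; i++)        /* one thread per source file */
        pthread_create(&tid[i], NULL, compile_one, (void *)sources[i]);
    for (int i = 0; i < N; i++)        /* the link has to wait for the slowest compile */
        pthread_join(tid[i], NULL);

    return system("gcc main.o parser.o util.o -o app");   /* final link */
}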
All modern (as in post-2000-ish) makes offer this feature. Both GNU make and the various flavours of BSD make will compile source files in parallel jobs with the -j flag. It just requires that you have a makefile, of course. Ninja also does this by default. It vastly speeds up compilation.
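As a minimal illustration (GNU make syntax; the file names are hypothetical, and recipe lines must start with a tab), a makefile with one rule per object file is all make -j needs:
OBJS = main.o parser.o util.o

app: $(OBJS)
	gcc $(OBJS) -o app

%.o: %.c
	gcc -c $< -o $@
Running make -j4 then compiles the source files concurrently, up to four jobs at a time, and performs the link once all the objects exist.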

C - Header Files versus Functions

What are the pros and cons of shoving everything in one file:
void function(void) {
code...
}
Versus creating a completely new file for functions:
#include <stdio.h>
#include "header.h"
Is one or the other faster? More lightweight? I am in a situation where speed is necessary and portability is a must.
Might I add this is all based on C.
If you care about speed, you should first write a correct program, care about efficient algorithms (read Introduction to Algorithms), benchmark & profile it (perhaps using gprof and/or oprofile), and focus your efforts mostly on the few percent of the source code that are critical to performance.
You'd better define these small critical functions as static inline functions in commonly included header files. The compiler is then able to inline every call to them if it wants to (it needs access to the definition of a function in order to inline it).
In general, small inlined functions often run faster, because there is no call overhead in the compiled machine code; sometimes they might run slightly slower, because inlining increases machine code size, which is detrimental to CPU cache efficiency (read about locality of reference). Also, a header file with many static inline functions takes more time to compile.
As a concrete example, my Linux system has a header /usr/include/glib-2.0/glib/gstring.h (from GLib, the library underlying GTK) containing:
/* -- optimize g_string_append_c --- */
#ifdef G_CAN_INLINE
static inline GString*
g_string_append_c_inline (GString *gstring,
                          gchar    c)
{
  if (gstring->len + 1 < gstring->allocated_len)
    {
      gstring->str[gstring->len++] = c;
      gstring->str[gstring->len] = 0;
    }
  else
    g_string_insert_c (gstring, -1, c);
  return gstring;
}
#define g_string_append_c(gstr,c)   g_string_append_c_inline (gstr, c)
#endif /* G_CAN_INLINE */
The G_CAN_INLINE preprocessor flag would have been enabled by some previously included header file.
It is a good example of an inline function: it is short (a dozen lines), and its own code runs quickly (excluding the time of the call to g_string_insert_c), so it is worth defining as static inline.
It is not worth defining as inline a short function whose own code already takes a significant time to run. There is no point inlining a matrix multiplication, for example (the call overhead is insignificant compared to the time of a 100x100 or even an 8x8 matrix multiplication). So choose carefully the functions you want to inline.
You should trust the compiler, and enable its optimizations (in particular when benchmarking or profiling). For GCC, that would mean compiling with gcc -O3 -mcpu=native (and I also recommend -Wall -Wextra to get useful warnings). You might use link-time optimization by compiling and linking with gcc -flto -O3 -mcpu=native.
You need to be clear about the concepts of header files, translation units and separate compilation.
The #include directive does nothing more than insert the content of the included file at the point of inclusion as if it were all one file, so in that sense placing content into a header file has no semantic or performance difference than "shoving everything in one file".
The point is that this is not how header files should be used or what they are intended for; you will quickly run into linker errors and/or code bloat on anything other than the most trivial programs. A header file should generally contain only declarative code, not definitive code. Take a look inside the standard headers for example - you will find no function definitions, only declarations (there may be some interfaces defined as macros or, possibly since C99, inline functions, but that is a different issue).
What header files provide is a means to support separate compilation and linking of code in separate translation units. A translation unit is a source file (.c in this case) with all its #include'd and #define'd content expanded by the pre-processor before actual compilation.
When the compiler builds a translation unit, there will be unresolved links to external code declared in headers. These declarations are a promise to the compiler that there is an interface of the form declared that is defined elsewhere and will be resolved by the linker.
The conventional form (although there are few restrictions to stop you from doing unconventional or foolish things) of a multiple-module C program source is as follows:
main.c
#include "foobar.h"

int main( void )
{
    int x = foo() ;
    bar( x ) ;
    return 0 ;
}
foobar.h
#if !defined foobar_INCLUDE
#define foobar_INCLUDE
int foo( void ) ;
void bar( int x ) ;
#endif
Note the use of the pre-processor here to prevent multiple declarations when a file is included more than once which can happen in complex code bases with nested includes for example. All your headers should have such "include guards" - some compilers support #pragma once to do the same thing, but it is less portable.
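For comparison (supported by most current compilers, though not required by the C standard), the same header using #pragma once instead of include guards would be just:
#pragma once

int foo( void ) ;
void bar( int x ) ;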
foobar.c
#include "foobar.h"

int foo( void )
{
    int x = 0 ;
    // do something
    return x ;
}

void bar( int x )
{
    // do something
}
Then main.c and foobar.c (and any other modules) are separately compiled and then linked, the linker also resolves references to library interfaces provided by the standard library or any other external libraries. A library in this sense is simply a collection of previously separately compiled object code.
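With gcc, for example, that build might look like this (each module compiled separately, then one link step):
gcc -c main.c -o main.o
gcc -c foobar.c -o foobar.o
gcc main.o foobar.o -o program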
Now that this is perhaps clear, to answer your question, re-presented as the pros and cons of separate compilation and linking, the benefits are:
Code reuse - you build your own libraries of useful routines that can be reused in many projects without error-prone copy & pasting.
Build time reduction - on a non-trivial application the separate compilation and linking would be managed by a build manager such as make or an IDE such as Eclipse or Visual Studio; these tools perform incremental builds, compiling only those modules for which the source or one of its header dependencies has been modified. This means you are not compiling all the code all the time, so turn-around during debugging and testing is much faster.
Development team scalability - if all your code is in one file, it becomes almost impractical to have multiple developers working on the same project at once. If you want to work with others either on open-source projects or as a career (the two are not necessarily mutually exclusive of course), you really cannot consider the all-in-one approach. Not least because your fellow developers will not take you seriously if that is your practice.
Specifically, separate compilation and linking has zero impact on performance or code size under normal circumstances. There is possibly an impact on the ability of the compiler to optimise in some cases when it cannot see all of the code at one time, but if your code is carefully partitioned according to the principles of high cohesion and minimal coupling this potential loss of opportunity is probably insignificant. Moreover, modern linkers are able to perform some cross-module optimisations such as unused code removal in any case.
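For instance, with GCC and the GNU linker, unused-code removal across separately compiled modules can be requested explicitly (a sketch; other toolchains have equivalent options):
gcc -ffunction-sections -fdata-sections -c main.c foobar.c
gcc -Wl,--gc-sections main.o foobar.o -o program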
It's not a question of which one is "faster". Header files are typically created when you have a function or functions that you want to use in a lot of other places or in other projects. For example, if you've written a function to calculate the factorial of a number and you want to use that function in other programs (or you find you'd have to replicate the same code in other programs as well), then instead of writing the function in each of those programs it is more convenient to put it in a header file. Generally, a header file contains functions that are relevant to a certain subject (math.h contains functions for mathematical calculations, not for string processing).

Strange compiler speed optimization results - IAR compiler

I'm experiencing a strange issue when I try to compile two source files that contain some important computing algorithms that need to be highly optimized for speed.
Initially, I have two source files, let's call them A.c and B.c, each containing multiple functions that call each other (functions from a file may call functions from the other file). I compile both files with full speed optimizations and then when I run the main algorithm in an application, it takes 900 ms to run.
Then I notice the functions from the two files are mixed up from a logical point of view, so I move some functions from A.c to B.c; let's call the new files A2.c and B2.c. I also update the two headers A.h and B.h by moving the corresponding declarations.
Moving function definitions from one file to the other is the only modification I make!
The strange result is that after I compile the two files again with the same optimizations, the algorithm now takes 1000 ms to run.
What is going on here?
What I suspect happens: when function f calls function g, being in the same file allows the compiler to replace the actual function call with inlined code as an optimization. This is no longer possible when the definitions are not compiled at the same time.
Am I correct in my assumption?
Aside from regrouping the function definitions as they were before, is there anything I can do to obtain the same optimization as before? I researched and it seems it's not possible to compile two source files simultaneously into a single object file. Could the order of compilation matter?
As to whether your assumption is correct, the best way to tell is to examine the assembler output, such as by using gcc -S or gcc -save-temps. That will be the definitive way to see what your compiler has done.
As to compiling two C source files into a single object file, that's certainly doable. Just create a AB.c as follows:
#include "A.c"
#include "B.c"
and compile that.
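For example, with gcc (a sketch; the same idea applies with other compilers):
gcc -O3 -c AB.c -o AB.o
The resulting AB.o is then linked into the application as usual.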
Barring things that should be kept separate (such as static items which may exist in both C files), that should work (or at least work with a little modification).
However, remember the optimisation mantra: Measure, don't guess! You're giving up a fair bit of encapsulation by combining them so make sure the benefits well outweigh the costs.

Is there a difference in a binary when using multiple files C as opposed to putting it all into a single file?

I know that multiple files make code far easier to manage. But is there a performance difference between them and "jamming it all into one file", or will a modern compiler like gcc create the same binary for both? By performance difference I mean file size, compile time, and running time.
This is for C only.
Arguably, compile times improve with multiple files, as you only need to recompile files that have changed (assuming you have a decent dependency-tracking build system).
Linking would probably take longer, as there's just more to do.
Traditionally, compilers have been unable to perform optimizations across multiple source files (things like inlining functions is tricky). So the resulting executable is likely to be different, and potentially slower.
There are more opportunities for optimization when everything is in a single file. E.g. gcc, starting with -O2, will inline some functions if their body is available, even if they aren't declared inline (even more functions are eligible for inlining with -O3). So there are differences in run time, and sometimes you even have a chance to notice them. Even more so with -fwhole-program, telling GCC that you don't care about out-of-line versions of external functions except main() (GCC behaves as if all your external functions became static).
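For instance (assuming GCC; the file name is hypothetical), a program whose entire source really is in one file can be built as:
gcc -O3 -fwhole-program all_in_one.c -o app
which gives the optimizer the same freedom it would have if every function except main were static.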
Overall compile time may increase (because there is more stuff to analyze, and not all optimizer algorithms are linear) or decrease (when there's no need to parse the same headers multiple times). Binary size may increase (due to inlining, in exchange for running faster) or decrease (less likely; but sometimes, inlining simplifies caller's code to the point where code size decreases).
As for ease of development and maintenance, you can use SQLite's approach: it has multiple source files, but they are jammed into one (the "amalgamation") before compilation.
From some tests I ran, compiling and linking take longer. You will get a different binary (at least I did), although mine was within a byte of the other in size.
The all-in-one file version ran in 0.000764 ms.
The multiple-files version ran in 0.000769 ms.
Do take the benchmark with a grain of salt, as I put it together in about 5 minutes and it was a tiny program.
So, really, no difference overall.
