Inlining C code: -flto or not -flto

One of my recent programs depends heavily on inlining a few "hot" functions for performance. These hot functions are part of an external .c file which I would prefer not to change.
Unfortunately, while Visual is pretty good at this exercise, gcc and clang are not. Because the hot functions live in a different .c file, they apparently can't inline them.
This leaves me with two options:
Either include the relevant code directly into the target file. In practice, that means #include "perf.c" instead of #include "perf.h". A trivial change, but it looks ugly. It clearly works; it's just a little more complex to explain to the build chain that perf.c must be present but neither compiled nor linked.
Use -flto, for Link Time Optimisation. It looks cleaner, and is what Visual achieves by default.
The problem is that with -flto, the gcc linking stage generates multiple warnings which seem to be internal bugs (they refer to portions of code from within the standard libs, so I have little control over them). This is embarrassing when targeting a "zero warning" policy (even though the generated binary is perfectly fine).
As for clang, it just fails with -flto, due to a packaging error (error loading plugin: LLVMgold.so) which is apparently very common across multiple Linux distros.
Two questions:
Is there a way to turn off these warning messages when using -flto on gcc?
Which of the two methods described above seems the better one, given their pros and cons?
Optional: is there another solution?

According to your comment you have to support gcc 4.4. As LTO started with gcc 4.5 (with all due caution about early versions), the answer is clear: no -flto.
So, #include the code with all due caution, of course.
Update:
The file extension should not be .c, though, but e.g. .inc (.i is also a bad idea). Even better: use .h and change the functions to static inline. That still doesn't guarantee inlining, but that's the same as for any function, and it maintains the appearance of a clean header (although a longer inline function is still bad style).
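For illustration, a minimal sketch of that approach, with a hypothetical perf.h and an invented hot function (the names are not from the original code):

/* perf.h -- hypothetical example: the hot function moved out of perf.c
 * and defined as static inline so every including unit can inline it */
#ifndef PERF_H
#define PERF_H

#include <stdint.h>

/* Invented hot function: each translation unit that includes this header
 * gets its own internal copy, which the compiler is free to inline. */
static inline uint32_t perf_hash_step(uint32_t h, uint8_t byte)
{
    return (h ^ byte) * 16777619u;  /* FNV-1a style step, for illustration */
}

#endif /* PERF_H */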
Before doing all this, I'd profile properly to check whether the code really has a problem. One should concentrate on writing readable and maintainable code in the first place.

Related

unused `static inline` functions generate warnings with `clang`

When using gcc or clang, it's generally a good idea to enable a number of warnings, and a first batch of warnings is generally provided by -Wall.
This batch is pretty large, and includes the specific warning -Wunused-function.
Now, -Wunused-function is useful to detect static functions which are no longer invoked, meaning they are useless, and should therefore preferably be removed from source code.
When applying a "zero-warning" policy, it's no longer "preferable", but downright compulsory.
For performance reasons, some functions may be defined directly into header files *.h, so that they can be inlined at compile time (disregarding any kind of LTO magic). Such functions are generally declared and defined as static inline.
In the past, such functions would probably have been defined as macros, but it's considered better to make them static inline functions whenever applicable (no funny type issues).
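As a small illustration of the kind of "funny" issue the macro form invites (hypothetical helper names):

#include <stdio.h>

/* Macro version: arguments may be evaluated more than once, and there is
 * no type checking. */
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

/* static inline version: each argument evaluated exactly once, properly
 * typed, and still a good inlining candidate. */
static inline int max_int(int a, int b)
{
    return a > b ? a : b;
}

int main(void)
{
    int i = 5;
    printf("%d\n", MAX_MACRO(i++, 3)); /* i++ may be evaluated twice here */
    i = 5;
    printf("%d\n", max_int(i++, 3));   /* i++ evaluated exactly once */
    return 0;
}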
OK, so now we have a bunch of functions defined directly into header files, for performance reasons. A unit including such a header file is under no obligation to use all its declared symbols. Therefore, a static inline function defined in a header file may reasonably not be invoked.
For gcc, that's fine. gcc would flag an unused static function, but not an inline static one.
For clang though, the outcome is different: static inline functions declared in headers trigger a -Wunused-function warning whenever a unit does not invoke them. And it doesn't take a lot of flags to get there: -Wall is enough.
A work-around is to introduce a compiler-specific extension, such as __attribute__((unused)), which explicitly tells the compiler that the function defined in the header need not be invoked by every unit that includes it.
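For illustration, a hypothetical header using that work-around (names invented):

/* perf_utils.h -- hypothetical header */
#ifndef PERF_UTILS_H
#define PERF_UTILS_H

/* The attribute tells the compiler that not every including unit has to
 * call this function, so clang's -Wunused-function stays quiet. */
__attribute__((unused))
static inline int clamp_to_byte(int v)
{
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}

#endif /* PERF_UTILS_H */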
OK, but now the code which used to be clean C99 includes a compiler-specific extension, adding to the portability and maintenance burden.
The question therefore is more about the logic of such a choice: why does clang choose to trigger a warning when a static inline function defined in a header is not invoked? In which cases is that a good idea?
And what does clang propose to cover the relatively common case of inline functions defined in header files, without requiring a compiler extension?
Edit:
After further investigation, it appears the question is incorrect.
The warning is triggered in the editor (VSCode) by a clang-based linter applying a selected list of compilation flags (-Wall, etc.).
But when the source code is actually compiled with clang and with exactly the same list of flags, the "unused function" warning is not present.
Until now, the results visible in the editor had always matched the ones found at compilation time; this is the first time I've witnessed a difference.
So the problem seems related to the way the linter uses clang to produce its list of warnings. That's a much more complex and specific question.
Note the comment:
OK, sorry, this is actually different from expectation. It appears the warning is triggered in the editor using clang linter with selected compilation flags (-Wall, etc.). But when the source code is compiled with exactly the same flags, the "unused function" warning is actually not present. So far, the results visible in the editor used to be exactly the ones found at compilation time; it's the first time I witness a difference. So the problem seems related to the way the linter uses clang to produce its list of warnings. It seems to be a more complex question [than I realized].
I'm not sure you'll find any "why". I think this is a bug, possibly one that they don't care to fix. As you hint in your question, it does encourage really bad practice (annotation with compiler extensions where no annotation should be needed), and this should not be done; rather, the warning should just be turned off unless/until the bug is fixed.
If you haven't already, you should search their tracker for an existing bug report, and open one if none already exists.
Follow-up: I'm getting reports which I haven't verified that this behavior only happens for functions defined in source files directly, not from included header files. If that's true, it's nowhere near as bad, and probably something you can ignore.
'#ifdef USES_FUNCTION_XYZ'
One would have to configure the used inline functions before including the header.
Sounds like a hassle and looks clumsy.
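For instance, a hypothetical header following that scheme might look like this (names invented), which shows why it feels clumsy:

/* xyz.h -- hypothetical header: each inline function must be opted into
 * by the including unit before the #include */
#ifndef XYZ_H
#define XYZ_H

#ifdef USES_FUNCTION_XYZ
static inline int function_xyz(int x)
{
    return 2 * x;
}
#endif

#endif /* XYZ_H */

/* a consuming .c file would then have to do:
 *   #define USES_FUNCTION_XYZ
 *   #include "xyz.h"
 */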
When using gcc or clang, it's generally a good idea to enable a number of warnings,
When using any C compiler, it's a good idea to ensure that the warning level is turned up, and to pay attention to the resulting warnings. Much breakage, confusion, and wasted effort can be saved that way.
Now, -Wunused-function is useful to detect static functions which are no longer invoked, meaning they are useless, and should therefore preferably be removed from source code. When applying a "zero-warning" policy, it's no longer "preferable", but downright compulsory.
Note well that
Such zero-warning policies, though well-intended, are a crutch. I have little regard for policies that substitute inflexible rules for human judgement.
Such zero-warning policies can be subverted in a variety of ways, with disabling certain warnings being high on the list. Just how useful are they really, then?
Policy is adopted by choice, as a means to an end. Maybe not your choice personally, but someone's. If existing policy is not adequately serving the intended objective, or is interfering with other objectives, then it should be re-evaluated (though that does not necessarily imply that it will be changed).
For performance reasons, some functions may be defined directly into header files *.h, so that they can be inlined at compile time (disregarding any kind of LTO magic).
That's a choice. More often than not, one affording little advantage.
Such functions are generally declared and defined as static inline. In the past, such functions would probably have been defined as macros instead, but it's considered better to make them static inline functions instead, whenever applicable (no funny type issue).
Considered by whom? There are reasons to prefer functions over macros, but there are also reasons to prefer macros in some cases. Not all such reasons are objective.
A unit including such a header file is under no obligation to use all its declared symbols.
Correct.
Therefore, a static inline function defined in a header file may reasonably not be invoked.
Well, that's a matter of what one considers "reasonable". It's one thing to have reasons to want to do things that way, but whether those reasons outweigh those for not doing it that way is a judgement call. I wouldn't do that.
The question therefore is more about the logic of such a choice: why does clang choose to trigger a warning when a static inline function defined in a header is not invoked? In which cases is that a good idea?
If we accept that it is an intentional choice, one would presume that the Clang developers have a different opinion about how reasonable the practice you're advocating is. You should consider this a quality-of-implementation issue, there being no rules for whether compilers should emit diagnostics in such cases. If they have different ideas about what they should warn about than you do, then maybe a different compiler would be more suitable.
Moreover, it would be of little consequence if you did not also have a zero-warning policy, so multiple choices on your part are going into creating an issue for you.
And what does clang propose to cover the relatively common case of inline functions defined in header files, without requiring a compiler extension?
I doubt that clang or its developers propose any particular course of action here. You seem to be taking the position that they are doing something wrong. They are not. They are doing something that is inconvenient for you, and that therefore you (understandably) dislike. You will surely find others who agree with you. But none of that puts any onus on Clang to have a fix.
With that said, you could try defining the functions in the header as extern inline instead of static inline. You are then obligated to provide one non-inline definition of each somewhere in the whole program, too, but those can otherwise be lexically identical to the inline definitions. I speculate that this may assuage Clang.
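A related sketch of the C99 inline mechanism this answer is pointing at (hypothetical names; note that the exact placement of extern differs between the C99 and gnu89 inline models, so this assumes -std=c99 or later):

/* fastmath.h -- hypothetical header: an inline definition visible to every
 * including unit; under C99 rules it does not by itself emit an external
 * definition */
#ifndef FASTMATH_H
#define FASTMATH_H

inline int fast_abs(int v)
{
    return v < 0 ? -v : v;
}

#endif /* FASTMATH_H */

/* fastmath.c -- exactly one unit provides the external definition that
 * non-inlined calls elsewhere in the program link against */
#include "fastmath.h"

extern inline int fast_abs(int v);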

Size optimization options

I am trying to sort out an embedded project where the developers took the option of including all the .h and .c files into one .c file, so that they can compile just that one file with the -fwhole-program option to get good size optimization.
I hate this and am determined to turn it into a traditional program, using just LTO to achieve the same result.
The versions included with the dev kit are;
aps-gcc (GCC) 4.7.3 20130524 (Cortus)
GNU ld (GNU Binutils) 2.22
With one .o file, .text is 0x1c7ac; fractured into 67 .o files, .text comes out as 0x2f73c. I added the LTO options and reduced it to 0x20a44, which is good but nowhere near enough.
I have tried --gc-sections and the linker plugin option, but they made no further improvement.
Any suggestions? Am I seeing the right sort of improvement from LTO?
To get LTO to work perfectly you need to have the same information and optimisation algorithms available at link stage as you have at compile stage. The GNU tools cannot do this and I believe this was actually one of the motivating factors in the creation of LLVM/Clang.
If you want to inspect the difference in detail, I'd suggest you generate a map file (ld option -Map <filename>) for each build and see whether there are functions which haven't been inlined or functions that have come out larger. The lack of inlining can be resolved manually by moving the definition of each affected function into a header file and declaring it extern inline, which effectively turns it into a macro (this is a GNU extension).
Larger functions are likely not subject to constant propagation, and I don't think there's anything you can do about that. You can make some improvements by carefully declaring function attributes such as const, leaf, noreturn, pure, and returns_nonnull. These effectively promise that the function will behave in a particular way that the compiler might otherwise only detect when using a single compilation unit, and they allow additional optimisations.
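For reference, a few hypothetical declarations showing how those attributes are spelled (names invented; availability of each attribute depends on the gcc version):

#include <stddef.h>

/* const: the result depends only on the arguments; no global memory is read */
__attribute__((const)) int popcount16(unsigned short v);

/* pure: may read global state, but has no side effects */
__attribute__((pure)) int table_lookup(int index);

/* noreturn: never returns, so code following a call can be dropped */
__attribute__((noreturn)) void fatal_error(const char *msg);

/* leaf: will not call back into the caller's translation unit */
__attribute__((leaf)) void log_message(const char *msg);

/* returns_nonnull: the returned pointer is never NULL */
__attribute__((returns_nonnull)) void *xmalloc(size_t size);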
In contrast, Clang can compile your code to a special kind of bytecode (LLVM originally stood for Low Level Virtual Machine which, like the Java Virtual Machine, runs bytecode), and optimisation of this bytecode can be performed at link time (or indeed at run time, which is cool). Since this bytecode is what gets optimised whether you do LTO or not, and the optimisation algorithms are common between the compiler and the linker, in theory Clang/LLVM should give exactly the same results whether you use LTO or not.
Unfortunately now that the C backend has been removed from LLVM I don't know of any way to use the LLVM LTO capabilities for the custom CPU you're targeting.
In my opinion, the method chosen by the previous developers is the correct one. It is the method that gives the compiler the most information and thus the most opportunities to perform the optimizations that you want. It is a terrible way to compile (any change requires the whole project to be recompiled), so keeping it as just an option is a good idea.
Of course, you would have to run all your integration tests against such a build, but that should be trivial to do. What is the downside of the chosen approach, except for compilation time (which shouldn't be an issue, because you don't need to build in that manner all the time ... just for integration tests)?

Is commenting out a #include a safe way to see if it's unneeded?

I like to keep my files clean, so I prefer to take out includes I don't need. Lately I've been just commenting the includes out and seeing if it compiles without warnings (-Wall -Wextra -pedantic, minus a couple very specific ones). I figure if it compiles without warnings I didn't need it.
Is this actually a safe way to check if an include is needed or can it introduce UB or other problems? Are there any specific warnings I need to be sure are enabled to catch potential problems?
n.b. I'm actually using Objective C and clang, so anything specific to those is appreciated, but given the flexibility of Objective C I think if there's any trouble it will be a general C thing. Certainly any problems in C will affect Objective C.
In principle, yes.
The exception would be if two headers interact in some hidden way. Say, if you:
include two different headers which define the same symbol differently,
both definitions are syntactically valid and well-typed,
but one definition is good, the other breaks your program at run-time.
Hopefully, your header files are not structured like that. It's somewhat unlikely, though not inconceivable.
I'd be more comfortable doing this if I had good (unit) tests.
Usually just commenting out the inclusion of the header is safe, meaning: if the header is needed then there will be compiler errors when you remove it, and (usually) if the header is not needed, the code will still compile fine.
This should not be done without inspecting the header to see what it adds though, as there is the (not exactly typical) possibility that a header only provides optional #define's (or #undef's) which will alter, but not break, the way a program is compiled.
The only way to be sure is to build your code without the header (if it's able to build in the first place) and run a proper regimen of testing to ensure its behavior has not changed.
No. Apart from the reasons already mentioned in other answers, it's possible that the header is needed and another header includes it indirectly. If you remove the #include, you won't see an error but there may be errors on other platforms.
In general, no. It is easy to introduce silent changes.
Suppose header.h defines some macros like
#define WITH_FEATURE_FOO
The C file including header.h tests the macro
#ifdef WITH_FEATURE_FOO
do_this();
#else
do_that();
#endif
Your files compile cleanly and with all warnings enabled with or without the inclusion of header.h, but the result behaves differently. The only way to get a definitive answer is to analyze which identifiers a header defines/declares and see if at least one of them appears in the preprocessed C file.
One tool that does this is FlexeLint from Gimpel. I don't get paid for saying this, even though they should :-) If you want to avoid shelling out big bucks, an approach I have been taking is to compile a C file to an object file with and without the header; if both compiles succeed, check whether the object files are identical. If they are the same, you don't need the header
(but watch out for include directives wrapped in #ifdefs that are enabled by a -DWITH_FEATURE_FOO option).

Single Source Code vs Multiple Files + Libraries

How much effect does having multiple files or compiled libraries vs. throwing everything (>10,000 LOC) into one source have on the final binary? For example, instead of linking a Boost library separately, I paste its code, along with my original source, into one giant file for compilation. And along the same lines, instead of feeding several files into gcc, I paste them all together and give it only that one file.
I'm interested in the optimization differences, instead of problems (horror) that would come with maintaining a single source file of gargantuan proportions.
Granted, there can only be link-time optimization (I may be wrong), but is there a lot of difference between optimization possibilities?
If the compiler can see all the source code, it can optimize better, provided it has some kind of interprocedural optimization (IPO) option turned on. IPO differs from other compiler optimizations because it analyzes the entire program; other optimizations look at only a single function, or even a single block of code.
Here are some of the interprocedural optimizations that can be done (see here for more):
Inlining
Constant propagation
mod/ref analysis
Alias analysis
Forward substitution
Routine key-attribute propagation
Partial dead call elimination
Symbol table data promotion
Dead function elimination
Whole program analysis
GCC supports this kind of optimization.
This kind of interprocedural optimization can be used to analyze and optimize the functions being called.
If the compiler cannot see the source code of the library function, it cannot perform such optimization.
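As a tiny illustration (hypothetical files) of what "cannot see the source" means in practice:

/* scale.c -- hypothetical separate translation unit */
int scale(int x)
{
    return x * 8;
}

/* main.c -- without LTO or whole-program compilation, the compiler only
 * sees the declaration below, so it must emit a real call; it cannot
 * inline scale() or fold scale(5) down to the constant 40 */
int scale(int x);

int main(void)
{
    return scale(5);
}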
Note that some modern compilers (clang/LLVM, icc and recently even gcc) now support link-time optimization (LTO) to minimize the effect of separate compilation. Thus you gain the benefits of separate compilation (maintenance, faster compilation, etc.) and those of whole-program analysis.
By the way, it seems like gcc has supported -fwhole-program and --combine since version 4.1. You have to pass all source files together, though.
Finally, since Boost is mostly header files (templates) that are #included, you cannot gain anything from pasting these into your source code.

Any good reason to #include source (*.c *.cpp) files?

I've been working for some time with an open-source library ("Fast Artificial Neural Network"). I'm using its source in my static library. When I compile it, however, I get hundreds of linker warnings which are probably caused by the fact that the library includes its *.c files in other *.c files (I'm only including some headers I need, and I did not touch the code of the lib itself).
My question: Is there a good reason why the developers of the library used this approach, which is strongly discouraged? (Or at least I've been told all my life that this is bad, and from my own experience I believe it IS bad.) Or is it just bad design with no gain from this approach?
I'm aware of this related question but it does not answer my question. I'm looking for reasons that might justify this.
A bonus question: Is there a way to fix this without touching the library code too much? I have a lot of work of my own and don't want to create more ;)
As far as I see (grep '#include .*\.c'), they only do this in doublefann.c, fixedfann.c, and floatfann.c, and each time include the reason:
/* Easy way to allow for build of multiple binaries */
This exact use of the preprocessor for simple copy-pasting is indeed the only valid use of including implementation (*.c) files, and it is relatively rare. (If you want to include some code for another reason, just give it a different name, like *.h or *.inc.) An alternative is to specify the configuration in macros given to the compiler (e.g. -DFANN_DOUBLE, -DFANN_FIXED, or -DFANN_FLOAT), but they didn't use this method. (Each approach has drawbacks, so I'm not saying they're necessarily wrong; I'd have to look at the project in depth to determine that.)
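To make the pattern concrete, here is a hypothetical sketch of the idea (not the actual FANN sources; names invented):

/* impl.c -- shared implementation, written in terms of a calc_t type */
calc_t scale(calc_t x)
{
    return x * (calc_t)2;
}

/* doubletype.c -- one tiny wrapper per binary picks the type and then
 * textually includes the shared implementation */
#define calc_t double
#include "impl.c"

/* a float build would use its own wrapper containing
 *   #define calc_t float
 *   #include "impl.c"
 * while the -D alternative would instead compile impl.c directly with
 * -Dcalc_t=double or -Dcalc_t=float, once per binary */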
They provide makefiles and MSVS projects which should already avoid linking doublefann.o (from doublefann.c) with fann.o (from fann.c), fixedfann.o (from fixedfann.c), and so on; either their build files are broken or something similar has gone wrong on your side.
Did you try to create a project from scratch (or use your existing project) and add all the files to it? If you did, what is happening is each implementation file is being compiled independently and the resulting object files contain conflicting definitions. This is the standard way to deal with implementation files and many tools assume it. The only possible solution is to fix the project settings to not link these together. (Okay, you could drastically change their source too, but that's not really a solution.)
While you're at it, if you continue without using their project settings, you can likely skip compiling fann.c et al., and possibly just removing those files from the project is enough: then they won't be compiled and linked. You'll want to choose exactly one of double-/fixed-/floatfann to use, otherwise you'll get the same link errors. (I haven't looked at their instructions, but I would not be surprised to see this explained a bit more in-depth there.)
Including C/C++ code leads to all the code being stuck together in one translation unit. With a good compiler, this can lead to a massive speed boost (as stuff can be inlined and function calls optimized away).
If actual code is going to be included like this, though, it should have static in most of its declarations, or it will cause the warnings you're seeing.
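As a hypothetical illustration of why that static matters:

/* helper.inc -- hypothetical implementation file meant to be #included */

/* Without 'static', every unit that includes this file would export a
 * 'checksum' symbol, and the linker would then report duplicate
 * definitions (or warnings like the ones described in the question). */
static unsigned checksum(const unsigned char *p, unsigned n)
{
    unsigned sum = 0;
    while (n--)
        sum += *p++;
    return sum;
}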
If you ever declare a single global variable or function in that .c file, it cannot be included in two places which both compile to the same binary, or the two definitions will collide. If it is included in even one place, it cannot also be compiled on its own while still being linked into the same binary as its user.
If the file is only included in one place, why not just make it a discrete compilation unit (and use its globals via extern declarations)? Why bother having it included at all?
If your C files declare no global variables or functions, they are header files and should be named as such.
Therefore, by exhaustive search, I can say that the only time you would ever potentially want to include C files is if the same C code is used in building multiple different binaries. And even there, you're increasing your compile time for no real gain.
This is assuming that functions which should be inlined are marked inline and that you have a decent compiler and linker.
I don't know of a quick way to fix this.
I don't know that library, but as you describe it, it is either bad practice or your understanding of how to use it is not good enough.
A C project that wants to be included by others should always provide well structured .h files for others and then the compiled library for linking. If it wants to include function definitions in header files it should either mark them as static (old fashioned) or as inline (possible since C99).
I haven't looked at the code, but it's possible that the .c or .cpp files being included actually contain code that works in a header. For example, a template or an inline function. If that is the case, then the warnings would be spurious.
I'm doing this at the moment at home because I'm a relative newcomer to C++ on Linux and don't want to get bogged down in difficulties with the linker. But I wouldn't recommend it for proper work.
(I also once had to include a header.dat into a C++ program, because Rational Rose didn't allow headers to be part of the issued software and we needed that particular source file on the running system (for arcane reasons).)
