Why split C files into multiple ones?

First of all, I know how to split a C file into multiple ones.
My question is, is there any advantage other than readability? What is the main reason to split C files?

If you split a large C file into several translation units, you practically have to declare the functions involved (e.g. as extern) in some common header included by all of them.
The advantage of splitting is that incremental build time could become slightly smaller. If you change only one statement in one function you just need to compile the file defining that function (and if that file is small, its compilation is fast). However, the compiler needs to parse all the headers. And you may want to define static inline functions in your header files (and that makes them bigger).
The disadvantage of splitting is that the developer might have a harder time finding a given function (but tools like ctags help) and that the compiler can probably optimize less (e.g. it won't inline some function calls unless you enable link-time optimization).
So it is a matter of taste and habit. Personally I like having C or C++ files of more than a thousand lines (and probably less than ten thousand lines), but YMMV. And I dislike projects having one function (of less than a hundred lines) per file (you'll need lots of files in that case). But having one large source file of a hundred thousand lines is generally unreasonable.
I also find source files that keep related functions together more readable (since it is simpler to search for a function definition in one file than in many).
Notice that header files may expand to something quite big (and this is even more true in modern C++: a standard header like <vector> or <map> may expand to more than ten thousand lines of code); e.g. <stdio.h> expands to two thousand lines (on my Debian/Linux/x86-64). So having lots of small source files can slow the full build time (since the compiler really sees the preprocessed form, after expansion of #include directives).
With GCC, you may want to have a single header (including other ones) and pre-compile it.
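A minimal sketch of that setup (header names and flags are illustrative, not prescribed by the answer):

/* all.h -- a single project-wide header that pulls in everything else */
#ifndef ALL_H
#define ALL_H
#include <stdio.h>
#include <stdlib.h>
/* ... plus your own project headers here ... */
#endif

/* Precompile it once, e.g.:  gcc -O2 all.h
   This produces all.h.gch, which GCC picks up automatically whenever a
   translation unit's first include is #include "all.h" and the compile
   flags match. */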
BTW, splitting a big file into several smaller ones, or merging several small files is not a big deal. So I think it is not very important.
In some cases (embedded C program on a small microcontroller, Arduino like), binary program size is very important, and that could be a reason to have many object files (so many C source files) since you'll only link those that are really needed.

If you are going to place the functions in a library used by possibly many different programs, then placing each function in a separate file has a distinct advantage.
Suppose the library contains 500 functions, and program1 references only 2 of those, while program2 references 150 of them. Then only 2 of the functions will be loaded into the first executable, and only 150 will be loaded into the second one.
If all of the functions were contained in a single file, all 500 functions would be loaded into each of the programs, making them very large and containing code that is never used.
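A rough sketch of how that plays out with a static library (file and library names below are made up):

/* sqr.c */
int sqr(int x) { return x * x; }

/* cube.c */
int cube(int x) { return x * x * x; }

/* Build the library and link against it:
     gcc -c sqr.c cube.c
     ar rcs libtiny.a sqr.o cube.o
     gcc -o prog1 main1.c -L. -ltiny
   If main1.c references only sqr(), the linker copies only the sqr.o
   member of libtiny.a into prog1; cube.o stays out of the executable. */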

Related

Does it make sense to combine my final program into a single C file to save memory?

I understand that it's good practice to split up a C program into multiple .h and .c files, but since each .h file has an #ifndef #define #endif "include guard", don't all those defined constants take up memory in my final program? If I'm really trying to be conservative with memory usage, would combining my final program into one large C file to get rid of all the "include guards" help?
The include guards are not part of the content of the final program (of the object modules produced by the compiler or of the executable file produced by the linker). The include guards are instructions to the compiler during compilation; they do not contribute data to the final program file.
Constants defined with #define also do not appear as data in the final program.
It is possible that some objects defined in headers may result in multiple objects being defined in the final program. For example, static int x = 3; in a header could result in multiple occurrences of this x appearing in the final program. Generally, headers should avoid defining objects; they should only declare identifiers for objects that are defined in source files.
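A minimal sketch of that rule (file names are illustrative): declare the object in the header, define it exactly once in a source file.

/* counter.h */
#ifndef COUNTER_H
#define COUNTER_H
extern int counter;   /* declaration: no storage is reserved here */
#endif

/* counter.c */
#include "counter.h"
int counter = 3;      /* definition: storage is reserved exactly once */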
Preprocessor macros don't normally use any additional memory or storage, so reducing the use of them is probably not a good reason to combine multiple source files together.
However, there might be other good reasons to do so, in large projects. For example, the maintainers of SQLite3 claim (see
https://www.sqlite.org/amalgamation.html) that merging all the sources together leads to 5-10% faster execution. Presumably this is because it allows for better compiler optimization.
However, it's very difficult to maintain vast C source files, and they take an age to compile.

Problems of including too many header files in C

Does including too many header files increase the size of the source file? Does it also increase the size of the executable? Do these header files increase the compilation time?
For example, if I add these header files to my program, do they increase the size of the source file, the executable file, or both?
#include <stdio.h>
#include "header1.h"
#include "header2.h"
What are the other problems of including too many header files?
Does including too many header files increase the size of the source file?
It increases by as many characters as you type. So if 1 character is 1 byte on your system, then adding #include <stdio.h> increases the source code file size by at least 18 bytes. This shouldn't matter to you unless you are using a computer from the mid-1980s.
Does it also increase the size of executable?
No. Only used functions increase the size of the executable.
Do these header files increase the compilation time?
Generally yes, though compilers use various tricks, such as "precompiled headers", for their own libraries. Again, this isn't a problem unless you are using 1980s stuff or worse (such as Eclipse).
What are the other problems of including too many header files?
Your main concern about including headers should be to not include stuff that you don't use. Every include creates a dependency, and also means more identifiers and symbols added to the global namespace.
Does including too many header files increase the size of the source file?
Yes. Each additional character added to a source file, for example "#include <stdio.h>", increases the physical size of the source file by precisely the number of characters in that statement, i.e. strlen("#include <stdio.h>") bytes (and, depending on how the OS allocates file block sizes, it could be seen by the OS as an extra kilobyte). More importantly, though, at compile time the contents of each #included header file are effectively expanded into source code that is fed to the compiler.
Does it also increase the size of executable?
Yes/No/Possibly. It depends on what is actually used from the header file. Optimizing compilers can exclude whatever is not needed in an executable. If nothing is used, there will be no additional size added to the executable. There will, however, be additional work done at compile time, because even if there is nothing useful in the header file, the compiler does not know this until it has processed it.
Do these header files increase the compilation time?
Compared to what? If a header file contains components necessary for the build to occur, i.e. function prototypes, #defines, etc., then compile time is just normal compile time. But if you have been compiling with, say, 3 necessary header files for a while and decide that you want to add a new library (and its corresponding header file), then by all means, yes, the next compile will take a little longer than the previous ones.
What are the other problems of including too many header files?
Too many? If each and every header file is necessary, then there are not too many (with the usual caveat about maximum include nesting depth). However, if a header file is found to be unnecessary, it's code bloat; get rid of it. It adds to compile time, as each header file, regardless of whether there is anything useful in it, has to be read by the build process. Even worse, unnecessary header files add complexity and difficulty to the tasks of future maintainers.
There is a good post here discussing this in more detail.
Additionally, this is a fun page that also discusses header files.
Header files should not produce any extra code unless they are poorly designed, and you will probably encounter redefinition problems if they do and you include them more than once. By code here I mean "machine code", that is, executable code.
About the source code: the compiler ultimately sees the source code as one big source file with the #include directives replaced by the contents of the included files, so adding more header files will increase the compile time (as the apparent source code will be longer). Including unnecessary files should therefore be avoided.
Adding header files will increase the size of the intermediate source file, taking into account the inclusions. Modern compilers may not even generate this intermediate file explicitly -- it may be absorbed into the overall compilation process. This is a matter of compiler design. As a developer, you probably won't ever see the fully-expanded file unless you ask for it (e.g., gcc -E).
Adding header files will not necessarily increase the size of the compiled code -- if all the headers contain is declarations and constant definitions, they won't increase the size much, if at all. If they contain actual code -- which isn't a particularly common practice -- they might have some small effect on the executable size.
Adding header files will probably have some effect on the compilation time but, really, this isn't a question anybody should be asking. If you need the headers, you need the headers. If it slows down compilation, what's the alternative? Don't compile?
If the question is really about how to distribute code between headers and source files, so as to improve some aspect of the build process, then that's a very complicated question to answer. If the question is about what harm is done by including a bunch of headers you don't use, the answer with modern compilers is: very little, from a functional perspective. However, including a header gives the reader the impression that the source actually uses the features it declares, and that's bad for readability. You should do your future self, or your colleagues, a favour and try not to include headers that aren't used. But if you need them, you need them, and there's little point worrying about the consequences too much.
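If you are curious what the compiler actually receives, here is a tiny experiment (assuming GCC; exact line counts vary by platform and C library):

/* tiny.c */
#include <stdio.h>
int main(void) { return 0; }

/* gcc -E tiny.c | wc -l
   prints the size of the preprocessed translation unit; on a typical
   glibc system it runs to a few thousand lines even for this two-line
   program. */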
Does including too many header files increase the size of the source file?
The more characters there are in the source file, the larger the source file.
But only the #include directives themselves count, not the content of the header files: the source file doesn't get expanded with the content of the headers.
When you #include a header, the compiler is told to read it at that point, but the source file itself isn't changed.
So, yes.
Does it also increase the size of executable?
Depends on the content of the headers. If they contain definitions then yes.
Do these header files increase the compilation time?
The more the compiler has to read and evaluate, the longer it takes to compile. So yes.
What are the other problems of including too many header files?
As said before, the more there is to evaluate, the longer compilation takes, so the more header files, the slower the compilation. But there is nothing wrong with adding as many useful headers as you like. Just don't add unnecessary headers, which slow down the compilation.

Split C file by its functions

How can I automatically split a single C file with various functions in it into separate files with only a single function each? Anyone have a script or, say, a Notepad++ plugin that could do it? Thank you
It may not even be possible. If a file-scope static variable exists in the original file, it is shared by all the functions of that file; after splitting, it would not be accessible (even with an extern declaration) from the functions moved to other files. And even without that, handling the includes and global variables would be a nightmare.
Anyway, on Unix/Linux, the good old ctags command should be close to your requirements: it does not split the files, but creates an index file (called a tags file) which contains the file and position of all functions from the specified C, Pascal, Fortran, yacc, lex, and Lisp sources. The man page says:
Using the tags file, ex [or vi, vim, etc.] can quickly locate these object definitions.
Depending upon the options provided to ctags, objects will consist of
subroutines, typedefs, defines, structs, enums and unions.
You can either use it (if you are in the Unix world) or mimic it, on Windows for example.
For reasons explained in Serge Ballesta's answer, splitting a single C file into smaller pieces is not automatable in general.
And having several small files instead of a larger one is generally a bad idea. The code becomes less readable, and its execution could be slower (because there are fewer inlining and optimization opportunities for the compiler).
In some cases, you might want to split a big C file (e.g. more than ten thousand lines of source code) into a few smaller ones (e.g. at least a thousand lines of code each). This may require some work, like renaming static functions or variables to longer (and globally unique) names declared as extern, moving some short functions (or adding some macros) into header files and declaring them static inline, etc. This cannot really be automated in the general case.
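For instance, a sketch of the kind of renaming involved (all names here are invented):

/* Before the split, inside big.c: */
static int helper(int x) { return x + 1; }

/* After the split, if several of the new files need it, give it a longer,
   globally unique name, declare it in a shared internal header, and keep
   a single definition in one of the new .c files: */

/* big_internal.h */
#ifndef BIG_INTERNAL_H
#define BIG_INTERNAL_H
int big_helper(int x);
#endif

/* big_util.c */
#include "big_internal.h"
int big_helper(int x) { return x + 1; }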
My recommendation is often to merge a few small (but related) files into one single bigger one. As a rule of thumb, I would suggest having files of more than a thousand lines each, but YMMV.
In particular, there is no reason to have only one function definition in each of your source files. This practically forbids inlining (unless you compile with link-time optimization, a very expensive approach).
Look into existing free software projects (e.g. on github) for inspiration. Or look into the Linux kernel source code.
Splitting a C file into smaller ones (or conversely, merging several source files in a single bigger one) generally requires some code refactoring. In many cases, it is quite simple (perhaps even as trivial as copy & pasting some functions one by one); in some cases, it could be difficult. You should do it manually and incrementally (and enable all warnings in your compiler, to help you find mistakes in your refactoring; don't forget to recompile often!). You may want to improve your naming conventions in your code while you split it.
Of course you need a version control system (I recommend git), and you'll compile and commit your code several times while splitting it. You also need a good source code editor (I recommend GNU emacs, but it is a matter of taste; some people prefer vim, etc.).
You certainly don't want to automate C file splitting (you might write some scripts to help you, but generally it is not worth the trouble). You need to control that split.

Why not concatenate C source files before compilation? [duplicate]

I come from a scripting background and the preprocessor in C has always seemed ugly to me. Nonetheless, I have embraced it as I learn to write small C programs. I am only really using the preprocessor for including the standard libraries and the header files I have written for my own functions.
My question is why don't C programmers just skip all the includes and simply concatenate their C source files and then compile it? If you put all of your includes in one place you would only have to define what you need once, rather than in all your source files.
Here's an example of what I'm describing. Here I have three files:
// includes.c
#include <stdio.h>
// main.c
int main() {
foo();
printf("world\n");
return 0;
}
// foo.c
void foo() {
printf("Hello ");
}
By doing something like cat *.c > to_compile.c && gcc -o myprogram to_compile.c in my Makefile I can reduce the amount of code I write.
This means that I don't have to write a header file for each function I create (because they're already in the main source file) and it also means I don't have to include the standard libraries in each file I create. This seems like a great idea to me!
However I realise that C is a very mature programming language and I'm imagining that someone else a lot smarter than me has already had this idea and decided not to use it. Why not?
Some software is built that way.
A typical example is SQLite. It is sometimes compiled as an amalgamation (done at build time from many source files).
But that approach has pros and cons.
Obviously, the compile time will increase by quite a lot. So it is practical only if you compile that stuff rarely.
Perhaps, the compiler might optimize a bit more. But with link time optimizations (e.g. if using a recent GCC, compile and link with gcc -flto -O2) you can get the same effect (of course, at the expense of increased build time).
I don't have to write a header file for each function
That is a wrong approach (having one header file per function). For a single-person project of less than a hundred thousand lines of code (100 KLOC, where KLOC = kilo lines of code), it is quite reasonable, at least for small projects, to have a single common header file (which you could pre-compile if using GCC), containing declarations of all public functions and types, and perhaps definitions of static inline functions (those small enough and called frequently enough to profit from inlining). For example, the sash shell is organized that way (and so is the lout formatter, with 52 KLOC).
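A sketch of what such a single common header might contain (everything below is illustrative):

/* common.h -- the one project-wide header (pre-compile it if you like) */
#ifndef COMMON_H
#define COMMON_H
#include <stdio.h>

/* public functions, each defined in some .c translation unit */
void parse_input(const char *path);
void emit_report(FILE *out);

/* small, frequently called helper: defined here as static inline so
   every translation unit can inline it */
static inline int imax(int a, int b) { return a > b ? a : b; }

#endif /* COMMON_H */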
You might also have a few header files, and perhaps one "grouping" header which #include-s all of them (and which you could pre-compile). See for example jansson (which actually has a single public header file) and GTK (which has lots of internal headers, but most applications using it have just one #include <gtk/gtk.h>, which in turn includes all the internal headers). On the opposite side, POSIX has a great many header files, and it documents which ones should be included and in which order.
Some people prefer to have a lot of header files (and some even favor putting a single function declaration in its own header). I don't (for personal projects, or small projects on which only two or three persons would commit code), but it is a matter of taste. BTW, when a project grows a lot, it happens quite often that the set of header files (and of translation units) changes significantly. Look also into Redis (it has 139 .h header files and 214 .c files, i.e. translation units, totaling 126 KLOC).
Having one or several translation units is also a matter of taste (and of convenience and habits and conventions). My preference is to have source files (that is translation units) which are not too small, typically several thousand lines each, and often have (for a small project of less than 60 KLOC) a common single header file. Don't forget to use some build automation tool like GNU make (often with a parallel build through make -j; then you'll have several compilation processes running concurrently). The advantage of having such a source file organization is that compilation is reasonably quick. BTW, in some cases a metaprogramming approach is worthwhile: some of your (internal header, or translation units) C "source" files could be generated by something else (e.g. some script in AWK, some specialized C program like bison or your own thing).
Remember that C was designed in the 1970s, for computers much smaller and slower than your favorite laptop today (typically, memory was at that time a megabyte at most, or even a few hundred kilobytes, and the computer was at least a thousand times slower than your mobile phone today).
I strongly suggest studying the source code of, and building, some existing free software projects (e.g. those on GitHub or SourceForge or in your favorite Linux distribution). You'll learn that there are different approaches. Remember that in C, conventions and habits matter a lot in practice, so there are different ways to organize your project in .c and .h files. Read about the C preprocessor.
It also means I don't have to include the standard libraries in each file I create
You include header files, not libraries (but you should link libraries). You could include them in each .c file (and many projects do that), or you could include them in one single header and pre-compile that header, or you could have a dozen headers and include them after the system headers in each compilation unit. YMMV. Notice that preprocessing time is quick on today's computers (at least when you ask the compiler to optimize, since optimization takes more time than parsing and preprocessing).
Notice that what goes into an #include-d file is conventional (and is not defined by the C specification). Some programs have some of their code in such a file (which should then not be called a "header", just an "included file", and which should then not have a .h suffix, but something else like .inc). Look for example into XPM files. At the other extreme, you might in principle not have any of your own header files (you still need header files from the implementation, like <stdio.h> or <dlfcn.h> from your POSIX system) and copy and paste duplicated code in your .c files, e.g. have the line int foo(void); in every .c file, but that is very bad practice and is frowned upon. However, some programs generate C files sharing some common content.
BTW, C or C++14 do not have modules (like OCaml has). In other words, in C a module is mostly a convention.
(Notice that having many thousands of very small .h and .c files of only a few dozen lines each may slow down your build time dramatically; having hundreds of files of a few hundred lines each is more reasonable in terms of build time.)
If you begin to work on a single-person project in C, I would suggest first having one header file (and pre-compiling it) and several .c translation units. In practice, you'll change .c files much more often than .h ones. Once you have more than 10 KLOC you might refactor that into several header files. Such a refactoring is tricky to design, but easy to do (just a lot of copy & pasting chunks of code). Other people would have different suggestions and hints (and that is ok!). But don't forget to enable all warnings and debug information when compiling (so compile with gcc -Wall -g, perhaps setting CFLAGS = -Wall -g in your Makefile). Use the gdb debugger (and valgrind...). Ask for optimizations (-O2) when you benchmark an already-debugged program. Also use a version control system like Git.
On the contrary, if you are designing a larger project on which several persons will work, it could be better to have several files, even several header files (intuitively, each file has a single person mainly responsible for it, with others making minor contributions to that file).
In a comment, you add:
I'm talking about writing my code in lots of different files but using a Makefile to concatenate them
I don't see why that would be useful (except in very weird cases). It is much better (and very usual and common practice) to compile each translation unit (e.g. each .c file) into its own object file (a .o ELF file on Linux) and link them later. This is easy with make (in practice, when you change only one .c file, e.g. to fix a bug, only that file gets compiled and the incremental build is really quick), and you can ask it to compile object files in parallel using make -j (and then your build goes really fast on your multi-core processor).
You could do that, but we like to separate C programs into separate translation units, chiefly because:
It speeds up builds. You only need to rebuild the files that have changed, and those can be linked with other compiled files to form the final program.
The C standard library consists of pre-compiled components. Would you really want to have to recompile all that?
It's easier to collaborate with other programmers if the code base is split up into different files.
Your approach of concatenating .c files is completely broken:
Even though the command cat *.c > to_compile.c will put all functions into a single file, order matters: You must have each function declared before its first use.
That is, you have dependencies between your .c files which force a certain order. If your concatenation command fails to honor this order, you won't be able to compile the result.
Also, if you have two functions that recursively use each other, there is absolutely no way around writing a forward declaration for at least one of the two. You may as well put those forward declarations into a header file where people expect to find them.
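For example, with two mutually recursive functions at least one forward declaration is unavoidable, however the files are arranged:

/* One of the pair must be declared before both can be defined: */
static int is_odd(unsigned n);

static int is_even(unsigned n) { return n == 0 ? 1 : is_odd(n - 1); }
static int is_odd(unsigned n)  { return n == 0 ? 0 : is_even(n - 1); }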
When you concatenate everything into a single file, you force a full rebuild whenever a single line in your project changes.
With the classic .c/.h split compilation approach, a change in the implementation of a function necessitates recompilation of exactly one file, while a change in a header necessitates recompilation of the files that actually include this header. This can easily speed up the rebuild after a small change by a factor of 100 or more (depending on the count of .c files).
You lose all ability for parallel compilation when you concatenate everything into a single file.
Have a big fat 12-core processor with hyper-threading enabled? Pity, your concatenated source file is compiled by a single thread. You just lost a speedup of a factor greater than 20... Ok, this is an extreme example, but I have built software with make -j16 already, and I tell you, it can make a huge difference.
Compilation times are generally not linear.
Usually compilers contain at least some algorithms that have a quadratic runtime behavior. Consequently, there is usually some threshold from which on aggregated compilation is actually slower than compilation of the independent parts.
Obviously, the precise location of this threshold depends on the compiler and the optimization flags you pass to it, but I have seen a compiler take over half an hour on a single huge source file. You don't want to have such an obstacle in your change-compile-test loop.
Make no mistake: Even though it comes with all these problems, there are people who use .c file concatenation in practice, and some C++ programmers get pretty much to the same point by moving everything into templates (so that the implementation is found in the .hpp file and there is no associated .cpp file), letting the preprocessor do the concatenation. I fail to see how they can ignore these problems, but they do.
Also note that many of these problems only become apparent with larger project sizes. If your project is less than 5000 lines of code, it matters relatively little how you compile it. But when you have more than 50000 lines of code, you definitely want a build system that supports incremental and parallel builds. Otherwise, you are wasting your working time.
With modularity, you can share your library without sharing the code.
For large projects, if you change a single file, you would end up compiling the complete project.
You may run out of memory more easily when you attempt to compile large projects.
You may have circular dependencies between modules; modularity helps in maintaining those.
There may be some gains in your approach, but for languages like C, compiling each module makes more sense.
Because splitting things up is good program design. Good program design is all about modularity, autonomous code modules, and code re-usability. As it turns out, common sense will get you very far when doing program design: Things that don't belong together shouldn't be placed together.
Placing non-related code in different translation units means that you can localize the scope of variables and functions as much as possible.
Merging things together creates tight coupling, meaning awkward dependencies between code files that really shouldn't even have to know about each other's existence. This is why a "global.h" which contains all the includes in a project is a bad thing, because it creates a tight coupling between every non-related file in your whole project.
Suppose you are writing firmware to control a car. One module in the program controls the car's FM radio. Then you re-use the radio code in another project, to control the FM radio in a smart phone. And then your radio code won't compile because it can't find brakes, wheels, gears, etc.: things that don't make the slightest sense for the FM radio, let alone the smart phone, to know about.
What's even worse is that if you have tight coupling, bugs escalate throughout the whole program, instead of staying local to the module where the bug is located. This makes the bug consequences far more severe. You write a bug in your FM radio code and then suddenly the brakes of the car stop working. Even though you haven't touched the brake code with your update that contained the bug.
If a bug in one module breaks completely non-related things, it is almost certainly because of poor program design. And a certain way to achieve poor program design is to merge everything in your project together into one big blob.
Header files should define interfaces - that's a desirable convention to follow. They aren't meant to declare everything that's in a corresponding .c file, or a group of .c files. Instead, they declare all functionality in the .c file(s) that is available to their users. A well designed .h file comprises a basic document of the interface exposed by the code in the .c file even if there isn't a single comment in it. One way to approach the design of a C module is to write the header file first, and then implement it in one or more .c files.
Corollary: functions and data structures internal to the implementation of a .c file don't normally belong in the header file. You might need forward declarations, but those should be local and all variables and functions thus declared and defined should be static: if they are not a part of the interface, the linker shouldn't see them.
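A small sketch of that corollary (module and function names are illustrative):

/* widget.h -- the interface: only what users of the module may call */
#ifndef WIDGET_H
#define WIDGET_H
void widget_draw(int width);
#endif

/* widget.c -- the implementation */
#include "widget.h"

/* internal helper: not declared in the header, and static, so the
   linker never sees it outside this translation unit */
static int clamp(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

void widget_draw(int width)
{
    int w = clamp(width, 1, 80);
    (void)w;   /* real drawing code would go here */
}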
While you can still write your program in a modular way and build it as a single translation unit, you will miss all the mechanisms C provides to enforce that modularity. With multiple translation units you have fine control on your modules' interfaces by using e.g. extern and static keywords.
By merging your code into a single translation unit, you will miss any modularity issues you might have because the compiler won't warn you about them. In a big project this will eventually result in unintended dependencies spreading around. In the end, you will have trouble changing any module without creating global side-effects in other modules.
The main reason is compilation time. Compiling one small file when you change it takes a short amount of time. If you instead compiled the whole project whenever you changed a single line, then you would compile, for example, 10,000 files each time, which could take a lot longer.
If you have, as in the example above, 10,000 source files and compiling one takes 10 ms, then after changing a single file the whole project builds incrementally in (10 ms + linking time) if you compile just the changed file, or in (10 ms * 10,000 + a short linking time) if you compile everything as a single concatenated blob.
If you put all of your includes in one place you would only have to define what you need once, rather than in all your source files.
That's the purpose of .h files, so you can define what you need once and include it everywhere. Some projects even have an everything.h header that includes every individual .h file. So, your pro can be achieved with separate .c files as well.
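Such an aggregating header is just another header; a sketch (the included names are made up):

/* everything.h -- includes every individual project header */
#ifndef EVERYTHING_H
#define EVERYTHING_H
#include "parser.h"
#include "codegen.h"
#include "report.h"
#endif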
This means that I don't have to write a header file for each function I create [...]
You're not supposed to write one header file for every function anyway. You're supposed to have one header file for a set of related functions. So this point against separate files is not valid either.
This means that I don't have to write a header file for each function I create (because they're already in the main source file) and it also means I don't have to include the standard libraries in each file I create. This seems like a great idea to me!
The pros you noticed are actually a reason why this is sometimes done on a smaller scale.
For large programs, it's impractical. Like other good answers mentioned, this can increase build times substantially.
However, it can be used to break up a translation unit into smaller bits, which share access to functions in a way reminiscent of Java's package accessibility.
The way the above is achieved involves some discipline and help from the preprocessor.
For example, you can break your translation unit into two files:
// a.c
static void utility() {
}
static void a_func() {
utility();
}
// b.c
static void b_func() {
utility();
}
Now you add a file for your translation unit:
// ab.c
static void utility();
#include "a.c"
#include "b.c"
And your build system doesn't build either a.c or b.c, but instead builds only ab.o out of ab.c.
What does ab.c accomplish?
It includes both files to generate a single translation unit, and provides a prototype for utility() so that the code in both a.c and b.c can see it, regardless of the order in which they are included, and without requiring the function to be extern.

Good C header style

My C headers usually resemble the following style to avoid multiple inclusion:
#ifndef <FILENAME>_H
#define <FILENAME>_H
// define public data structures / prototypes, macros etc.
#endif /* !<FILENAME>_H */
However, in his Notes on Programming in C, Rob Pike makes the following argument about header files:
There's a little dance involving #ifdef's that can prevent a file being read twice, but it's usually done wrong in practice - the #ifdef's are in the file itself, not the file that includes it. The result is often thousands of needless lines of code passing through the lexical analyzer, which is (in good compilers) the most expensive phase.
On the one hand, Pike is the only programmer I actually admire. On the other hand, putting several #ifdefs in multiple source files instead of putting one #ifdef in a single header file feels needlessly awkward.
What is the best way to handle the problem of multiple inclusion?
In my opinion, use the method that requires less of your time (which likely means putting the #ifdefs in the header files). I don't really mind if the compiler has to work harder if my resulting code is cleaner. If, perhaps, you are working on a multi-million line code base that you constantly have to fully rebuild, maybe the extra savings is worth it. But in most cases, I suspect that the extra cost is not usually noticeable.
Keep doing what you do - it's clear, less bug-prone, and well known by compiler writers, so not as inefficient as it may have been a decade or two ago.
You could use the non-standard #pragma once - If you search, there's probably at least a bookshelf's worth of include guards vs pragma once discussion, so I'm not going to recommend one over the other.
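For comparison, the #pragma once version of the same header skeleton looks like this (non-standard, but widely supported):

#pragma once
// define public data structures / prototypes, macros etc.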
Pike wrote some more about it in https://talks.golang.org/2012/splash.article:
In 1984, a compilation of ps.c, the source to the Unix ps command, was
observed to #include <sys/stat.h> 37 times by the time all the
preprocessing had been done. Even though the contents are discarded 36
times while doing so, most C implementations would open the file, read
it, and scan it all 37 times. Without great cleverness, in fact, that
behavior is required by the potentially complex macro semantics of the
C preprocessor.
Compilers have become quite clever since: https://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html, so this is less of an issue now.
The construction of a single C++ binary at Google can open and read
hundreds of individual header files tens of thousands of times. In
2007, build engineers at Google instrumented the compilation of a
major Google binary. The file contained about two thousand files that,
if simply concatenated together, totaled 4.2 megabytes. By the time
the #includes had been expanded, over 8 gigabytes were being delivered
to the input of the compiler, a blow-up of 2000 bytes for every C++
source byte.
As another data point, in 2003 Google's build system was moved from a
single Makefile to a per-directory design with better-managed, more
explicit dependencies. A typical binary shrank about 40% in file size,
just from having more accurate dependencies recorded. Even so, the
properties of C++ (or C for that matter) make it impractical to verify
those dependencies automatically, and today we still do not have an
accurate understanding of the dependency requirements of large Google
C++ binaries.
The point about binary sizes is still relevant. Compilers (linkers) are quite conservative about stripping unused symbols. See: How to remove unused C/C++ symbols with GCC and ld?
In Plan 9, header files were forbidden from containing further
#include clauses; all #includes were required to be in the top-level C file. This required some discipline, of course—the programmer was
required to list the necessary dependencies exactly once, in the
correct order—but documentation helped and in practice it worked very
well.
This is a possible solution. Another possibility is to have a tool that manages the includes for you, for example MakeDeps.
There are also unity builds, sometimes called SCU (single compilation unit) builds. There are tools to help manage that, like https://github.com/sakra/cotire
Using a build system that optimizes for the speed of incremental compilation can be advantageous too. I am talking about Google's Bazel and similar. It does not protect you from a change in a header file that is included in a large number of other files, though.
Finally, there is a proposal for C++ modules in the works, great stuff: https://groups.google.com/a/isocpp.org/forum/#!forum/modules. See also What exactly are C++ modules?
The way you're currently doing it is the common way. Pike's method cuts compilation time a bit, but with modern compilers probably not very much (when Pike wrote his notes, compilers weren't optimizer-bound), and it clutters modules and is bug-prone.
You could still cut on multi-inclusion by not including headers from headers, but instead documenting them with "include <foodefs.h> before including this header."
I recommend you put them in the header file itself. With today's PCs there is no need to worry about a few thousand needlessly parsed lines of code.
Additionally, it is far more work and more source text if you have to add the check for every single header in every source file that includes that header.
And you would have to handle your own header files differently from standard and third-party headers.
He may have had a point at the time he was writing this. Nowadays, decent compilers are clever enough to handle this well.
I agree with your approach - as others have commented, it's clearer, self-documenting, and lower maintenance.
My theory on why Rob Pike might have suggested his approach: He's talking about C, not C++.
In C++, if you have a lot of classes and you are declaring each one in its own header file, then you'll have a lot of header files. C doesn't really provide this kind of fine-grained structure (I don't recall seeing many single-struct C header files), and .h/.c file-pairs tend to be larger and contain something like a module or a subsystem. So, fewer header files. In that scenario Rob Pike's approach might work. But I don't see it as suitable for non-trivial C++ programs.
