Effect of including a header file in C - c

What is the effect of including a header file when none of the functions it declares are used in the source file? Does it affect stack size, etc.?

It will have no effect on the program itself, but it will increase compilation time and make the code harder to understand and maintain. You should include only the headers you actually need, and remove those that become redundant.

That depends on whether there are definitions in the header file or just declarations.
It also depends entirely on the implementation since the ISO C standard has nothing to say about how things are done at that level. It only covers how things appear to behave at a "C virtual machine" level. But I'll cover the most likely scenario here.
Definitions such as int xyzzy; (or, worse, char big_honkin_thing[9999999];) may take up space in the object file and, unless you have a particularly clever linker, the executable as well. I say "may" since this is dependent on the implementation.
Initialising the value is more likely to ensure it's stored in the object rather than created at runtime. But you're likely to find an effect regardless of that, either larger object/executable files if it's created at compile time or (mildly) slower startup times as more memory has to be zero-initialised.
For example, adding char big[99999] = {'x'}; to a header file results in the size of the executable going from 18K to 118K.
Simple declaration stuff like typedef and extern will not allocate space in the object in and of themselves.
In addition, even without definitions, the compile time will be increased since the compiler has to process that header file. But that will not have any impact on runtime (either speed or storage) itself.
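As a rough sketch of the distinction above, the following single file stands in for a header plus the source file that includes it. The names point_t, counter_elsewhere, and big_size are made up for illustration; only big mirrors the array from the example above.

```c
#include <stddef.h>

/* --- what a hypothetical header might contain --- */

/* Declarations only: no storage is reserved for these by themselves. */
typedef struct { int x, y; } point_t;
extern int counter_elsewhere;   /* promised to exist elsewhere; never referenced here */

/* A definition with an initializer: typically stored in the object file. */
char big[99999] = {'x'};

/* --- the source file that includes it --- */

/* Report how many bytes the definition above occupies. */
size_t big_size(void)
{
    return sizeof big;
}
```

Building this with and without the big line and comparing the resulting files should show the difference, although the exact numbers depend on the toolchain.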

The stack size is determined by your linker. Presumably you actually mean whether the emitted code is larger.
Including a header file whose declarations are never referenced in that translation unit will not affect the size of the generated objects. Of course, it will slow down compilation.

It will increase compile times, but AFAIK there shouldn't be any other changes.

Compiling the source file on its own will take less time than compiling it together with a header file, if there is nothing in the header that can affect the source file.

Related

Problems of including too many header files in C

Does including too many header files increase the size of the source file? Does it also increase the size of the executable? Do these header files increase the compilation time?
For example, if I add these header files to my program, do they increase the size of the source file, the executable file, or both?
#include <stdio.h>
#include "header1.h"
#include "header2.h"
What are the other problems of including too many header files?
Does including too many header files increase the size of the source file?
It increases with as many letters as you type. So if 1 letter is 1 byte on your system, then adding #include <stdio.h> increases the source code file size by at least 18 bytes. This shouldn't matter to you unless you are using a computer from the mid-1980s.
Does it also increase the size of executable?
No. Only used functions increase the size of the executable.
Do these header files increase the compilation time?
Generally yes, though compilers use various tricks such as "precompiled headers" for their own libraries. Again, this isn't a problem unless you are using 1980s stuff or worse (such as Eclipse).
What are the other problems of including too many header files?
Your main concern about including headers should be to not include stuff that you don't use. Every include creates a dependency, and also means more identifiers and symbols added to the global namespace.
Does including too many header files increase the size of the source file?
Yes. Each additional character added to a source file increases its physical size. For example, #include <stdio.h> adds exactly the number of characters in that directive, i.e. strlen("#include <stdio.h>") bytes, plus the line ending. (Depending on how the OS allocates file blocks, it could be seen by the OS as an extra kByte.) More importantly, though, at compile time the contents of each #included header file are effectively expanded into the source code that is fed to the compiler.
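The byte count can be checked with a few lines of C. This is only a sketch: it assumes one byte per character, and directive_length is a name invented here.

```c
#include <stddef.h>

/* Length in bytes of the directive text, not counting the newline that ends the line. */
size_t directive_length(void)
{
    /* sizeof on a string literal counts the terminating '\0', so subtract 1. */
    return sizeof("#include <stdio.h>") - 1;
}
```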
Does it also increase the size of executable?
Yes/No/Possibly, depending on what is actually used from the header file. Optimizing compilers can exclude whatever is not needed in an executable. If nothing is used, there will be no additional size in the executable. There will, however, be additional work done at compile time, because even if there is nothing useful in the header file, the compiler does not know this until it has processed the file.
Do these header files increase the compilation time?
Compared to what? If a header file contains components that are necessary for the build to succeed (prototypes of functions, #defines, etc.), then compile time is just normal compile time. But if you have been compiling with, say, 3 necessary header files for a while, and decide that you want to add a new library (and its corresponding header file), then by all means, yes, the next compile will take a little longer than the previous ones.
What are the other problems of including too many header files?
Too many? If each and every header file is necessary, then there are not too many. (With this caveat about maximum header file depth.) However, if a header file is found to be unnecessary, it's code bloat; get rid of it. It adds to compile time, as each header file, regardless of whether there is anything useful in it, has to be looked at by the build process. Even worse, unnecessary header files add complexity and difficulty to the tasks of future maintainers.
There is a good post here discussing this in more detail.
Additionally, this is a fun page that also discusses header files.
Header files should not produce any extra code unless they are poorly designed, and you will probably encounter redefinition problems if they do and you include them more than once. By code here I mean "machine code", that is, executable code.
As for the source code, the compiler ultimately sees it as one big source file with the #include directives replaced by the content of the included files, so adding more header files will increase the compile time (as the apparent source code will be longer). So including unnecessary files should be avoided.
Adding header files will increase the size of the intermediate source file, taking into account the inclusions. Modern compilers may not even generate this intermediate file explicitly -- it may be absorbed into the overall compilation process. This is a matter of compiler design. As a developer, you probably won't ever see the fully-expanded file unless you ask for it (e.g., gcc -E).
Adding header files will not necessarily increase the size of the compiled code -- if all the headers contain is declarations and constant definitions, they won't increase the size much, if at all. If they contain actual code -- which isn't a particularly common practice -- they might have some small effect on the executable size.
Adding header files will probably have some effect on the compilation time but, really, this isn't a question anybody should be asking. If you need the headers, you need the headers. If it slows down compilation, what's the alternative? Don't compile?
If the question is really about how to distribute code between headers and source files, so as to improve some aspect of the build process, then that's a very complicated question to answer. If the question is about what harm is done by including a bunch of headers you don't use, the answer with modern compilers is: very little, from a functional perspective. However, including some header gives the reader the impression that the source actually uses the features it declares, and that's bad for readability. You should do your future self, or your colleagues, a favour and try not to include headers that aren't used. But if you need them, you need them, and there's little point worrying about the consequences too much.
Does including too many header files increase the size of the source file?
The more characters there are in the source file, the larger the source file is.
But that only concerns the #include directives themselves, not the content of the header files: the source file doesn't get expanded by the content of the headers.
When you #include a header, the compiler is told to read it at that point, but the source file isn't changed.
So, yes.
Does it also increase the size of executable?
Depends on the content of the headers. If they contain definitions then yes.
Do these header files increase the compilation time?
The more the compiler has to read and evaluate, the longer compilation takes. So yes.
What are the other problems of including too many header files?
As said before, evaluation takes longer, so the more header files there are, the slower the compilation. But there is nothing wrong with adding as many useful headers as you like. Just don't add unnecessary headers, which slow down the compilation.

Does a C compiler always, never, or sometimes exclude file-level arrays not touched by code?

We just encountered and solved an issue in which our RAM spiked when we included some code that accessed a certain large array. This leads to this follow on question: I apparently was under the misconception that C compilers excluded functions that weren't called, but didn't exclude arrays declared at the file level but weren't touched. I guess it makes all the sense in the world that it would do this, but I'm sure I've seen different behavior, just created an array and watched RAM usage jump (without writing code that touches the array). This was esp. shocking since we are at zero optimization.
So to learn the right lesson here: are arrays that are not touched always, never, or sometimes excluded by compilers? Does it depend on the compiler and the optimization level, or is this somehow tied to a C standard requirement? And am I crazy, or do most compilers seem to not exclude them?
Thanks.
As far as the C standard is concerned, C allows optimization, but a compiler need not optimize the code at all to be compliant.
As for how most systems work in practice, file scope variables are allocated in .data or .bss sections and have static storage duration, meaning that they have to get initialized to a value by the compiler before main() is called. This is a C standard requirement.
A compiler with optimizations disabled may therefore very well include such variables as part of the initialization code, regardless of if those variables are used or not. And most compilers have optimizations disabled as default.
You can help the compiler to do a better job at spotting unused variables by declaring them static - meaning that the variable gets "internal linkage" and no other file can touch it. If you don't, then the compiler might not be able to tell if the variable is used before compiling all other files, perhaps getting forced to leave that decision to the linker.
But overall, it isn't meaningful to ponder about what a compiler will do with optimizations off. If the unused variable is still allocated with optimizations enabled, that's when you should start to worry.
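The initialization requirement for file-scope objects described above can be observed directly. This is a minimal sketch; the array names and helper functions are invented for illustration, and the section names in the comments describe what many toolchains do, not something the C standard specifies.

```c
#include <stddef.h>

/* File-scope objects have static storage duration. With no initializer
   they start out as zero; many toolchains place them in .bss. */
int samples[256];

/* With an initializer; many toolchains place this in .data. */
int thresholds[4] = {10, 20, 30, 40};

/* Return 1 if every element of samples is still zero, else 0. */
int samples_all_zero(void)
{
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        if (samples[i] != 0) {
            return 0;
        }
    }
    return 1;
}

/* Sum the initialized values so a caller can observe them. */
int thresholds_sum(void)
{
    int sum = 0;
    for (size_t i = 0; i < sizeof thresholds / sizeof thresholds[0]; i++) {
        sum += thresholds[i];
    }
    return sum;
}
```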
You say "compiler" but this is a function of the linker: only the linker can know if an array is not used by any of the compilation units (object files).
The linker knows this, if no object file has a reference to the array (the data) that the linker has to resolve.
What remains is the compilation unit that declares the array (the data). That unit does not have (or does not need to have) a reference to the data because it is declared in the compilation unit (object file).
Taken together, there may be no way for the compiler or linker to know if some data is not used and consequently the linker will need to include it in the executable.
Note: if the array is declared static, then the compiler can decide because the data will have no visibility outside the current compilation unit.
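The difference between the two kinds of linkage might look like this in a single source file. The names scratch, shared, and the two helper functions are hypothetical.

```c
/* Internal linkage: no other translation unit can name this array, so the
   compiler sees every use of it while compiling this file alone. */
static unsigned char scratch[1024];

/* External linkage: another file could declare
   `extern unsigned char shared[2048];` and use it, so the compiler cannot
   prove from this file alone that it is unused. */
unsigned char shared[2048];

/* The only code in the program that touches scratch. */
unsigned char scratch_roundtrip(unsigned char value)
{
    scratch[0] = value;
    return scratch[0];
}

/* One of possibly many users of shared. */
unsigned char shared_roundtrip(unsigned char value)
{
    shared[0] = value;
    return shared[0];
}
```

If scratch_roundtrip were deleted, the compiler could decide on its own to drop scratch. Dropping shared in the same situation would be up to the linker, if it supports that at all.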

make preprocessor trace the source of a definition

In a large and complex pile of source (not invented here, hacked together by Elbonian code-slaves) it can be the case that several bits of code have their own local duplicate of some common header file in their path.
Due to the many layers of build and use of guard macros to prevent re-definition, a value #defined in one place may therefore be remembered by the compiler and used elsewhere despite a more "local" header #defining the same thing.
My question is: Can I get the C preprocessor to spit out the name/path of the file where it originally "found" the definition of something?

Included files, all or nothing?

If I #include a file in C, do I get the entire contents of the file linked in, or just the parts I use?
If it has 10 functions in it, and I only use one of the functions, does the code for the other nine functions get included in my executable? This is especially relevant for me right now as I am working on a microcontroller and memory is precious.
Firstly, header files do not get "linked in". #include is basically a textual copy-paste feature. Everything from your include file gets pasted by preprocessor into the final translation unit, which will later be seamlessly processed by the compiler proper. The compiler proper knows nothing about any header files or #include directives.
Secondly, it means that if in your code you declared or defined some function or variable that you do not use, it is completely irrelevant whether it came from a header file through #include or was written directly in source file. There's absolutely no difference.
Thirdly, the question is: what exactly do you have in your header file that you include? Typically, header files do not define objects and functions, they simply declare them. Declarations do not produce any code, regardless whether you use the function or not. Declarations simply tell the compiler that the code (generated from the function definition) already exists elsewhere. Thus, as long as we are talking about typical header files, #include directives and header files by themselves have no effect on final code size.
Fourthly, if your header file is of some unusual kind that contains function (or object) definitions, then see "firstly" and "secondly" above. The compiler proper can see only one translation unit at a time, for which reason a typical strategy for the compiler proper is to completely discard unused entities with internal linkage (i.e. static objects and functions) and keep all entities with external linkage. Entities with external linkage cannot be discarded by compiler proper, since they might be needed in some other translation unit.
Fifthly, at linking stage linker can see the program in its entirety and, for that reason, can discard unused objects and functions, if it is advanced enough for that (and if you allow linker to do it). Meanwhile, inclusion-exclusion precision of a typical run-of-the-mill linker is limited to a single object file. Each object file is atomic to such linker. This means that if you want to be able to exclude unused functions on per-function basis, you might have to adopt "one function per object file" strategy, i.e. write one and only one function per .c file. Of course, this is only possible when you write your own code. If some third-party library you want to use does not adhere to this convention, then you might not be able to exclude individual functions.
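To illustrate the point about declarations, here is a small sketch in which one file plays the role of both a typical header and the source file that includes it. The sensor_* names are hypothetical.

```c
/* --- what a typical, declarations-only header might contain --- */
int sensor_read_raw(void);   /* declared, but never defined or called in this program */
int sensor_scale(int raw);   /* declared here, defined and used below */

/* --- the source file --- */

/* Only this definition causes machine code to be generated. */
int sensor_scale(int raw)
{
    return raw * 2;
}
```

The program still links even though sensor_read_raw has no definition anywhere, because nothing refers to it; the declaration alone contributes nothing to the object file.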
If you #include a file in C, the entire contents of that file are added to your source file and compiled by your compiler. A header file, though, usually only has declarations of functions and no definitions (so no actual code is compiled).
The linker, on the other hand, takes all the functions from all the libraries and compiled source code and merges them into the final output file. At this time, the linker may discard functions that you aren't using, depending on the linker and how it is configured.
So, to answer your question: ideally only the functions you use (and indirectly depend on) will be included in your final program file, and this is independent of what files you #include. Happy hacking!
You have to distinguish between different scenarios:
1. What does the included header file contain? Declarations of external functions only, or also static function definitions?
2. How are the implementations of the external functions declared in that header file distributed? Are they all implemented in one .c file, or spread across several .c files?
Regarding point 1: Merely #including external declarations adds no code to your object file. Definitions of static functions that are part of the header file, but which are not referenced by your code, may also be left out of your object file; this is a fairly common optimization. It depends on your compiler, however.
Regarding point 2: Some linkers can only link whole object files, all or nothing. That means, if all the external functions declared in a header file are implemented in one .c file, and, if your code references at least one of these functions, chances are that you will get the whole object file, including all the other functions you don't use. Some linkers, however, can avoid this and remove unused parts when linking object files.
One brute-force approach to dealing with non-optimizing linkers is to put every external function into a .c file of its own. You will, however, have to find a way to deal with the situation where some of these functions refer to a common static function that is part of the original .c file...
Include simply presents the compiler, ultimately, with what looks like a single file (and if you use -save-temps with GCC you will see that exactly a single file is presented to the actual compiler). It is no more complicated than that. So if you have some function prototypes or defines in your .c file, then having them come from an include makes no difference whatsoever; the end result is the same.
If the things you include contain code (functions, and not just prototypes), then it is the same as if you had those in the .c file itself. Whether or not those show up in the final binary depends on whether you left them global or made them static, and then on whether or not you optimized, etc. The same goes for variables, structures, and other things.
Not all linkers are the same, but a common approach is that whatever the compiler left in the object goes into the final binary. But if you take those objects and make a library out of them, then some/many(?) linkers don’t suck everything into the binary, only the portions that are required to resolve the dependencies.

#include and what actually compiles

This is just a general compiler question, directed at C based languages.
If I have some code that looks like this:
#include "header1.h"
#include "header2.h"
#include "header3.h"
#include "header4.h" //Header where #define BUILD_MODULE is located
#ifdef BUILD_MODULE
//module code to build
#endif //BUILD_MODULE
Will all of the code associated with those headers get built even if BUILD_MODULE is not defined? The compiler just "pastes" the contents of the headers, correct? So this would essentially build a useless bunch of header code that just takes up space?
All of the text of the headers will be included in the compilation, but they will generally have little or no effect, as explained below.
C does not have any concept of “header code”. A compilation of the file in the question would be treated the same as if the contents of all the included files appeared in a single file. Then what matters is whether the contents define any objects or functions.
Most declarations in header files are (as header files are commonly used) just declarations, not definitions. They just tell the compiler about things; they do not actually cause objects or code to be created. For the most part, a compiler will not generate any data or code from declarations that are not definitions.
If the headers define external objects or functions, the compiler must generate data (or space) or code for them, because these objects or functions could be referred to from other source files to be compiled later and then linked with the object produced from the current compilation. (Some linkers can determine that external objects or functions are not used and discard them.)
If the headers define static objects or functions (to be precise, objects with internal or no linkage), then a compiler may generate data or code for these. However, the optimizer should see that these objects and functions are not referenced, and therefore generation may be suppressed. This is a simple optimization, because it does not require any complicated code or data analysis, simply an observation that nothing depends on the objects or functions.
So, the C standard does not guarantee that no data or code is generated for static objects or functions, but even moderate quality C implementations should avoid it, unless optimization is disabled.
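A header that defines functions with internal linkage might be sketched like this. The file name util.h and the util_* helpers are hypothetical, and static inline is used here as one common way of writing such definitions.

```c
/* --- hypothetical util.h: definitions with internal linkage --- */
static inline int util_min(int a, int b) { return a < b ? a : b; }
static inline int util_max(int a, int b) { return a > b ? a : b; }  /* never called below */

/* --- the source file that includes it --- */

/* Uses util_min only; nothing in this translation unit depends on util_max. */
int smallest_of_three(int a, int b, int c)
{
    return util_min(a, util_min(b, c));
}
```

Since nothing references util_max and no other file can, a compiler is free to emit no code for it at all, though the standard does not require that.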
Depends on the actual compiler. Optimizing compilers will not generate the output for unrequired code, whereas dumber compilers will.
gcc (a very common C compiler for open-source platforms) will optimize your code when given the -O option, and will then avoid generating code for unneeded expressions.
Code inside an #ifdef block whose macro is not defined will never generate output, as this would violate the language specification.
Conceptually, at least, include/macro processing is a separate step from compilation. The main source file is read and a new temporary file is constructed containing all the included code. If anything is "#ifdefed out" then that code is not included in the temporary file. At the same time, the occurrences of macro names are replaced with the text they "expand" into. It is that resulting file, with all the includes included, etc, that is fed into the actual compiler.
Some compilers do this literally (and you can even "capture" the intermediate file) while others sort of simulate it (and actually require an entire separate step if you request that the intermediate file be produced). But most compilers have one means or another of producing the file for your examination.
The C/C++ standards lay out some rather arcane rules that must be followed to assure that any "simulated" implementation doesn't somehow change the behavior of the resulting code, vs the "literal" approach.
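The effect of the #ifdef in the question can be sketched in one file. The comment line below stands in for whatever "header4.h" does, and module_answer and module_is_built are names invented for this example.

```c
/* Stand-in for "header4.h": leave the next line commented out to skip the module. */
/* #define BUILD_MODULE */

#ifdef BUILD_MODULE
/* Reaches the compiler proper only when BUILD_MODULE is defined. */
int module_answer(void)
{
    return 42;
}
#endif /* BUILD_MODULE */

/* Report whether the module code was compiled into this translation unit. */
int module_is_built(void)
{
#ifdef BUILD_MODULE
    return 1;
#else
    return 0;
#endif
}
```

With the #define left commented out, module_answer never appears in the preprocessed output, which can be inspected with gcc -E.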
