In a large and complex pile of source (not invented here, hacked together by Elbonian code-slaves) it can be the case that several bits of code have their own local duplicate of some common header file in their path.
Because of the many layers of the build, and because guard macros prevent re-definition, a value #defined in one place may be remembered by the compiler and used elsewhere, despite a more "local" header #defining the same thing.
My question is: Can I get the C preprocessor to spit out the name/path of the file where it originally "found" the definition of something?
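For illustration, here is a minimal sketch of the kind of collision being described (the file names, paths, and the BUFSZ macro are all hypothetical): both copies of the header use the same guard macro, so whichever copy the preprocessor sees first supplies the value, and the more "local" copy is silently skipped.

/* moduleA/common.h */
#ifndef COMMON_H
#define COMMON_H
#define BUFSZ 128
#endif

/* moduleB/common.h - the "local" duplicate */
#ifndef COMMON_H            /* same guard: skipped if moduleA's copy was included first */
#define COMMON_H
#define BUFSZ 4096
#endif

/* moduleB/code.c */
#include "moduleA_stuff.h"  /* indirectly pulls in moduleA/common.h */
#include "common.h"         /* guard already defined, so BUFSZ stays 128 here */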
Related
I understand that it's good practice to split up a C program into multiple .h and .c files, but since each .h file has an #ifndef #define #endif "include guard", don't all those defined constants take up memory in my final program? If I'm really trying to be conservative with memory usage, would combining my final program into one large C file to get rid of all the "include guards" help?
The include guards are not part of the content of the final program (of the object modules produced by the compiler or of the executable file produced by the linker). The include guards are instructions to the compiler during compilation; they do not contribute data to the final program file.
Constants defined with #define also do not appear as data in the final program; the preprocessor substitutes their replacement text into the source before compilation, so only the expressions in which they are used contribute anything to the output.
It is possible that some objects defined in headers may result in multiple objects being defined in the final program. For example, static int x = 3; could result in multiple occurrences of this x appearing in the final program. Generally, headers should avoid defining objects; they should only declare identifiers for objects that are defined in source files.
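A minimal sketch of the kind of header-level definition being warned about (file names hypothetical):

/* util.h - defines an object, so every .c file that includes it gets its own copy */
static int x = 3;

/* a.c */
#include "util.h"   /* this translation unit gets one x */

/* b.c */
#include "util.h"   /* this translation unit gets another, separate x */

The usual fix is to put extern int x; in the header and the definition int x = 3; in exactly one source file.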
Preprocessor macros don't normally use any additional memory or storage, so reducing the use of these things is probably not a good reason to combine multiple source files together.
However, there might be other good reasons to do so in large projects. For example, the maintainers of SQLite3 claim (see https://www.sqlite.org/amalgamation.html) that merging all the sources together leads to 5-10% faster execution, presumably because it allows for better compiler optimization.
However, it's very difficult to maintain vast C source files, and they take an age to compile.
If I #include a file in C, do I get the entire contents of the file linked in, or just the parts I use?
If it has 10 functions in it, and I only use one of the functions, does the code for the other nine functions get included in my executable? This is especially relevant for me right now as I am working on a microcontroller and memory is precious.
Firstly, header files do not get "linked in". #include is basically a textual copy-paste feature. Everything from your include file gets pasted by the preprocessor into the final translation unit, which will later be seamlessly processed by the compiler proper. The compiler proper knows nothing about any header files or #include directives.
Secondly, this means that if you declared or defined some function or variable in your code that you do not use, it is completely irrelevant whether it came from a header file through #include or was written directly in the source file. There's absolutely no difference.
Thirdly, the question is: what exactly do you have in your header file that you include? Typically, header files do not define objects and functions, they simply declare them. Declarations do not produce any code, regardless of whether you use the function or not. Declarations simply tell the compiler that the code (generated from the function definition) already exists elsewhere. Thus, as long as we are talking about typical header files, #include directives and header files by themselves have no effect on final code size.
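As a rough sketch of what such a "typical header file" looks like (all names hypothetical), none of the following lines causes the compiler to emit any data or code by itself:

/* motor.h - declarations only, no definitions */
#ifndef MOTOR_H
#define MOTOR_H

typedef unsigned char motor_id;    /* type alias: no code generated */
extern int motor_count;            /* declares an object defined in some .c file */
int  motor_start(motor_id id);     /* function prototypes: no code generated */
void motor_stop(motor_id id);

#endif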
Fourthly, if your header file is of some unusual kind that contains function (or object) definitions, then see "firstly" and "secondly" above. The compiler proper can see only one translation unit at a time, for which reason a typical strategy for the compiler proper is to completely discard unused entities with internal linkage (i.e. static objects and functions) and keep all entities with external linkage. Entities with external linkage cannot be discarded by the compiler proper, since they might be needed in some other translation unit.
Fifthly, at the linking stage the linker can see the program in its entirety and, for that reason, can discard unused objects and functions, if it is advanced enough for that (and if you allow the linker to do it). Meanwhile, the inclusion-exclusion precision of a typical run-of-the-mill linker is limited to a single object file. Each object file is atomic to such a linker. This means that if you want to be able to exclude unused functions on a per-function basis, you might have to adopt a "one function per object file" strategy, i.e. write one and only one function per .c file. Of course, this is only possible when you write your own code. If some third-party library you want to use does not adhere to this convention, then you might not be able to exclude individual functions.
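A sketch of that "one function per object file" strategy (file names hypothetical; motor.h is the declarations-only header sketched above). Assuming the two objects are archived into a static library, a whole-object-file linker can leave motor_stop.o out of the executable entirely, which it could not do if both functions lived in a single motor.o:

/* motor_start.c */
#include "motor.h"
int motor_start(motor_id id) { (void)id; /* ... start the hardware ... */ return 0; }

/* motor_stop.c */
#include "motor.h"
void motor_stop(motor_id id) { (void)id; /* ... stop the hardware ... */ }

/* main.c */
#include "motor.h"
int main(void) { return motor_start(1); }   /* motor_stop.o is never pulled in */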
If you #include a file in C, the entire contents of that file are added to your source file and compiled by your compiler. A header file, though, usually only has declarations of functions and no definitions (so no actual code is compiled).
The linker, on the other hand, takes all the functions from all the libraries and compiled source code and merges them into the final output file. At this point the linker can discard functions that you aren't using, although with a typical linker that happens at the granularity of whole object files pulled from libraries rather than individual functions.
So, to answer your question: only the functions you use (and indirectly depend on) will be included in your final program file, and this is independent of what files you #include. Happy hacking!
You have to distinguish between different scenarios:
What does the included header file contain? Declarations of external functions only, or also static function definitions?
How are the implementations of the external functions declared in the header file you include distributed? Are they all implemented in one .c file, or spread across several .c files?
Regarding point 1: if you only #include external declarations, no extra code will become part of your object file. And definitions of static functions that are part of the header file, but which are not referenced by your code, may not become part of your object file either - this is a fairly common optimization. It depends on your compiler, however.
Regarding point 2: Some linkers can only link whole object files, all or nothing. That means that if all the external functions declared in a header file are implemented in one .c file, and your code references at least one of these functions, chances are that you will get the whole object file, including all the other functions you don't use. Some linkers, however, can avoid this and remove unused parts when linking object files.
One brute-force approach to dealing with non-optimizing linkers is to put every external function into a .c file of its own. You will, however, have to find a way to deal with the situation where some of these functions refer to a common static function that was part of the original .c file...
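A sketch of that complication (all names hypothetical): a helper that used to be static and private to one .c file cannot stay file-local once its callers are split into separate files; it must either be duplicated or given external linkage in a small shared source file.

/* before the split: everything lives in motor.c and the helper stays private */
static int check_bus(void) { /* ... probe the bus ... */ return 1; }
int  motor_start(int id) { (void)id; return check_bus() ? 0 : -1; }
void motor_stop(int id)  { (void)id; if (check_bus()) { /* ... */ } }

/* after splitting motor_start() and motor_stop() into their own .c files,
   check_bus() must either be copied into both files or moved to its own
   check_bus.c with external linkage, losing its "private to motor.c" property. */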
#include simply presents the compiler, ultimately, with what looks like a single file (and if you use -save-temps on GCC you will see that exactly one file is presented to the actual compiler). It is no more complicated than that. So if you have some function prototypes or defines in your .c file then having them come from an include makes no difference whatsoever; the end result is the same.
If the things you include contain code - functions, not just prototypes - then it is the same as if you had written those in the .c file itself. Whether or not they show up in the final binary depends on whether you declared them as global or as static, and then on whether or not you optimized, etc. The same goes for variables and structures and other things.
Not all linkers are the same, but a common approach is that whatever the compiler left in the object goes into the final binary. If you take those objects and make a library out of them, though, then some (many?) linkers don't suck everything into the binary, only the portions that are required to resolve the dependencies.
Can I include a first.c file in another file, second.c? (I am doing some socket programming: the messages received by the server are stored in a linked list, so the first file keeps the linked list and the second file is the socket-programming file, which needs to access the data of the first.) What kind of data in the first file can be accessed in the second file? Is this good practice?
Please also explain user-defined .h files and give me an example of both approaches.
C is a low-level, permissive language. If the programmer wants to do weird things, the compiler won't do anything to stop them.
Your question is of that flavour: you can include first.c in second.c, and neither the compiler nor the linker will protest. In simple cases (only 2 source files) it will work the same. You could also rename first.c to first.h and include it. All of that is simply convention... and good practice.
But never, ever do that (except in very special cases, as suggested by Jonathan Leffler): it breaks the separate-compilation rules into pieces. When you include a file, it is (from the compiler's point of view) the same as pasting it in with your text editor. You know you can always have a single monolithic source file, and you should know (or you will soon find out if you try...) that it is hard to test and error-prone, because you have only 2 scopes (global and local to a function), and it easily leads to poorly structured programming.
The great ancients found it better to have smaller source files, which are easier to write, test, read, and understand, with the include files containing the smallest part necessary to allow the separate sources to communicate: normally only declarations and constants, seldom global variables.
The conclusion is nothing more than what you got in the comments: yes, you can, but you surely will not want to do that.
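As a sketch of the conventional split the answer is describing (all names hypothetical), second.c does not include first.c; it includes only a small header with the declarations the two files need to share:

/* first.h - the smallest part needed for the two files to communicate */
#ifndef FIRST_H
#define FIRST_H
struct node { char msg[64]; struct node *next; };
void list_add(const char *msg);      /* declarations only */
struct node *list_head(void);
#endif

/* first.c - the linked-list implementation */
#include <stdlib.h>
#include <string.h>
#include "first.h"
static struct node *head;            /* private to first.c */
void list_add(const char *msg) {
    struct node *n = malloc(sizeof *n);
    if (!n) return;
    strncpy(n->msg, msg, sizeof n->msg - 1);
    n->msg[sizeof n->msg - 1] = '\0';
    n->next = head;
    head = n;
}
struct node *list_head(void) { return head; }

/* second.c - the socket code sees only what first.h declares */
#include "first.h"
/* ... when a message arrives: list_add(buffer); ... */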
This is just a general compiler question, directed at C based languages.
If I have some code that looks like this:
#include "header1.h"
#include "header2.h"
#include "header3.h"
#include "header4.h" //Header where #define BUILD_MODULE is located
#ifdef BUILD_MODULE
//module code to build
#endif //BUILD_MODULE
Will all of the code associated with those headers get built even if BUILD_MODULE is not defined? The compiler just "pastes" the contents of the headers, correct? So this would essentially build a useless bunch of header code that just takes up space?
All of the text of the headers will be included in the compilation, but they will generally have little or no effect, as explained below.
C does not have any concept of “header code”. A compilation of the file in the question would be treated the same as if the contents of all the included files appeared in a single file. Then what matters is whether the contents define any objects or functions.
Most declarations in header files are (as header files are commonly used) just declarations, not definitions. They just tell the compiler about things; they do not actually cause objects or code to be created. For the most part, a compiler will not generate any data or code from declarations that are not definitions.
If the headers define external objects or functions, the compiler must generate data (or space) or code for them, because these objects or functions could be referred to from other source files to be compiled later and then linked with the object produced from the current compilation. (Some linkers can determine that external objects or functions are not used and discard them.)
If the headers define static objects or functions (to be precise, objects with internal or no linkage), then a compiler may generate data or code for these. However, the optimizer should see that these objects and functions are not referenced, and therefore generation may be suppressed. This is a simple optimization, because it does not require any complicated code or data analysis, simply an observation that nothing depends on the objects or functions.
So, the C standard does not guarantee that no data or code is generated for static objects or functions, but even moderate quality C implementations should avoid it, unless optimization is disabled.
Depends on the actual compiler. Optimizing compilers will not generate output for code that is not required, whereas dumber compilers will.
gcc (a very common C compiler for open-source platforms) will optimize your code with the -O option, which avoids generating code for unneeded expressions.
Code inside an #ifdef block whose macro is not defined is removed during preprocessing, so it never reaches the compiler proper and cannot generate any output.
Conceptually, at least, include/macro processing is a separate step from compilation. The main source file is read and a new temporary file is constructed containing all the included code. If anything is "#ifdefed out" then that code is not included in the temporary file. At the same time, the occurrences of macro names are replaced with the text they "expand" into. It is that resulting file, with all the includes included, etc, that is fed into the actual compiler.
Some compilers do this literally (and you can even "capture" the intermediate file) while others sort of simulate it (and actually require an entire separate step if you request that the intermediate file be produced). But most compilers have one means or another of producing the file for your examination.
The C/C++ standards lay out some rather arcane rules that must be followed to ensure that any "simulated" implementation doesn't somehow change the behavior of the resulting code versus the "literal" approach.
What is the effect of including a header file none of whose declared functions are used in the source file? Does it affect stack size, etc.?
It will have no effect, but it will increase compilation time and make the code harder to understand and maintain. You should only include the headers you actually need, and remove those which become redundant.
That depends on whether there are definitions in the header file or just declarations.
It also depends entirely on the implementation since the ISO C standard has nothing to say about how things are done at that level. It only covers how things appear to behave at a "C virtual machine" level. But I'll cover the most likely scenario here.
Definitions such as int xyzzy; (or, worse, char big_honkin_thing[9999999];) may take up space in the object file and, unless you have a particularly clever linker, the executable as well. I say "may" since this is dependent on the implementation.
Initialising the value is more likely to ensure it's stored in the object rather than created at runtime. But you're likely to find an effect regardless of that, either larger object/executable files if it's created at compile time or (mildly) slower startup times as more memory has to be zero-initialised.
For example, adding char big[99999] = {'x'}; to a header file results in the size of the executable going from 18K to 118K.
Simple declarations like typedef and extern will not allocate space in the object file in and of themselves.
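A rough sketch of the contrast being drawn, reusing the identifiers from this answer (everything else hypothetical):

/* declarations: no space allocated in the object file by themselves */
typedef unsigned long counter_t;
extern char big_honkin_thing[9999999];

/* definitions: these do take space in the object file and/or executable */
int xyzzy;                      /* zero-initialised, so it may only reserve space (e.g. in .bss) */
char big[99999] = {'x'};        /* explicitly initialised, so the data is stored in the object and grows the executable */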
In addition, even without definitions, the compile time will be increased since the compiler has to process that header file. But that will not have any impact on runtime (either speed or storage) itself.
The stack size is determined by your linker. Presumably you actually mean whether or not the emitted code is larger.
Including a header file whose declarations are never referenced in that translation unit will not affect the size of the generated objects. Of course, it will slow down compilation.
It will increase compile times, but AFAIK there shouldn't be any other changes.
Compiling the source file directly, compared to compiling it with the header file included, will take less time when there is nothing in the header that affects the source file.