Should header files have #includes?
I'm generally of the opinion that this kind of hierarchical include is bad. Say you have this:
foo.h:
#include <stdio.h> // we use something from this library here
struct foo { ... } foo;
main.c:
#include "foo.h"
/* use foo for something */
printf(...)
The day main.c's implementation changes and you no longer use foo.h, removing its #include will break the compilation (printf loses its declaration), and you must add <stdio.h> by hand.
Versus having this:
foo.h:
// Warning! we depend on stdio.h
struct foo {...
main.c:
#include <stdio.h> //required for foo.h, also for other stuff
#include "foo.h"
And when you stop using foo.h, removing its #include breaks nothing, but removing <stdio.h> will break anything that still depends on foo.h.
Should #includes be banned from .h files?
You've outlined the two main philosophies on this subject.
My own opinion (and I think that's all one can really have on this) is that headers should be as self-contained as possible. I don't want to have to know all the dependencies of foo.h just to be able to use that header. I also despise having to include headers in a particular order.
However, the developer of foo.h should also take responsibility for making it as dependency-free as possible. For example, the foo.h header should be written to be free of a dependency on stdio.h if that's at all possible (using forward declarations can help with that).
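For instance, here is a minimal sketch of a foo.h that stays free of <stdio.h> by not exposing FILE in its interface (foo_save and its filename parameter are illustrative, not from the question; only foo.c would then need <stdio.h>):
/* foo.h - the interface takes a filename rather than a FILE*,
   so this header has no <stdio.h> dependency */
#ifndef FOO_H
#define FOO_H

struct foo { int id; };

int foo_save(const struct foo *f, const char *filename);

#endif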
Note that the C standard forbids a standard header from including another standard header, but the C++ standard doesn't. So you can see the problem you describe when moving from one C++ compiler version to another. For example, in MSVC, including <vector> used to bring in <iterator>, but that no longer occurs in MSVC 2010, so code that compiled before might not any more because you may need to specifically include <iterator>.
However, even though the C standard might seem to advocate the second philosophy, note that it also mandates that no header depend on another and that you can include headers in any order. So you get the best of both worlds, but at a cost of complexity to the implementers of the C library. They have to jump through some hoops to do this (particularly to support definitions that can be brought in through any of several headers, like NULL or size_t). I guess that the people who drafted the C++ standard decided adding that complexity for implementers was no longer reasonable (I don't know to what degree C++ library implementers take advantage of the 'loophole' - it looks like MS might be tightening this up, even if it's not technically required).
My general recommendations are:
A file should #include what it needs.
It should not expect something else to #include something it needs.
It should not #include something it doesn't need because something else might want it.
The real test is this: you should be able to compile a source file consisting of any single #include and get no errors or warnings beyond "There is no main()". If you pass this test, then you can expect anything else to be able to #include your file with no problems. I've written a short script called "hcheck" which I use to test this:
#!/usr/bin/env bash
# hcheck: Check header file syntax (works on source files, too...)
if [ $# -eq 0 ]; then
    echo "Usage: $0 <filename>..."
    exit 1
fi
for f in "$@"; do
    case $f in
        *.c | *.cpp | *.cc | *.h | *.hh | *.hpp )
            # Wrap the file in a one-line translation unit and try to build it;
            # make's implicit rules compile hcheck.cc into hcheck.o.
            echo "#include \"$f\"" > hcheck.cc
            printf "\n\033[4mChecking %s\033[0m\n" "$f"
            make -s hcheck.o
            rm -f hcheck.o hcheck.cc
            ;;
    esac
done
I'm sure there are several things that this script could do better, but it should be a good starting point.
If this is too much, and if your header files almost always have corresponding source files, then another technique is to require that the associated header be the first #include in the source file. For example:
Foo.h:
#ifndef Foo_h
#define Foo_h
/* #includes that Foo.h needs go here. */
/* Other header declarations here */
#endif
Foo.c:
#include "Foo.h"
/* other #includes that Foo.c needs go here. */
/* source code here */
This also shows the "include guards" in Foo.h that others mentioned.
By putting #include "Foo.h" first, Foo.h must #include its dependencies, otherwise you'll get a compile error.
Well, main.c shouldn't rely on "foo.h" to provide stdio in the first place. There's no harm in including something twice.
Also, perhaps foo.h doesn't really need stdio. What's more likely is that foo.c (the implementation) needs stdio.
Long story short, I think everyone should just include whatever they need and rely on include guards.
Once you get into projects with hundreds or thousands of header files, this becomes untenable. Say I have a header file called "MyCoolFunction.h" that contains the prototype for MyCoolFunction(), and that function takes pointers to structs as parameters. I should be able to assume that including MyCoolFunction.h will include everything that's necessary and allow me to use that function without looking in the .h file to see what else I need to include.
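For example, a self-contained MyCoolFunction.h might look like this (MyStructs.h and struct my_struct are hypothetical stand-ins for whatever the parameters actually need):
/* MyCoolFunction.h - pulls in its own dependencies so callers don't have to */
#ifndef MYCOOLFUNCTION_H
#define MYCOOLFUNCTION_H

#include "MyStructs.h"   /* hypothetical header defining struct my_struct */

void MyCoolFunction(struct my_struct *s);

#endif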
If the header file needs a specific header, add it to the header file:
#ifndef HEADER_GUARD_YOUR_STYLE
#define HEADER_GUARD_YOUR_STYLE
#include <stdio.h> /* FILE */
int foo(FILE *);
#endif /* HEADER GUARD */
If the code file doesn't need a header, don't add it:
/* #include <stdio.h> */ /* removed because unneeded */
#include <stddef.h> /* NULL */
#include "header.h"
int main(void) {
foo(NULL);
return 0;
}
Why don't you #include stuff in the *.c file corresponding to the header?
Related
I have a header file foo.h that declares a function prototype
void foo(FILE *f);
/* ... Other things that don't depend on FILE ... */
among other things.
Now obviously, to use this header, I need to do the following:
#include <stdio.h>
#include "foo.h"
I would like to surround this particular prototype with something like the following:
#ifdef _STDIO_H
void foo(FILE *f);
#endif
/* ... Other things that don't depend on FILE ... */
so that I can #include "foo.h" without worrying about #include <stdio.h> in cases where I don't need that particular function.
Is the #ifdef _STDIO_H the way to go if I want my code to be portable and standards compliant?
I could find no mention of _STDIO_H in the standards document, but I see it is used in a variety of C libraries. Should I rather use something that I know to be defined in stdio.h, like EOF?
A related question: What do you do for other standard C headers, like stdlib.h?
<stdio.h> and <stdlib.h> are part of the C99 (and C11) standards, so every (hosted) standard-conforming C implementation has them.
On most practical implementations, they are header files with some include guards.
A standard-conforming implementation might process #include <stdio.h> very specifically, e.g. by using some database, but I know of no such implementation.
So simply add
#include <stdio.h>
near the top of your header file, something like
// file foo.h
#ifndef FOO_INCLUDED
#define FOO_INCLUDED
#include <stdio.h>
// other includes ...
// ...
// other stuff
#endif /* FOO_INCLUDED */
Alternatively, you could simply not care and document that #include "foo.h" requires a previous #include <stdio.h>; any sensible developer using a good-enough C implementation will be able to take care of that.
Actually, I was wrong in my comment on Alter Mann's deleted answer. It looks like stdin is required to be a macro, so you might use #ifdef stdin ... #endif, as Alter Mann correctly answered. I believe it is not very readable, though, and you just want to have <stdio.h> included, either by including it yourself in foo.h or by requiring it in your documentation.
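For the record, a minimal sketch of that #ifdef stdin approach (assuming the rest of foo.h is as in the question):
/* foo.h */
#ifdef stdin             /* the standard requires stdin to be a macro,
                            so this tests whether <stdio.h> was included */
void foo(FILE *f);
#endif
/* ... other things that don't depend on FILE ... */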
Contrary to C++ standard headers, C standard headers are in practice quite quick to compile, so I don't think it is worth optimizing for the unusual case where <stdio.h> has not been included.
Open your stdio.h file (for your compiler) and see whether it has _STDIO_H or a similar definition.
If I have a C project where my main program needs file1 and file2, but file2 also needs file1, is there a way I can get around including file1 in both main and file2? If I have an include guard, will this prevent file1's code from being added twice?
//file1.h
#ifndef FILE1_H
#define FILE1_H
void func1(void);
#endif
--
//file1.c
#include "file1.h"
void func1(void) {
    /* ...do something */
}
--
//file2.h
#ifndef FILE2_H
#define FILE2_H
void func2(void);
#endif
--
//file2.c
#include "file2.h"
#include "file1.h"
void func2(void) {
    /* ...do something */
func1();
}
--
//main.c
#include "file1.h"
#include "file2.h"
int main(void) {
func1();
func2();
return 0;
}
--
Since file2 includes file1, can I do this? Will it prevent repetition of file1's code?
//main.c (alternate)
#include "file2.h"
int main(void) {
func1();
func2();
return 0;
}
I'm not too concerned about problems arising if file2 decides to no longer include file1 in the future. I'm much more concerned with wasted space.
What I'd like to know is: (A) does the include guard prevent the code duplication, so that no additional space is used by including file1 in both main.c and file2.c? And (B) if extra space is being used, will my alternate main.c work?
Quick explanation (with the note that all of this can be overridden by people who know what they are doing):
First of all, two definitions: a declaration is when you write down that something exists, for example "int foo();" or "struct bar;". Note that we can't necessarily use the thing yet; we've just given it a name. As long as you declare them as the same thing, you can declare things as many times as you want! (Variable declarations have their own rules.)
Anything you want to use needs to be declared before you reference it.
A definition is when you say what the declared thing actually is: int foo() { return 0; } or struct bar { int x; }. Things can be (and often are) defined when they are declared, but not always.
In C, you must follow the one definition rule. Things can be declared as often as you like, but they can only be defined once per translation unit (defined in a second). (In addition, a function can only be defined once per entire executable.)
There are very few things that need to be defined before you use them. Other than variables, you mainly need to define a struct before you use it in a context where you need its size or access to its members.
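A quick illustration of that last point:
struct bar;            /* declaration: bar is an incomplete type here */
struct bar *p;         /* OK: a pointer doesn't need the size         */
/* struct bar b0; */   /* error here: the size is still unknown       */

struct bar { int x; }; /* definition: the type is now complete        */
struct bar b;          /* OK: size and members are known from here on */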
What is a translation unit? It is all the files used to compile a single source file. Your header files aren't targeted for compilation; only your .c files (called "source files") are. For each .c file, we have the idea of a "translation unit": the .c file itself plus everything it includes, directly or indirectly. The ultimate output of compiling it is a .o file, which contains all the symbols for the code defined in that translation unit. Note: not everything declared in the translation unit needs to be defined in it to get a valid .o file.
So what is in a header file? Well (in general) you have a few things:
function declarations
global definitions & declarations
struct definitions & declarations
Basically, you have the bare bones declarations and definitions that need to be shared between the translation units. #include allows you to keep this in one shared file, rather than copying and pasting this code all over.
Your definitions can only happen once, so an include guard prevents that from being a problem. But if you only have declarations, you don't technically need an include guard. (You should still use them anyway; they can limit the cross-includes you do, as well as guarantee against infinitely recursive inclusion.) However, you do need to include all declarations relevant to each translation unit, so a header will most likely be included multiple times across a project. THIS IS OK. At least the declaration is in one file.
When you compile a .o file, the compiler checks that you followed the one definition rule and that all your syntax is correct. This is why you'll get these types of errors in the "creating .o" steps of compilation.
So in your example, after we compile, we get file1.o (containing the definition of func1), file2.o (containing the definition of func2), and main.o (containing the definition of main). The next step is to link all these files together, using the linker. The linker takes all these .o files and makes sure that there is only one definition for each function symbol. This is where the magic of letting main.o know what is in file1.o and file2.o happens: it resolves the "unresolved symbols" and detects when there are conflicting symbols.
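As a sketch, the two stages look something like this (cc stands in for whatever compiler driver you use):
cc -c file1.c   # compile: produces file1.o (defines func1)
cc -c file2.c   # compile: produces file2.o (defines func2, references func1)
cc -c main.c    # compile: produces main.o (references func1 and func2)
cc file1.o file2.o main.o -o program   # link: resolves the references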
Final Thought:
Keeping code short is kind of a misguided goal. You want your code to be maintainable and readable, and making the code as short as possible is about the opposite of that. I could write a whole program on one line with only single-letter alphanumeric variable names, but no one would ever know what it did. What you want to avoid is code duplication in things like declarations. Maintaining a long list of #includes can become tricky, so it is often good to group related functions together (a good rule of thumb: if I almost always use A and B together, they should probably be in the same header file).
Another thing I occasionally do (occasionally because it has some serious drawbacks) is to use a convenience header file:
//convenience.h
#ifndef CONVENIENCE_H
#define CONVENIENCE_H
#include "file1.h"
#include "file2.h"
#endif
The convenience header file only has other header files in it, which ensures that it NEVER contains code, which makes it a little easier to maintain, but it is still kind of a mess. Also note that if you have include guards in file1 and file2, the convenience guard isn't necessary, though it can (theoretically) speed up compilation.
Why can't you have a single header where you put both your functions func1() and func2()? Just include that header in the different files. I didn't get what you mean by code duplication.
//file1.h
extern void func1();
extern void func2();

//file1.c
#include "file1.h"
void func1()
{
    /* ... */
}

//file2.c
#include "file1.h"
void func2()
{
    /* ... */
}

//main.c
#include "file1.h"
int main(void)
{
    func1();
    func2();
    return 0;
}
I am still new to C. I have a question regarding source and header files. I have a header file like this:
#ifndef MISC_H_
#define MISC_H_
#define BYTE 8
#include <stdbool.h>
#include <stdio.h>
#include "DataStruct.h"
bool S_areEqual(char *firstString, char *secondString); /* (1) */
bool S_randomDataStructureCheck(char *string, DataStruct *data); /* (2) */
#endif
bool is used in the function parameters, and thus, I have it in the source as well. Do I have to #include <stdbool.h> in both the header file and the source file? Are there circumstances where I would and where I wouldn't?
What if I had a typedef in another header file that was used in the header as a function parameter? Do I have to #include "DataStruct.h" in both the header file and the source file?
What is the standard?
No, you don't have to include it in both the header and the source (.c) file. If it is included in a header that the source includes, then it is available to the source as well. The order of header inclusion can be important when some headers depend on others. See this answer for more detail.
As an aside, you will notice the lines
#ifndef MISC_H_
#define MISC_H_
That ensures that the header is only included once.
Update
From the comments:
so in the source, you just include its respective header?
If you mean, should a source file only include its respective header, then it depends. Generally, files should include the files that they need.
If a source file needs a header, but that header is not needed by its own header file, then the include should go in the source rather than its header. One reason is that it is just conceptually cleaner that each file includes only the files that it needs, so it is easy to tell what the dependencies are. The other reason is that it reduces the impact of change.
Let's look at an example. Say you have foo.c and foo.h, and foo.c needs foodep.h to compile, but foo.h doesn't:
Option 1: foo.h
#include "foodep.h"
Now imagine that there are a number of other files foo1.h, foo2.h, foo3.h, etc that include foo.h. Then, any change to foodep.h affects all of those other files and their dependent header and source files.
Option 2: foo.c
#include "foodep.h"
Now, no other files have visibility of foodep.h. A change to foodep.h only impacts foo.c.
Generally, try to apply the same practices as you do with Object Oriented programming - encapsulate and minimize the scope of change.
The simple way to view this is that you should always include the headers which provide the functions/aliases/macros you are using in your program; whether they actually need to be included again should be left to the compiler.
This is because every header is wrapped in an #ifndef ... #endif clause conditioned on some header-specific macro. (It is necessary to do this if you define your own header, to avoid multiple inclusion and thus avoid painful compiler errors.)
Thus, my advice: if you are using bool in your program, include stdbool.h. If some other header has already pulled it in, the include guard ensures it will not be included again.
Let's assume I define BAR in foo.h. But foo.h might not exist. How do I include it, without the compiler complaining at me?
#include "foo.h"
#ifndef BAR
#define BAR 1
#endif
int main()
{
return BAR;
}
Therefore, if BAR was defined as 2 in foo.h, then the program would return 2 if foo.h exists and 1 if foo.h does not exist.
In general, you'll need to do something external to do this - e.g. by playing around with the search path (as suggested in the comments) and providing an empty foo.h as a fallback, or by wrapping the #include inside an #ifdef HAS_FOO_H...#endif and setting HAS_FOO_H by a compiler switch (-DHAS_FOO_H for gcc/clang etc.).
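A minimal sketch of that compiler-switch approach (HAS_FOO_H is whatever macro you choose to define on the command line, e.g. cc -DHAS_FOO_H main.c):
/* main.c */
#ifdef HAS_FOO_H      /* defined by the build when foo.h exists */
#include "foo.h"
#endif

#ifndef BAR
#define BAR 1         /* fallback when foo.h didn't provide BAR */
#endif

int main(void)
{
    return BAR;
}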
If you know that you are using a particular compiler, and portability is not an issue, note that some compilers do support including a file which may or may not exist, as an extension. For example, see clang's __has_include feature.
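With that extension, a guarded include might look like this (checking first that __has_include itself is available):
#if defined(__has_include)
#  if __has_include("foo.h")
#    include "foo.h"
#  endif
#endif

#ifndef BAR
#define BAR 1         /* fallback when foo.h was not found */
#endif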
Use a tool like GNU Autoconf; that's what it's designed for. (On Windows, you may prefer to use CMake.)
So in your configure.ac, you'd have a line like:
AC_CHECK_HEADERS([foo.h])
Which, after running configure, would define HAVE_FOO_H, which you can test like this:
#ifdef HAVE_FOO_H
#include "foo.h"
#else
#define BAR 1
#endif
If you intend to go down the autotools route (that is autoconf and automake, because they work well together), I suggest you start with this excellent tutorial.
If I have several header files, let's say 1.h, 2.h, and 3.h.
Let's say all three of the header files have #include <stdlib.h> in them, and one of them also includes another of the header files.
When I have to use all three header files in a C file main.c,
the preprocessor will encounter three copies of #include <stdlib.h>.
How does the compiler handle this kind of conflict?
Is this an error or does this create any overhead?
If there are no header guards, what will happen?
Most C headers are wrapped as follows:
#ifndef FOO_H
#define FOO_H
/* Header contents here */
#endif
The first time the preprocessor scans this, it will include the contents of the header because FOO_H is undefined; however, it also defines FOO_H, preventing the header contents from being added a second time.
There is a small performance impact of having a header included multiple times: the preprocessor has to go to disk and read the header each time. This can be mitigated by adding external guards around the #include in your C file:
#ifndef FOO_H
#include <foo.h>
#endif
This stuff is discussed in great detail in Large-Scale C++ Software Design (an excellent book).
This is usually solved with preprocessor statements:
#ifndef __STDLIB_H
#include <stdlib.h>
#define __STDLIB_H
#endif
I have never seen this done for common header files like stdlib.h, though, so it may only be necessary for your own header files.
The preprocessor will include all three copies, but header guards will prevent all but the first copy from being parsed.
Header guards will tell the preprocessor to convert subsequent copies of that header file to effectively nothing.
Response to edit:
Standard library headers will have the header guards. It would be very unusual and incorrect for them to not have the guards.
Similarly, it is your responsibility to use header guards on your own headers.
If header guards are missing, hypothetically, you will get a variety of errors relating to duplicate definitions.
Another point: You can redeclare a function (or extern variable) a bazillion times and the compiler will accept it:
int printf(const char*, ...);
int printf(const char*, ...);
is perfectly legal and has a small compilation overhead but no runtime overhead.
That's what happens when an unguarded include file is included more than once.
Note that it is not true for everything in an include file. You can't redeclare an enum, for example.
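For instance, repeating an enum definition is an error in C; this is a sketch of what an unguarded double inclusion could produce:
enum color { RED, GREEN };
enum color { RED, GREEN };  /* error: redefinition of 'enum color'
                               (and of the constants RED and GREEN) */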
This is done by one of two popular techniques, both of which are the standard library's responsibility.
One is defining a unique constant and checking for it, to #ifdef out all the contents of the file if it is already defined.
Another is the Microsoft-specific #pragma once, which has the advantage of not even having to read the file from the hard drive if it was already included (the compiler remembers the exact path).
You must also do the same in all header files you produce; otherwise, headers that include yours will have problems.
As far as I know, a regular include simply throws in the contents of the other file. The standard library's stdlib.h surely utilizes include guards: http://en.wikipedia.org/wiki/Include_guard, so you end up including only one copy. However, you can break it (do try it!) if you do: #include A, #undef A_GUARD, #include A again.
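A minimal sketch of that trick, assuming a.h guards itself with a macro named A_GUARD:
#include "a.h"   /* first inclusion: defines A_GUARD, pastes the contents */
#undef A_GUARD   /* defeat the guard...                                   */
#include "a.h"   /* ...so the contents are pasted in a second time        */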
Now ... why do you include a .h inside another .h? This can be ok, at least in C++, but it is best avoided. You can use forward declarations for that: http://en.wikipedia.org/wiki/Forward_declaration
Using those works as long as your code does not need to know the size of an imported structure right in the header. You might want to turn some by-value function arguments into by-reference / pointer arguments to solve this issue.
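A sketch of the pointer version (widget_attach and struct gadget are made-up names):
/* widget.h - a forward declaration is enough, because the function only
   takes a pointer; taking struct gadget by value would force this header
   to include the header that defines the struct */
struct gadget;                        /* forward declaration */
void widget_attach(struct gadget *g);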
Also, always utilize the include guards or #pragma once for your own header files!
As others have said, for standard library headers, the system must ensure that the effect of a header being included more than once is the same as the header being included once (they must be idempotent). An exception to that rule is assert.h, the effect of which can change depending upon whether NDEBUG is defined or not. To quote the C standard:
Standard headers may be included in any order; each may be included more than once in
a given scope, with no effect different from being included only once, except that the
effect of including <assert.h> depends on the definition of NDEBUG.
How this is done depends upon the compiler/library. A compiler system may know the names of all the standard headers, and thus not process them a second time (except assert.h as mentioned above). Or, a standard header may include compiler-specific magic (mostly #pragma statements), or "include guards".
But the effect of including any other header more than once need not be same, and then it is up to the header-writer to make sure there is no conflict.
For example, given a header:
int a;
including it twice will result in two definitions of a. This is a Bad Thing.
The easiest way to avoid conflict like this is to use include guards as defined above:
#ifndef H_HEADER_NAME_
#define H_HEADER_NAME_
/* header contents */
#endif
This works with all compilers and doesn't rely on compiler-specific #pragmas. (Even with the above, it is a bad idea to define variables in a header file.)
Of course, in your code, you should ensure that the macro name for include guard satisfies this:
It doesn't start with E followed by a digit or an uppercase character,
It doesn't start with PRI followed by a lowercase character or X,
It doesn't start with LC_ followed by an uppercase character,
It doesn't start with SIG/SIG_ followed by an uppercase character,
...and so on. (That is why I prefer the form H_NAME_.)
As a perverse example, if you want your users guessing about certain buffer sizes, you can have a header like this (warning: don't do this, it's supposed to be a joke).
#ifndef SZ
#define SZ 1024
#else
#if SZ == 1024
#undef SZ
#define SZ 128
#else
#error "You can include me no more than two times!"
#endif
#endif