Correctly Using Header Files? - c

Lately I have been using header files to split up my program into separate files (C files containing functions, and header files declaring them). Everything works fine, but for some reason I need to include <stdio.h> and <stdlib.h> in EVERY C file, or my project fails to compile. Is this expected behavior?

Each C module needs to see either the definition of something it uses, or a declaration telling it where the definition can be found. If that declaration lives in a header file, you should include that header in every module that uses it.
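As a minimal sketch (file and function names are invented for illustration): each .c file includes the headers that declare what it itself uses, so a file that calls printf() needs <stdio.h> no matter what else it includes.
/* add.h */
#ifndef ADD_H
#define ADD_H
int add(int a, int b);        /* declaration only */
#endif

/* add.c */
#include "add.h"
int add(int a, int b) { return a + b; }

/* main.c */
#include <stdio.h>            /* needed here because main.c calls printf() */
#include "add.h"              /* needed here because main.c calls add() */
int main(void)
{
    printf("%d\n", add(2, 3));
    return 0;
}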

The answer depends on whether those functions depend on functions declared in other .c/.h files.
For example:
filea.c:
#include "filea.h"

void methodA(void)
{
    methodB();   /* declared in fileb.h -- but fileb.h is not included here */
}
fileb.c:
#include <somelibrary.h>
#include "fileb.h"
void methodB(void)
{
somelibrarycode();
}
This will not compile cleanly unless filea.c also includes fileb.h: filea.c calls methodB() without any declaration of it in scope, so it has an external dependency that is not resolved.
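A minimal fix, assuming fileb.h simply carries the prototype for methodB():
/* fileb.h */
#ifndef FILEB_H
#define FILEB_H
void methodB(void);
#endif

/* filea.c -- now compiles: the declaration of methodB is in scope */
#include "filea.h"
#include "fileb.h"

void methodA(void)
{
    methodB();
}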
If this is not what you're describing, then there is some other spaghettification happening, or you have accidentally declared functions static, preventing them from being seen outside their own .c file.
One possible solution is a single shared.h containing all the other includes, but I personally don't recommend it: it merely masks the issue instead of making it readily apparent which files depend on what, and it blurs the lines of dependency.

They must be included one way or another.
Some projects require a long list of includes in every .c file, possibly in a mandated sort order, even under the assumption that no header includes any other header.
Some allow you to assume that certain headers are pulled in by certain other headers.
Some use collection headers (which include a list of small headers) and replace the long lists with those.
Some go even further, using the compiler's "forced header" option, so the include does not appear anywhere in the source and its content is implicitly assumed. This can be applied per project or across the whole codebase, or combined. It plays pretty well with precompiled headers.
(And there are many more strategies, you get the picture, all with their own pros and cons.)
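As a minimal sketch of the collection-header approach (the header name is invented for illustration):
/* common.h -- collection header: one include pulls in the usual suspects */
#ifndef COMMON_H
#define COMMON_H

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#endif

/* every .c file in the project then starts with just: */
#include "common.h"
The "forced header" variant drops even that one line: with GCC or Clang, for example, gcc -include common.h -c file.c is processed as if #include "common.h" were the first line of file.c.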

Related

Which file should include libraries in my C project?

I'm writing a Pong game in C using ncurses. I placed function definitions for the ball, the player, and the AI opponent into ball.c, player.c, and ai.c respectively. Each of these files includes another file, pong.h, which contains function prototypes, structure definitions, and global variables. My main function is in pong.c, which provides the game loop and handles keypresses.
My project also includes a number of libraries: ncurses.h, stdlib.h, and time.h. Where should I include these libraries? Currently, they are included in pong.h, like so:
#ifndef _PONG_H
#define _PONG_H
#include <ncurses.h>
#include <stdlib.h>
#include <time.h>
/* everything else */
#endif
However, only certain files make use of functions in stdlib.h/time.h. This leads me to believe that it might make more sense to only include one-use libraries in the file where they are used. On the other hand, including all libraries in one place is more straightforward.
I'm wondering if there is a way to do this which is considered more conventional or efficient.
There is no firm rule; instead, you should balance convenience and hygiene. You're already aware that the meta include is more convenient for your other .c files, but I'll emphasize some more obscure concerns with header files:
Putting dependencies (e.g. ncurses.h) in your public header files may make it more difficult to include your header file in other projects
The transitive cost of header files tends to dominate compile time, so reducing unnecessary includes helps your project compile more quickly. Tools such as include-what-you-use have been developed to manage includes.
Header files can destructively interfere with each other, for instance because of macros that change the semantics of subsequently included header files. windows.h is probably the most notorious culprit, and the risks can be difficult to quantify for large header files or large sets of header files.
Over time it can become obvious what header files should actually be bundled, e.g. when the convenience benefit is high and the risks and costs are low. On a small project perhaps it is obvious from the outset.
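As a contrived sketch of that kind of interference (both headers and all names are invented for illustration):
/* util.h -- defines a lowercase function-like macro */
#define min(a, b) ((a) < (b) ? (a) : (b))

/* geometry.h -- declares a function that happens to share the name */
double min(double a, double b);

/* Any .c file that includes util.h before geometry.h will fail to compile:
   the preprocessor expands the declaration of min() above into a conditional
   expression, so the mere order of #includes breaks the build. */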
Preferably, you should include headers in the files that actually use them, even if it is a little redundant. That way, if you later remove one of your own headers from a file, the file doesn't suddenly fail to compile because it had been relying on that header to pull in, say, stdio.h, without including it for itself.
It also makes it clearer, at a glance at the first few lines, what the file is using.

Multiple Header Files and Function Prototypes in C

Assuming that I work on a big project in C with multiple .c files, is there any reason why I should prefer to have multiple header files instead of a single header file?
And another question:
Let's say that I have 3 files: header.h, main.c and other.c.
I have a function named func() that is defined and used only in the file other.c. Should I place the function prototype in the header file or in the file other.c ?
Multiple headers vs a single header.
A primary reason for using multiple headers is that some of the code may be usable independently of the rest, and that code should probably have its own header. In the extreme, each source file (or small group of source files) that provides a service should have its own header that defines the interface to the service.
Also note that what goes in the header is the information needed to use the module — function declarations and type declarations needed by the function declarations (you don't have global variables, do you?). The header should not include headers only needed by the implementation of the module. It should not define types only needed by the implementation of the module. It should not define functions that are not part of the formal interface of the module (functions used internally by the module).
All functions in a module that can be static should be static.
You might still have an omnibus header for your current project that includes all, or most, of the separate headers, but if you think of headers as defining the interfaces to modules, you will find that most consumer modules don't need to know about all possible provider modules.
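As a minimal sketch of a per-module interface header (module and function names invented):
/* stack.h -- the module's public interface: only what consumers need */
#ifndef STACK_H
#define STACK_H

#include <stddef.h>                 /* size_t appears in a prototype below */

void   stack_push(int value);
int    stack_pop(void);
size_t stack_depth(void);

#endif
stack.c would then include stack.h plus whatever implementation-only headers it needs (say <stdlib.h> for malloc), and keep its internal helper functions static so they never leak into the interface.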
The function func() is only used in other.c so the function should be made static so that it is only visible in other.c. It should not go in a header unless some other file uses the function — and at that point, it is crucial that it does go into a header.
You may find useful information in these other questions, and there are, no doubt, a lot of other questions that would help too:
What are extern variables in C?
Where to document functions in C?
Design principles — Best practices and design patterns for C
Should I use #include in headers?
If it's a BIG project, you almost certainly HAVE to have multiple header files to make anything sensible out of your project.
I have worked on projects that have several thousand source files and many hundred header files, totalling millions of lines. You couldn't put all those header files together into one file and do any meaningful work.
A header file should provide one "functionality". So, if you have a program dealing with customer accounts, stock, invoices, and such, you may have one "customer.h", a "stock.h" and an "invoice.h". You'll probably also have a "dateutils.h" for calculating things like "when does this invoice need to be paid by?" and "how long is it since the invoice was sent out?", so you can send out reminders.
In general, keeping header files SMALL is a good thing. If one header file needs something from another one, have it include that.
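A sketch of that last point (the file names come from the example above, the contents are invented): if invoice.h needs a type from dateutils.h in its own interface, it includes dateutils.h itself rather than relying on every consumer to do so.
/* dateutils.h */
#ifndef DATEUTILS_H
#define DATEUTILS_H

typedef struct { int year, month, day; } date_t;
int days_between(date_t a, date_t b);

#endif

/* invoice.h -- needs date_t in its own interface, so it includes dateutils.h */
#ifndef INVOICE_H
#define INVOICE_H

#include "dateutils.h"

typedef struct { int id; date_t issued; date_t due; } invoice_t;
int invoice_is_overdue(const invoice_t *inv, date_t today);

#endif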
Of course, if a function is not used outside a particular file, it should not go in a header file, and to avoid "leaking names", it should be static. E.g.:
static int func(int x)
{
    return x * 2;
}
If, for some reason, you need to forward declare func (because some function before func needs to call func), then declare it at the beginning of the source file. There is no need to "spread it around" by adding it to a header file.
By marking it static, you are making it clear that "nobody else, outside this file, uses this function". If at a later stage, you find that "Hmm, this func is really useful in module B as well", then add it to a suitable header file (or make a new header file), and remove the static. Now, anyone reading the source file knows that they will need to check outside of this source file to make sure that any changes to func are OK in the rest of the code.
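For the forward-declaration case mentioned above, a minimal sketch (names invented): the forward declaration lives at the top of the .c file, and both it and the definition stay static.
/* other.c */
static int func(int x);          /* file-local forward declaration */

int process(int y)
{
    return func(y) + 1;          /* func can be called before its definition */
}

static int func(int x)
{
    return x * 2;
}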
Commonly, there is a header file per module describing its interface for clean separation of concerns/readability/re-usability.
If the function in other.c is local, there is no need to include it in the header file.

Is it right to simply include all header files?

Remembering the names of system header files is a pain...
Is there a way to include all existing header files at once?
Why doesn't anyone do that?
Including unneeded header files is a very bad practice. The issue of slowing down compilation might or might not matter; the bigger issue is that it hides dependencies. The set of header files you include in a source file is the documentation of what functionality the module depends upon, and unlike external documentation or comments, it is automatically checked for completeness by the compiler (failing to include needed header files will result in an error). Ensuring the absence of unwanted dependencies not only improves portability; it also helps you track down unneeded and potentially dangerous interactions, for instance cases where a module which should be purely computational or purely data-structure management is accessing the filesystem.
These principles apply whether the headers are standard system headers or headers for modules within your own program or third-party libraries.
Your source code files are preprocessed before the compiler proper looks at them, and #include is one of the directives the preprocessor handles. During preprocessing, each #include directive is replaced with the entire contents of the included file. Including all of the system headers would therefore produce very large source files that the compiler then needs to work through, which costs a lot of time during compilation.
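You can see this expansion for yourself: most compilers can stop after preprocessing and print the result (e.g. cc -E). A minimal example:
/* main.c */
#include <stdio.h>      /* after preprocessing, the text of stdio.h (and of the
                           headers it includes) appears here verbatim */

int main(void)
{
    puts("hello");
    return 0;
}
Running cc -E main.c prints the fully expanded translation unit -- typically thousands of lines, all of which the compiler must parse.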
No one includes all the header files. There are too many, and a few of them are mutually exclusive with other files (like ncurses.h and curses.h).
It really is not that bad when writing a program, even from scratch. A few are quite easy to remember: stdio.h for any FILE stuff, ctype.h for any character classification, stdlib.h for any use of malloc(), etc.
If you don't remember one:
leave the #include out
compile
examine the first few error messages for indications of a missing header file, such as a type not being declared, or a function being called with assumed parameter types
figure out which function call is the cause
look at the man page (or whatever documentation your compiler has) for that function
notice the #include shown by the documentation and add it
repeat until all errors fixed
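As a sketch of that loop (assuming a program that calls strlen() without having included anything that declares it): the compiler complains about an unknown or implicitly declared function, the strlen man page's SYNOPSIS shows #include <string.h>, and adding that line fixes the error.
#include <stdio.h>
#include <string.h>     /* added after checking the man page for strlen() */

int main(void)
{
    printf("%zu\n", strlen("hello"));
    return 0;
}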
It is quite a bit easier when adding to an existing code base. You can go hundreds or thousands of working hours without ever having to add a #include.
No, it is a terrible idea: it will massively increase your compile times and possibly make your executable larger by pulling in masses of unused code.
I know what you're talking about, but I need to double-check the function prototypes for the functions I'm using (for ones I don't use daily, anyway) -- I'll just copy and paste the #includes straight out of the manpage for the associated functions. I'm already looking at the manpage (it's a simple K in vim(1)), so it doesn't feel like an extra burden.
You can create a "master" header, where you put all your includes into. Then in everything else include it! Beware of conflicting definitions and circular references... So.... Master1.h, master2.h, ...
Not advocating it. Just saying.

Where to put include statements, header or source?

Should I put the includes in the header file or the source file? If the header file contains the include statements, and I then include that header file in my source file, will my source file have all of the files that the header included? Or should I just include them in my source file only?
Only put includes in a header if the header itself needs them.
Examples:
Your function returns type size_t. Then #include <stddef.h> in the header file.
Your function uses strlen. Then #include <string.h> in the source file.
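Putting those two rules together in a single hypothetical module (all names invented): the header includes only what its own declarations need, and the source file includes what its function bodies need.
/* text_utils.h */
#ifndef TEXT_UTILS_H
#define TEXT_UTILS_H

#include <stddef.h>               /* the prototype below uses size_t */

size_t count_spaces(const char *s);

#endif

/* text_utils.c */
#include "text_utils.h"
#include <string.h>               /* only the implementation uses strlen */

size_t count_spaces(const char *s)
{
    size_t n = 0;
    for (size_t i = 0; i < strlen(s); i++)
        if (s[i] == ' ')
            n++;
    return n;
}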
There's been quite a bit of disagreement about this over the years. At one time, it was traditional that a header only declare what was in whatever module it was related to, so many headers had specific requirements that you #include a certain set of headers (in a specific order). Some extremely traditional C programmers still follow this model (religiously, in at least some cases).
More recently, there's a movement toward making most headers standalone. If that header requires something else, the header itself handles that, ensuring that whatever it needs is included (in the correct order, if there are ordering issues). Personally, I prefer this -- especially when the order of headers can be important, it solves the problem once, instead of requiring everybody who uses it to solve the problem yet again.
Note that most headers should only contain declarations. This means adding an unnecessary header shouldn't (normally) have any effect on your final executable. The worst that happens is that it slows compilation a bit.
Your #includes should be of header files, and each file (source or header) should #include the header files it needs. Header files should #include the minimum header files necessary, and source files should also, though it's not as important for source files.
The source file will have the headers it #includes, and the headers they #include, and so on up to the maximum nesting depth. This is why you don't want superfluous #includes in header files: they can cause a source file to include a lot of header files it may not need, slowing compilation.
This means that it's entirely possible that header files might be included twice, and that can be a problem. The traditional method is to put "include guards" in header files, such as this for file foo.h:
#ifndef INCLUDE_FOO_H
#define INCLUDE_FOO_H
/* everything in header goes here */
#endif
The approach I have evolved into over twenty years is this:
Consider a library.
There are multiple C files, one internal H file and one external H file. The C files include the internal H file. The internal H file includes the external H file.
You see that, from the compiler's point of view as it compiles a C file, there is a hierarchy:
external -> internal -> C code
This is the correct ordering, since that which is external is everything a third party needs to use the library. That which is internal is required to compile the C code.
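A sketch of that layout (library, file, and function names invented):
/* mylib.h -- external header: everything a third party needs */
#ifndef MYLIB_H
#define MYLIB_H

struct mylib_ctx;                          /* opaque to users */
struct mylib_ctx *mylib_open(const char *path);
void mylib_close(struct mylib_ctx *ctx);

#endif

/* mylib_internal.h -- internal header: shared by the library's own .c files */
#ifndef MYLIB_INTERNAL_H
#define MYLIB_INTERNAL_H

#include "mylib.h"                         /* internal builds on external */

struct mylib_ctx {                         /* real layout, hidden from users */
    int fd;
};

#endif

/* open.c, close.c, ... each start with: */
#include "mylib_internal.h"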
If header file A #includes header files B and C, then every source file that #includes A will also get B and C #included. The preprocessor literally just performs text substitution: anywhere it finds text that says #include <foo.h>, it replaces it with the text of the foo.h file.
There are different opinions on whether you should put #includes in headers or source files. Personally, I prefer to put all #includes in source file by default, but any header files that cannot compile without other pre-requisite headers should #include those headers themselves.
And every header file should contain an include guard to prevent it being included multiple times.
Write all of your files so that they can be built using only what they themselves include. If your header doesn't need an include, remove it. In a big project, if you don't maintain this discipline, you leave yourself open to breaking the entire build when someone removes an include from a header file: a consumer of that header may have been relying on that include even though the header itself never needed it.
In some environments, compilation will be fastest if one only includes the header files one needs. In other environments, compilation will be optimized if all source files can use the same primary collection of headers (some files may have additional headers beyond the common subset). Ideally, headers should be constructed so multiple #include operations will have no effect. It may be good to surround #include statements with checks for the file-to-be-included's include-guard, though that creates a dependency upon the format of that guard. Further, depending upon a system's file caching behavior, an unnecessary #include whose target ends up being completely #ifdef'ed away may not take long.
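For instance, the guard-check trick mentioned above looks like this (assuming foo.h uses FOO_H as its guard macro):
/* in a file that needs foo.h */
#ifndef FOO_H              /* skip even opening foo.h if it was already included */
#include "foo.h"
#endif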
Another thing to consider is that if a function takes a pointer to a struct, one can write the prototype as
void foo(struct BAR_s *bar);
without a definition for BAR_s having to be in scope. A very handy approach for avoiding unnecessary includes.
PS -- in many of my projects there will be a file which every module is expected to #include, containing things like typedefs for integer sizes and a few common structures and unions, e.g.:
typedef union {
    unsigned long  l;
    unsigned short lw[2];
    unsigned char  lb[4];
} U_QUAD;
(Yes, I know I'd be in trouble if I moved to a big-endian architecture, but since my compiler doesn't allow anonymous structs in unions, using named identifiers for the bytes within the union would require that they be accessed as theUnion.b.b1 etc., which seems rather annoying.)
You should only include files in your header that are needed for the constants, types, and function declarations in that header. Technically, these includes will also end up in your source file, but for clarity's sake you should include, in each file, only the files you actually need to use. You should also protect the header from multiple inclusion like so:
#ifndef NAME_OF_HEADER_H
#define NAME_OF_HEADER_H
...definition of header file...
#endif
This prevents the header from being included multiple times, which would otherwise result in compiler errors.
Your source file will get the include statements if you put them in the header. However, in some cases it would be better to put them in the source file.
Remember that if you include that header in any other sources, they will also get the includes from the header, and that is not always desirable. You should only include stuff where it is used.

C project structure - header-per-module vs. one big header

I've worked with a number of C projects during my programming career and the header file structures usually fall into one of these two patterns:
1. One header file containing all function prototypes
2. One .h file for each .c file, containing prototypes for the functions defined in that module only.
The advantages of option 2 are obvious to me - it makes it cheaper to share the module between multiple projects and makes dependencies between modules easier to see.
But what are the advantages of option 1? It must have some advantages otherwise it would not be so popular.
This question would apply to C++ as well as C, but I have never seen #1 in a C++ project.
Placement of #defines, structs etc. also varies but for this question I would like to focus on function prototypes.
I think the prime motivation for #1 is ... laziness. People think it's either too hard to manage the dependencies that splitting things into separate files can make more obvious, and/or think it's somehow "overkill" to have separate files for everything.
It can also, of course, often be a case of "historical reasons", where the program or project grew from something small, and no-one took the time to refactor the header files.
Option 1 allows for having all the definitions in one place so that you have to include/search just one file instead of having to include/search many files. This advantage is more obvious if your system is shipped as a library to a third party - they don't care much about your library structure, they just want to be able to use it.
Another reason for using a different .h for every .c is compile time. If there is just one .h (or if there are more of them but you are including them all in every .c file), every time you make a change in the .h file, you will have to recompile every .c file. This, in a large project, can represent a valuable amount of time being lost, which can also break your workflow.
#1 is just unnecessary. I can't see a good reason to do it, and plenty of reasons to avoid it.
Three rules for following #2 and having no problems:
start EVERY header file with a
#ifndef HEADER_Namefile_
#define HEADER_Namefile_
end the file with
#endif
That allows the same header file to be included multiple times in the same module (which may happen inadvertently) without causing any fuss.
you can't have definitions in your header files... that's something everybody thinks they know when it comes to function prototypes, but almost everyone ignores it for global variables.
If you want a global variable, which by definition should be visible outside its defining C module, use the extern keyword:
extern unsigned long G_BEER_COUNTER;
which instructs the compiler that the G_BEER_COUNTER symbol is actually an unsigned long (so it works like a declaration), and that some other module will have its proper definition/initialization. (This also allows the linker to keep its resolved/unresolved symbol table.) The actual definition (the same statement without extern) goes in the module's .c file.
only on proven absolute necessity do you include other headers within a header file. include statements should only be visible in .c files (the modules). That allows you to better interpret the dependencies, and to find/resolve issues.
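A sketch of that extern pattern (header and file names invented; the identifier comes from the example above):
/* counters.h */
#ifndef COUNTERS_H
#define COUNTERS_H

extern unsigned long G_BEER_COUNTER;   /* declaration: tells every includer the
                                          name and type, allocates nothing */

#endif

/* counters.c */
#include "counters.h"

unsigned long G_BEER_COUNTER = 0;      /* definition: exactly one, in one .c file */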
I would recommend a hybrid approach: making a separate header for each component of the program which could conceivably be used independently, then making a project header that includes all of them. That way, each source file only needs to include one header (no need to go updating all your source files if you refactor components), but you keep a logical organization to your declarations and make it easy to reuse your code.
There is also, I believe, a 3rd option: each .c has its own .h, but there is also one .h which includes all the other .h files. This brings the best of both worlds, at the expense of keeping that one .h up to date, though that could be done automatically.
With this option, internally you use the individual .h files, but a 3rd party can just include the all-encompassing .h file.
When you have a very large project with hundreds or thousands of small header files, dependency checking and compilation can slow down significantly, as lots of small files must be opened and read. This issue can often be solved by using precompiled headers.
In C++ you would definitely want one header file per class and use pre-compiled headers as mentioned above.
One header file for an entire project is unworkable unless the project is extremely small - like a school assignment
That depends on how much functionality is in one header/source file. If you need to include 10 files just to, say, sort something, it's bad.
For example, if I want to use STL vectors I just include <vector> and I don't care what internals are necessary for vector to be used. GCC's <vector> includes 8 other headers -- allocator, algobase, construct, uninitialized, vector and bvector. It would be painful to include all those 8 just to use vector, would you agree?
BUT library internal headers should be as sparse as possible. Compilers are happier if they don't include unnecessary stuff.
