I'm going rounds with the compiler right now, and I want to make sure the problem isn't a fundamental misunderstanding of how header files work. If I include a header file, and that header file has includes in it (like <stdbool.h> or <stdio.h>, etc.), there shouldn't be any issue in the dependent C file, right? The preprocessor should just insert the included text accordingly when I run my makefile, by my understanding. Am I mistaken?
To reiterate:
Say I have a main.c with prototype.h.
prototype.h has in it all of my usual includes for libraries and what-not.
I have a couple other C files (secondary.c and tertiary.c, for instance), both of which need the usual libraries as well, may or may not need some of the prototypes, and also have their own header files.
All I'd need to do is include prototype.h in each of the C files, correct?
Also, in that instance, if I were making .o files using the -c flag of gcc in my makefile, would I need to list the header as a dependency in the makefile target?
I thought I had a solid handle on this when I started, but now I'm thoroughly confused.
Those are questions with multiple answers.
In the simple case, you only include headers in a .c file if they are used there. Headers are dependencies for compilation, so for Make you list them as prerequisites alongside the .c files that use them.
For big projects, to speed up compilation, you may use what is called a 'precompiled header'. You include the same header file in every source file, and the compiler will only process that header once. That is very useful when headers are complicated (e.g. the Boost library).
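To answer the makefile part of the question directly, here is a minimal sketch (file names taken from the question, everything else hypothetical): each object file lists both its .c file and the headers it includes as prerequisites, so editing prototype.h forces every dependent .o to be rebuilt.

CC = gcc
CFLAGS = -Wall

app: main.o secondary.o tertiary.o
	$(CC) $(CFLAGS) -o app main.o secondary.o tertiary.o

# prototype.h appears as a prerequisite wherever it is #included,
# so a change to it triggers recompilation of all three objects
main.o: main.c prototype.h
	$(CC) $(CFLAGS) -c main.c
secondary.o: secondary.c prototype.h
	$(CC) $(CFLAGS) -c secondary.c
tertiary.o: tertiary.c prototype.h
	$(CC) $(CFLAGS) -c tertiary.c

(Remember that recipe lines in a real makefile must be indented with a tab character.)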
I am new to C and am just learning the basics of modularising my code for neatness and maintainability. I am reading a lot of people saying not to include .c files directly but instead to use .h files with associated .c files.
My question is, when writing a library which is exposed/included via its .h file - does the compiler dedupe common includes, or are they included each time they are referenced?
For instance in my above application, I am using printf in my main and also in my foo library.
When running:
gcc -o app main.c foo/foo.c && ./app
I get the expected outputs printed to the console, however does the compiler remove duplicates of the <stdio.h> include or is it included once for main.c and once again for foo.c?
No, the compiler does not remove them. Nor should it, because sometimes (although it's rare) headers are written with the purpose of being included several times with different effects each time. So the compiler can't just omit these subsequent inclusions.
That's why people put include guards in headers (#ifndef FOO_H_ in this case.)
Each file, regardless of whether it is a .h or .c file, should include what it needs. It should not rely on a header having already been included somewhere else. If something is included twice in the current compilation unit, the include guards will make sure headers are only processed once, regardless of how many files try to include them.
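For instance, a guarded header (hypothetical name, using the FOO_H_ guard mentioned above) can safely be pulled in many times within one compilation unit; only the first inclusion has any effect:

/* foo.h -- the include guard makes repeated inclusion harmless */
#ifndef FOO_H_
#define FOO_H_

void foo(void);

#endif /* FOO_H_ */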
As a side note, #pragma once, even though it's not in the C standard, is a de-facto standard compiler extension. So you can just do:
#pragma once
void foo();
It's one of those rare cases where a non-standard compiler extension is so widely supported that it's safe to use.
On the contrary, each compilation unit ("main.c" and "foo.c" in your case) needs that include. Otherwise the compiler would not know the prototype of printf() (see the note below). Each compilation unit (aka "module") is compiled on its own.
You might be mixing up headers and linkable files (object code files and libraries).
The contents of a header file replace the #include line during preprocessing. "stdio.h" contains only the prototype of printf(), among a lot of other stuff, not the implementation of the function.
When the compiler generates the object code for "main.c" and "foo.c", each of them includes an unresolved reference to printf().
Finally the linker will include the object code for printf(), but just once. This single instance of the function is called by both callers. This is where the deduplication you seem to be asking about actually happens.
You might wonder why you don't have to add the library to your command line. This is a convenience feature of most compiler drivers, as nearly all applications want the standard libraries. You might like to add "-v" to the command line to see what really happens. Other options can suppress this automation.
Note: Some compilers are quite smart and know a lot of standard functions. They will accept the source and produce a nice warning. But don't rely on this.
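As a concrete sketch (hypothetical files): both units repeat the <stdio.h> include to get printf()'s prototype, yet the linked program contains exactly one printf().

/* foo.h -- declaration only, guarded */
#ifndef FOO_H
#define FOO_H
void foo(void);
#endif

/* foo.c -- needs its own <stdio.h> for printf()'s prototype */
#include <stdio.h>
#include "foo.h"

void foo(void)
{
    printf("hello from foo\n");
}

/* main.c -- the same prototype again; required, not redundant */
#include <stdio.h>
#include "foo.h"

int main(void)
{
    printf("hello from main\n");
    foo();
    return 0;
}

Compile each unit separately with gcc -c main.c and gcc -c foo.c, then link with gcc main.o foo.o -o app; running nm on the two object files would show printf as an undefined symbol (U) in both, which the linker resolves to one function.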
Sometimes I see someone compile a C program like this:
gcc -o hello hello.c hello.h
As far as I know, we just need to include the header file in the C program like this:
#include "somefile"
and compile the C program: gcc -o hello hello.c.
When do we need to compile header files, and why?
Firstly, in general:
If these .h files are indeed typical C-style header files (as opposed to being something completely different that just happens to be named with .h extension), then no, there's no reason to "compile" these header files independently. Header files are intended to be included into implementation files, not fed to the compiler as independent translation units.
Since a typical header file usually contains only declarations that can be safely repeated in each translation unit, it is perfectly expected that "compiling" a header file will have no harmful consequences. But at the same time it will not achieve anything useful.
Basically, compiling hello.h as a standalone translation unit is equivalent to creating a degenerate dummy.c file consisting only of the #include "hello.h" directive, and feeding that dummy.c file to the compiler. It will compile, but it will serve no meaningful purpose.
Secondly, specifically for GCC:
Many compilers treat files differently depending on the file name extension. GCC has special treatment for files with a .h extension when they are supplied to the compiler as command-line arguments. Instead of treating such a file as a regular translation unit, GCC creates a precompiled header file for it.
You can read about it here: http://gcc.gnu.org/onlinedocs/gcc/Precompiled-Headers.html
So, this is the reason you might see .h files being fed directly to GCC.
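For instance (behavior as described in the GCC manual linked above; exact details may vary by GCC version):

gcc hello.h        # produces hello.h.gch, a precompiled header, not an object file
gcc -c hello.c     # if hello.h.gch is present and valid, GCC uses it automatically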
Okay, let's understand the difference between active and passive code.
The active code is the implementation of functions, procedures, and methods, i.e. the pieces of code that should be compiled to executable machine code. We store it in .c files, and of course we need to compile it.
The passive code is not executed itself, but it is needed to tell the different modules how to communicate with each other. Usually, .h files contain only prototypes (function headers) and structures.
An exception is macros, which formally can contain active pieces, but you should understand that they are handled at a very early stage of building (preprocessing) by simple substitution. By compile time, macros have already been substituted into your .c file.
Another exception is C++ templates, which have to be implemented in .h files. But the story here is similar to macros: they are substituted at an early stage (instantiation), and formally each instantiation is a distinct type.
In conclusion, I think that if the modules are formed properly, we should never need to compile the header files.
When we include a header file like this: #include <header.h> or #include "header.h", the preprocessor takes it as input and replaces the #include directive with the entire contents of the specified file.
You can check this by passing the -E flag to GCC, which generates the .i (preprocessed source) file, or you can invoke cpp (the standalone preprocessor on Linux) directly; the compiler driver runs it automatically when you execute GCC.
So the header is effectively compiled along with your source code; there is no need to compile it separately.
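You can see the result for yourself; for example:

gcc -E main.c -o main.i    # main.i is main.c with every #include textually expanded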
In some systems, attempts to speed up the build of fully resolved .c files call the pre-processing of include files "compiling header files". However, it is an optimization technique that is not necessary for actual C development.
Such a technique basically precomputed the include statements and kept a cache of the flattened result. Normally the C toolchain pastes the included files in recursively and then passes the entire item off to the compiler. With a pre-compiled header cache, the toolchain checks whether any of the inputs (defines, headers, etc.) have changed; if not, it hands the already-flattened text snippets to the compiler.
Such systems were intended to speed up development; however, many of them were quite brittle. As computers sped up and source code management techniques changed, fewer header pre-compilers are actually used in the common project.
Until you actually need compilation optimization, I highly recommend you avoid pre-compiling headers.
I think we do need to preprocess (though maybe not "compile") the header file, because, from my understanding, during the compile stage the header file is included in the .c file. For example, in test.h we have
typedef enum {
    a,
    b,
    c
} test_t;
and in test.c we have
void foo()
{
    test_t test;
    ...
}
During compilation, I think the compiler will put the code in the header file and the .c file together; the code in the header file is preprocessed and substituted into the .c file. Meanwhile, we'd better define the include path in the makefile.
You don't need to compile header files. It doesn't actually produce anything useful, so there's no point in doing it as part of the build. However, feeding a header to the compiler is a great way to check for typos, mistakes, and bugs, which makes things easier later.
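If that typo check is all you want, you can ask GCC to parse the header without generating anything; a sketch, assuming a header named hello.h:

gcc -fsyntax-only hello.h    # parses and reports errors, produces no output file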
When I compile a C program, for ease I've been including the source file for a certain header at the end. So, if main.c includes util.h, util.h will have all the headers util.c will use, declarations of types or structs, etc., and then at the very end it includes util.c. Then, when I compile, I only have to use gcc main.c -o main, and the rest is all taken care of.
I've been looking up C coding standards, trying to figure out what the best way to do things is, and there are just so many, and so many conflicting opinions, that I don't know what to think. Why do so many places recommend compiling object files individually instead of including all of them in a web? util never touches anything but util.c, so the two are perfectly independent, and in theory (my theory) it would be fine, but I'm probably wrong, since this is computer science and people are wrong even when they're right.
Some people say header files should ONLY contain prototypes, and that the source file should be the one that includes it and its necessary system headers. From a purely aesthetic point of view I much prefer having all the info (types, system headers used, prototypes) in the header (in this case util.h) and having ONLY function code in util.c (excluding one #include "util.h" at the very top).
I guess the point I'm getting at is, with all this stuff that works, selecting a method sounds arbitrary to someone who doesn't understand the background (me). Please tell me why and what.
While your program is small, this will work. At some point, however, your program will get large enough that recompiling the whole program every time you change one line is a pain in the rear.
This -- even more than avoiding editing huge files -- is the reason to split up your program. If main.c and util.c are separately compiled into object files, changing one line in a function in main.c will no longer require you to recompile all the code in util.c.
By the time your program is made up of a few dozen files, this will be a big win.
I think the point is that you want each file to include only what it needs to be independent. This reduces overall compilation times by letting the compiler read only the headers that are necessary, rather than repeatedly reading every header when it might not be needed. For example, if your util.c uses functions and/or types from <stdio.h> but your util.h doesn't, then you want to include <stdio.h> only in util.c, so that it is read only when the compiler compiles util.c. If you include <stdio.h> in your util.h instead, then every source file that includes util.h is also including <stdio.h>, whether it needs it or not.
This is very negligible for small projects with only a handful of files, but proper header inclusion can affect compilation times for larger projects.
With regards to the question about "object files": when you compile a source file into an object file, you create a shortcut that allows a build system to only recompile the source files that have outdated object files. This is an effective way to significantly reduce compilation times especially for large projects.
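A short sketch of that inclusion rule, with hypothetical util files: <stdio.h> stays out of util.h because nothing in the interface needs it, so files including util.h do not pay for it.

/* util.h -- interface only, no <stdio.h> needed here */
#ifndef UTIL_H
#define UTIL_H
int util_count_lines(const char *path);
#endif

/* util.c -- the implementation is what actually uses <stdio.h> */
#include <stdio.h>
#include "util.h"

int util_count_lines(const char *path)
{
    FILE *f = fopen(path, "r");
    int lines = 0;
    int ch;

    if (f == NULL)
        return -1;
    while ((ch = fgetc(f)) != EOF)
        if (ch == '\n')
            lines++;
    fclose(f);
    return lines;
}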
First, including a .c file from a .h file is completely bass-ackwards.
The "standard" way of doing it follows a line of thought roughly like this:
You have a library, containing dozens of functions. Keeping everything in one big source file means that anyone using your library would have to link the whole library, even if he uses only a single function of it. (Imagine linking the whole C standard library for a puts( "Hello" ).)
So you split things across multiple source files, which are compiled individually. Whenever you make changes to one of your functions, you have to re-translate only one small source file and update the library archive (or executable) - instead of re-translating the whole thing every time. (This is still an issue, because code sizes have somewhat kept up with CPU improvements. Compiling something like the Boost lib can still take several minutes on not-too-fancy hardware...)
Now you are in a pinch, however. The function is defined inside the .c file, and the corresponding .o file can conveniently be linked (via a .a archive if need be). However, to actually address the function (provided by the .o file) properly from another source file (a.k.a. "translation unit"), your compiler needs to know the function name, its parameter list, and its return type. This is why the declaration of the function (i.e., the function head without its body) is put in a separate header (.h) file.
Other source files can now #include the header file, address the function properly (without the compiler being aware of what the function actually does), and when all parts of your library / program are compiled into .o files, then everything is linked together.
The source file includes its own header basically to make sure the two files agree on the function declaration. ;-)
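A minimal sketch of that agreement check, with a hypothetical add() function: because util.c includes its own header, any mismatch between declaration and definition becomes a compile error in util.c rather than a silent bug in its callers.

/* util.h */
int add(int a, int b);

/* util.c -- the compiler compares this definition against the
   declaration above; change either signature and util.c fails
   to compile instead of miscompiling callers */
#include "util.h"

int add(int a, int b)
{
    return a + b;
}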
That's about it, as far as I can be bothered to write it up right now. Putting everything into one monolithic source file is barely acceptable (actually, no, it isn't, not for anything beyond about 200 lines), but including the .c file at the end of the .h file either means you learned your C coding by looking at god-awful code instead of a good book, or whoever tutored you should never tutor another person on C coding in his life. No offense intended. ;-)
PS: Header files also provide a good summary / oversight of a piece of code. Languages that don't provide headers - Java, for example - need IDE's or documentation tools to extract this kind of information. Personally, I found header files to be a benefit, not a liability.
Please use *.h and *.c files as customary: *.h files are #included in *.c files and contain only macro definitions, data type declarations, function declarations, and extern data declarations. All definitions go in *.c files. That is how everybody else organizes C programs; do your fellow humans (who some day might need to understand your program) a favor. If something in file.c is used outside that file, write a file.h containing the declarations of whatever is to be used outside, and include it in file.c (to check that declarations and definitions agree) and in all the *.c files that use it. If a bunch of *.h files are always included together, it might mean the split into *.c files isn't right (or at least the split of the *.h files; perhaps you should make one .h including all those declarations, and create *.h files for internal use where needed among the group of related *.c files).
[If a program written as you outline crosses my path, I can assure you I'll avoid it like the plague. The extra obfuscation might be welcome in the IOCCC, but not by me. It is a sure sign of somebody who doesn't know how to organize a program cleanly, and the program probably isn't worth trying out.]
Re: Separate compilation: You break up a C program so the pieces are easier to understand, you can hide details of how things work in the C files (think static), this provides support for Parnas' modularity. It also means that if you change a file, you don't have to recompile everything.
Re: Differing C programming standards: Yes, there are lots of them around. Pick one you feel comfortable with, and stick to that. If you work on a project, adhere to its standards.
The "include in a single translation unit" approach becomes very inefficient for any significantly sized project, it is impractical for projects that are distributed amongst multiple developers.
Moreover, when creating static libraries, if everything in the library came from a single translation unit, any code linked against it would get all the library code regardless of whether it is referenced or not.
A project using a build manager such as make or the features available in most IDEs uses header file dependencies to allow an incremental build; only compiling those sources that are modified or dependent on modified files. The dependencies are determined by the file inclusions, so minimising redundant dependencies speeds build time.
A typical commercial project can comprise hundreds of thousands of lines of code and a few hundred source files; full rebuild times can vary from minutes to hours. If in your development cycle you have to wait that long between code changes and test, productivity would be very low!
So currently in my programming I now have a fairly large range of functions I have created and stored in separate C files that I use quite frequently from project to project.
My question is what is the simplest, most effective way to implement them into other projects? Currently I just make a header file for each new project that has the function prototypes for all the custom functions I want to use.
I then have every C file in the project include this "master" header. In this header I also include header files that each C file utilizes, so every C file has one header; let's just call it master.h.
I feel like I am doing this completely wrong. Should I be making header files for each C file and including them into a master header file? or should I just create header files per C file and include them as needed? If I do that how will everything still be linked together?
What is the best way to go about using header files in projects?
Do not have a header file include other header files. Let the .c file do that; it makes compilation quicker.
Use forward declarations where you can. They make recompilation quicker, because the compiler does not need to open up other files, and a simple change won't leave make spending ages recompiling lots of stuff (see the sketch after this list).
Group functions together in both a header file and the corresponding .c file if they logically fit together. For static libraries, the linker picks out the appropriate bits. For dynamic libraries, they are loaded at run time (and hence can be used by other binaries) if not already in memory.
Do not have a master.h. Just make libraries containing functions that are related (e.g. math functions, input/output functions, etc.). The various projects can pick and choose what they require.
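A sketch of the forward-declaration point above, with hypothetical widget types: the header declares that struct widget exists rather than including widget.h, because a pointer to an incomplete type is all the interface needs.

/* widget_ops.h -- forward declaration instead of #include "widget.h" */
#ifndef WIDGET_OPS_H
#define WIDGET_OPS_H

struct widget;                      /* incomplete type: enough for pointers */
void widget_draw(struct widget *w);

#endif

/* widget_ops.c -- only the implementation needs the full definition */
#include "widget.h"
#include "widget_ops.h"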
I recommend against having a master.h file, as your whole project ends up being too coupled. Every time you change master.h all your source files need to be recompiled, which significantly slows down the build process. Instead, try to create .h files with related functions. The C library is a good inspiration for this: stdlib.h, math.h etc.
In order to use these headers, you should include them in each .c, just like you would with the standard C headers:
#include <math.h>
#include <stdio.h>
As for linking, this is a totally different subject. If most of your functions are inlined, you do not have to worry about linking. Otherwise, you should define them in .c files which are named similarly to your headers (e.g., utils.c for utils.h etc.)
Create header files for your .c files, grouping similar functions together in each .c file. Don't forget to add a header guard to each header file.
For example, consider a header file one.h; it should contain the header guard shown below.
#ifndef ONE_H
#define ONE_H
//give your function prototypes here.
#endif //ONE_H
The header guard is what prevents double inclusion.
I have two files foo.c and bar.c that I compile separately with gcc -c and then link. Both files need the stdio.h and stdlib.h headers.
Do I have to include them in both? Doesn't that seem a little redundant? Should I maybe use #ifdef?
What's the best practice?
Each C file is a different translation unit. In other words, it is an entire separate program, syntactically complete and correct. Thus, each C file must compile independently of any other C file, and must contain every declaration for every identifier it uses, regardless of whether these declarations also appear in other C files. From the compiler point of view, each C file is a complete program by itself (albeit with unresolved references).
Header files are, by convention, files that contains declarations that must appear in a group of C files. Header files can be included by the preprocessor -- which is a simple textual copy-and-paste at the include point -- as a convenience to avoid manually duplicating declarations between the translation units.
Summing up: it is not redundant to include the same files in different C files -- it is formally required.
(Afterwards, you link object files, which are just smaller programs, into a larger final program. The larger program is roughly a summation of smaller subprograms, with all references resolved between them. Generally speaking, the linking phase does not know anything about the language structure of the original files that generated the resulting object file.)
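Concretely, for the foo.c and bar.c in the question:

gcc -c foo.c              # compiles foo.c in isolation; it must include everything it uses
gcc -c bar.c              # likewise for bar.c
gcc foo.o bar.o -o app    # the two only meet here, at link time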
You must include your header with both files. The header is just a list of function declarations and constant declarations. The compiler needs these to make sure your syntax and function calls are correct.
Yes, you have to include them in both. I mostly work in C++, and I try to include files in my .cpp files whenever possible and in my .h files only if I must. This often means using some type of forward declaration in the .h file.
Some platforms I've worked on create a .h that includes a bunch of standard files so that you can just include the one file. Personally, I've never particularly liked this approach.
Every .h file you write should have what are known as "include guards". These are preprocessor directives (#ifndef/#define/#endif) that prevent a header from getting included more than once. You'll notice the headers you mentioned already have include guards in place. But make sure you put them in your foo.h and bar.h too.
The header file should (and will) have an #ifndef guard to prevent problems with this (the problem is actually cyclic inclusion: you include foo.h, foo.h includes bar.h, and bar.h includes foo.h).
In your case the "duplicate" inclusion is actually needed (for the function prototypes and structure definitions), since, as you said, both files are compiled separately. The compilation of foo.c doesn't know anything about the compilation of bar.c.
You will have to include the header in any file which is compiled by itself even if you will be linking it later.
There is already an include guard in the header, so it will only actually be processed once per translation unit.