What exactly does the preprocessor do when it encounters an #include directive in a source code?
I assume it replaces the #include with the contents of the included file, but I wanted something stronger than my assumption.
Is there any reason not to type the contents of the included file straight into the source code rather than #include it other than it being nicer on the eye?
The preprocessor will replace the #include statement with the contents of the file.
The advantage of using #include instead of simply pasting the content of the file is that, if the header file is modified, all you have to do is recompile the source file. Had you pasted the content of the file then you would have had to replace that with the new version of the header file.
Also, if you #include a file in several places (as happens with constants and type definition files) you don't have to modify all repeated declarations, the multiple times included file makes one place of change instead of several.
From my copy of the draft C11 standard section 6.10.2 paragraph 3
A preprocessing directive of the form
# include "q-char-sequence" new-line
causes the replacement of that directive by the entire contents of the source file identified by the
specified sequence between the " delimiters.
That's the point of #include files.
Take #include <stdio.h> which is a distributed library header. If you have a suite of project source files you would possibly have to paste the content of stdio.h into every one. The whole point of #include files is that you don't have to.
Now take #include "mycommon.h" which is your local commonly used project functions. If you don't use #include every time you modify your local mycommon.h you will have to re-paste it into all your source files.
Is there any reason not to type the contents of the included file straight into the source code rather than #include it other than it being nicer on the eye?
It is not a matter of being easy on the eye.
The primary purpose of a header file is to have common declarations for function-prototypes and data-objects that are defined in separate translation units. Most non-trivial applications comprise multiple modules in separate translation units separately compiled and linked. A function or data-object must be defined in only one translation unit, but may be referenced in many - all of which must have a correctly matching declaration for the link to succeed. The simplest and least error prone method of ensuring the declarations are correct in every translation unit is to use a header file; entering the same information in multiple translation units would be very difficult to maintain.
If on the other hand your translation unit contains functions and data that are accessed only within that translation unit (or your application is a single translation unit), then the corresponding declarations may indeed appear in the same source file, and should also be declared static to explicitly disallow external linkage.
Consider for example the standard library headers such as stdio.h. You could for example enter the prototype for printf() dirctly in your code - it might look like:
extern int printf ( const char * format, ... );
but you would have to get it exactly right every time, and do it for every function you wished to use. Would you really do that!?
Related
I would like to write a C library with fast access by including just header files without using compiled library. For that I have included my code directly in my header file.
The header file contains:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#ifndef INC_TEST_H_
#define INC_TEST_H_
void test(){
printf("hello\n");
}
#endif
My program doesn't compile because I have multiple reference to function test(). If I had a correct source file with my header it works without error.
Is it possible to use only header file by including code inside in a C app?
Including code in a header is generally a really bad idea.
If you have file1.c and file2.c, and in each of them you include your coded.h, then at the link part of the compilation, there will be 2 test functions with global scope (one in file1.c and the other one in file2.c).
You can use the word "static" in order to say that the function will be restricted so it is only visible in the .c file which includes coded.h, but then again, it's a bad idea.
Last but not least: how do you intend to make a library without a .so/.a file? This is not a library; this is copy/paste code directly in your project.
And when a bug is found in your "library", you will be left with no solution apart correcting your code, redispatch it in every project, and recompile every project, missing the very point of a dynamic library: The ability to "just" correct the library without touching every program using it.
If I understand what you're asking correctly, you want to create a "library" which is strictly source code that gets #incuded as necessary, rather than compiled separately and linked.
As you have discovered, this is not easy when you're dealing with functions - the compiler complains of multiple definitions (you will have the same problem with object definitions).
You have a couple of options at this point.
You could declare the function static:
static void test( void )
{
...
}
The static keyword limits the function's visibility to the current translation unit, so you don't run into multiple definition errors at link time. It means that each translation unit is creating its own separate "instance" of the function, leading to a bit of code bloat and slightly longer build times. If you can live with that, this is the easiest solution.
You could use a macro in place of a function:
#define TEST() (printf( "hello\n" ))
except that macros are not functions and do not behave like functions. While macro-based "libraries" do exist, they are not trivial to implement correctly and require quite a bit of thought. Remember that macro arguments are not evaluated, they're just expanded in place, which can lead to problems if you pass expressions with side effects. The classic example is:
#define SQUARE(x) ((x)*(x))
...
y = SQUARE(z++);
SQUARE(z++) expands to ((z++)*(z++)), which leads to undefined behavior.
Separate compilation is a Good Thing, and you should not try to avoid it. Doing everything in one source file is not scalable, and leads to maintenance headaches.
My program do not compiled because I have multiple reference to test() function
That is because the .h file with the function is included and compiled in multiple C source files. As a result, the linker encounters the function with global scope multiple times.
You could have defined the function as static, which means it will have scope only for the curent compilation unit, so:
static void test()
{
printf("hello\n");
}
I would like to know if it's possible that inside the main() function from C to include something.
For instance, in a Cell program i define the parameters for cache-api.h that later in the main() function i want to change .
I understood that what was defined with #define can be undefined with #undef anywhere in the program, but after redefining my needed parameters I have to include cache-api.h again . Is that possible?
How can I solve this problem more elegant ? Supposing I want to read from the main storage with cache_rd(...) but the types would differ during the execution of a SPU, how can i use both #define CACHED_TYPE struct x and #define CACHED_TYPE struct y in the same program?
Thanks in advance for the answer, i hope i am clear in expression.
#define and #include are just textual operations that take place during the 'preprocessing' phase of compilation, which is technically an optional phase. So you can mix and match them in all sorts of ways and as long as your preprocessor syntax is correct it will work.
However if you do redefine macros with #undef your code will be hard to follow because the same text could have different meanings in different places in the code.
For custom types typedef is much preferred where possible because you can still benefit from the type checking mechanism of the compiler and it is less error-prone because it is much less likely than #define macros to have unexpected side-effects on surrounding code.
Yes, that's fine (may not be the clearest design decision) but a #include is just like a copy-and-paste of that file into the code right where the #include is.
#define and #include are pre-processor macros: http://en.wikipedia.org/wiki/C_preprocessor
They are converted / inlined before compilation.
To answer your question ... no, you really wouldn't want do do that, at least for the sake of the next guy that has to try and unscramble that mess.
You can #include any file in any file. Whether it is then valid depends on the content of the file; specifically whether that content would be valid if it were entered directly as text.
Header files generally contain declarations and constructs that are normally only valid outside of a function definition (or outside any kind of encoding construct) - the clue is in the name header file. Otherwise you may change the scope of the declarations, or more likley render the compilation unit syntactically invalid.
An include file written specially for the purpose may be fine, but not just any arbitrary header file.
General purpose header files should have include guards to prevent multiple declaration, so unless you undefine the guard macro, re-including a header file will have no effect in any case.
One possible solution to your problem is to create separately compiled modules (compilation units) containing wrapper functions to the API you need to call. Each compilation unit can then include the API header file after defining the appropriate configuration macros. You will then have two separate and independent interfaces provided by these wrapper functions.
In C (or a language based on C), one can happily use this statement:
#include "hello.h";
And voila, every function and variable in hello.h is automagically usable.
But what does it actually do? I looked through compiler docs and tutorials and spent some time searching online, but the only impression I could form about the magical #include command is that it "copy pastes" the contents of hello.h instead of that line. There's gotta be more than that.
Logically, that copy/paste is exactly what happens. I'm afraid there isn't any more to it. You don't need the ;, though.
Your specific example is covered by the spec, section 6.10.2 Source file inclusion, paragraph 3:
A preprocessing directive of the form
# include "q-char-sequence" new-line
causes the replacement of that directive by the entire contents of the source file identified by the specified sequence between the " delimiters.
That (copy/paste) is exactly what #include "header.h" does.
Note that it will be different for #include <header.h> or when the compiler can't find the file "header.h" and it tries to #include <header.h> instead.
Not really, no. The compiler saves the original file descriptor on a stack and opens the #included file; when it reaches the end of that file, it closes it and pops back to the original file descriptor. That way, it can nest #included files almost arbitrarily.
The # include statement "grabs the attention" of the pre-processor (the process that occurs before your program is actually compiled) and "tells" the pre-processor to include whatever follows the # include statement.
While the pre-processor can be told to do quite a bit, in this instance it's being asked to recognize a header file (which is denoted with a .h following the name of that header, indicating that it's a header).
Now, a header is a file containing C declarations and definitions of functions not explicitly defined in your code. What does this mean? Well, if you want to use a function or define a special type of variable, and you know that these functions/definition are defined elsewhere (say, the standard library), you can just include (# include) the header that you know contains what you need. Otherwise, every time you wanted to use a print function (like in your case), you'd have to recreate the print function.
If its not explicitly defined in your code and you don't #include the header file with the function you're using, your compiler will complain saying something like: "Hey! I don't see where this function is defined, so I don't know what to with this undefined function in your code!".
It's part of the preprocessor. Have a look at http://en.wikipedia.org/wiki/C_preprocessor#Including_files. And yes, it's just copy and paste.
This is a nice link to answer this question.
http://msdn.microsoft.com/en-us/library/36k2cdd4.aspx
Usually #include and #include "path-name" just differs in the order of the search of the pre processor
I am beginning to question the usefulness of "extern" keyword which is used to access variables/functions in other modules(in other files). Aren't we doing the same thing when we are using #include preprocessor to import a header file with variables/functions prototypes or function/variables definitions?
extern is needed because it declares that the symbol exists and is of a certain type, and does not allocate storage for it.
If you do:
int foo;
In a header file that is shared between several source files, you will get a linker error because each source would have its own copy of foo created and the linker will be unable to resolve the symbol.
Instead, if you have:
extern int foo;
In the header, it would declare a symbol that is defined elsewhere in each source file.
One (and only one) source file would contain
int foo;
which creates a single instance of foo for the linker to resolve.
No. The #include is a preprocessor command that says "put all of the text from this other file right here". So, all of the functions and variables in the included file are defined in the current file.
The #include preprocessor directive simply copy/pastes the text of the included file into the current position in the current file.
extern marks that a variable or function exists externally to this source file. This is done by the originator ("I am making this data available externally"), and by the recipient ("I am marking that there is external data I need"). A recipient with an unsatisfied extern will cause an Undefined Symbol error.
Which to use? I prefer using #include with the include guard pattern:
#ifndef HEADER_NAME_H
#define HEADER_NAME_H
<write your header code here>
#endif
This pattern allows you to cleanly separate anything you want an outsider to have access to into the header, without worrying about a double-include error. Any time I have to open a .c file to find what externs are available, the lack of a clear interface makes my soul gem crack.
There are indeed two ways of using functions/variables across translation units (a translation unit is usually a *.c/*.cc file).
One is the forward declaration:
Declare functions/variables using extern in the calling file. extern is actually optional for functions (functions are automatically extern), but not for variables.
Implement the function/variables in the implementing file.
The other is using header files:
Declare functions/variables using extern in a header file (*.h/*.hh). Still, extern is optional for functions, but not for variables. So you don't normally see extern before functions in header files.
In the calling *.c/*.cc file, #include the header, and call the function/variable as needed.
In the implementing *.c/*.cc file, #include the header, and implement the function/variable.
Google C++ style guide has some good discussions on the pros and cons of the two approaches.
Personally, I would prefer the header file approach, as it is the single place (the header file) a function signature is defined, calling and implementation all adhere to this one piece of definition. Thus, there would be no unnecessary discrepancies that might occur in the forward declaration approach.
Is it necessary to #include some file, if inside a header (*.h), types defined in this file are used?
For instance, if I use GLib and wish to use the gchar basic type in a structure defined in my header, is it necessary to do a #include <glib.h>, knowing that I already have it in my *.c file?
If yes do I also have to put it between the #ifndef and #define or after the #define?
NASA's Goddard Space Flight Center (GSFC) rules for headers in C state that it must be possible to include a header in a source file as the only header, and that code using the facilities provided by that header will then compile.
This means that the header must be self-contained, idempotent and minimal:
self-contained — all necessary types are defined by including relevant headers if need be.
idempotent — compilations don't break even if it is included multiple times.
minimal — it doesn't define anything that is not needed by code that uses the header to access the facilities defined by the header.
The benefit of this rule is that if someone needs to use the header, they do not have to struggle to work out which other headers must also be included — they know that the header provides everything necessary.
The possible downside is that some headers might be included many times; that is why the multiple inclusion header guards are crucial (and why compilers try to avoid reincluding headers whenever possible).
Implementation
This rule means that if the header uses a type — such as 'FILE *' or 'size_t' - then it must ensure that the appropriate other header (<stdio.h> or <stddef.h> for example) should be included. A corollary, often forgotten, is that the header should not include any other header that is not needed by the user of the package in order to use the package. The header should be minimal, in other words.
Further, the GSFC rules provide a simple technique to ensure that this is what happens:
In the source file that defines the functionality, the header must be the first header listed.
Hence, suppose we have a Magic Sort.
magicsort.h
#ifndef MAGICSORT_H_INCLUDED
#define MAGICSORT_H_INCLUDED
#include <stddef.h>
typedef int (*Comparator)(const void *, const void *);
extern void magicsort(void *array, size_t number, size_t size, Comparator cmp);
#endif /* MAGICSORT_H_INCLUDED */
magicsort.c
#include <magicsort.h>
void magicsort(void *array, size_t number, size_t size, Comparator cmp)
{
...body of sort...
}
Note that the header must include some standard header that defines size_t; the smallest standard header that does so is <stddef.h>, though several others also do so (<stdio.h>, <stdlib.h>, <string.h>, possibly a few others).
Also, as mentioned before, if the implementation file needs some other headers, so be it, and it is entirely normal for some extra headers to be necessary. But the implementation file ('magicsort.c') should include them itself, and not rely on its header to include them. The header should only include what users of the software need; not what the implementers need.
Configuration headers
If your code uses a configuration header (GNU Autoconf and the generated 'config.h', for example), you may need to use this in 'magicsort.c':
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif /* HAVE_CONFIG_H */
#include "magicsort.h"
...
This is the only time I know of that the module's private header is not the very first header in the implementation file. However, the conditional inclusion of 'config.h' should probably be in 'magicsort.h' itself.
Update 2011-05-01
The URL linked above is no longer functional (404). You can find the C++ standard (582-2003-004) at EverySpec.com; the C standard (582-2000-005) seems to be missing in action.
The guidelines from the C standard were:
§2.1 UNITS
(1) Code shall be structured as units, or as stand-alone header files.
(2) A unit shall consist of a single header file (.h) and one or more body (.c) files. Collectively the header and body files are referred to as the source files.
(3) A unit header file shall contain all pertinent information required by a client unit. A unit’s
client needs to access only the header file in order to use the unit.
(4) The unit header file shall contain #include statements for all other headers required by the unit header. This lets clients use a unit by including a single header file.
(5) The unit body file shall contain an #include statement for the unit header, before all other #include statements. This lets the compiler verify that all required #include statements are in
the header file.
(6) A body file shall contain only functions associated with one unit. One body file may not
provide implementations for functions declared in different headers.
(7) All client units that use any part of a given unit U shall include the header file for unit U; this
ensures that there is only one place where the entities in unit U are defined. Client units may
call only the functions defined in the unit header; they may not call functions defined in the
body but not declared in the header. Client units may not access variables declared in the body
but not in the header.
A component contains one or more units. For example, a math library is a component that contains
multiple units such as vector, matrix, and quaternion.
Stand-alone header files do not have associated bodies; for example, a common types header does
not declare functions, so it needs no body.
Some reasons for having multiple body files for a unit:
Part of the body code is hardware or operating system dependent, but the rest is common.
The files are too large.
The unit is a common utility package, and some projects will only use a few of the
functions. Putting each function in a separate file allows the linker to exclude the ones not
used from the final image.
§2.1.1 Header include rationale
This standard requires a unit’s header to contain #include statements for all other headers required
by the unit header. Placing #include for the unit header first in the unit body allows the compiler to
verify that the header contains all required #include statements.
An alternate design, not permitted by this standard, allows no #include statements in headers; all
#includes are done in the body files. Unit header files then must contain #ifdef statements that check
that the required headers are included in the proper order.
One advantage of the alternate design is that the #include list in the body file is exactly the
dependency list needed in a makefile, and this list is checked by the compiler. With the standard
design, a tool must be used to generate the dependency list. However, all of the branch
recommended development environments provide such a tool.
A major disadvantage of the alternate design is that if a unit’s required header list changes, each file
that uses that unit must be edited to update the #include statement list. Also, the required header list
for a compiler library unit may be different on different targets.
Another disadvantage of the alternate design is that compiler library header files, and other third party
files, must be modified to add the required #ifdef statements.
A different common practice is to include all system header files before any project header files, in
body files. This standard does not follow this practice, because some project header files may
depend on system header files, either because they use the definitions in the system header, or
because they want to override a system definition. Such project header files should contain #include
statements for the system headers; if the body includes them first, the compiler does not check this.
GSFC Standard available via Internet Archive 2012-12-10
Information courtesy Eric S. Bullington:
The referenced NASA C coding standard can be accessed and downloaded via the Internet archive:
http://web.archive.org/web/20090412090730/http://software.gsfc.nasa.gov/assetsbytype.cfm?TypeAsset=Standard
Sequencing
The question also asks:
If yes, do I also have to put it (the #include lines) between the #ifndef and #define or after the #define.
The answer shows the correct mechanism — the nested includes, etc, should be after the #define (and the #define should be the second non-comment line in the header) — but it doesn't explain why that's correct.
Consider what happens if you place the #include between the #ifndef and #define. Suppose the other header itself includes various headers, perhaps even #include "magicsort.h" indirectly. If the second inclusion of magicsort.h occurs before #define MAGICSORT_H_INCLUDED, then the header will be included a second time before the types it defines are defined. So, in C89 and C99, any typedef type name will be erroneously redefined (C2011 allows them to be redefined to the same type), and you will get the overhead of processing the file multiple times, defeating the purpose of the header guard in the first place. This is also why the #define is the second line and is not written just before the #endif. The formula given is reliable:
#ifndef HEADERGUARDMACRO
#define HEADERGUARDMACRO
...original content of header — other #include lines, etc...
#endif /* HEADERGUARDMACRO */
A good practice is to only put #includes in an include file if the include file needs them. If the definitions in a given include file are only used in the .c file then include it only in the .c file.
In your case, i would include it in the include file between the #ifdef/#endif.
This will minimize dependencies so that files that don't need a given include won't have to be recompiled if the include file changes.
Usually, library developers protect their includes from multiple including with the #ifndef /#define / #endif "trick" so you don't have to do it.
Of course, you should check... but anyways the compiler will tell you at some point ;-) It is anyhow a good practice to check for multiple inclusions since it slows down the compilation cycle.
During compilation preprocessor just replaces #include directive by specified file content.
To prevent endless loop it should use
#ifndef SOMEIDENTIFIER
#define SOMEIDENTIFIER
....header file body........
#endif
If some header was included into another header which was included to your file
than it is not necessary to explicitly include it again, because it will be included into the file recursively
Yes it is necessary or the compiler will complain when it tries to compile code that it is not "aware" of. Think of #include's are a hint/nudge/elbow to the compiler to tell it to pick up the declarations, structures etc in order for a successful compile. The #ifdef/#endif header trick as pointed out by jldupont, is to speed up compilation of code.
It is used in instances where you have a C++ compiler and compiling plain C code as shown here
Here is an example of the trick:
#ifndef __MY_HEADER_H__
#define __MY_HEADER_H__
#ifdef __cplusplus
extern "C" {
#endif
/* C code here such as structures, declarations etc. */
#ifdef __cplusplus
}
#endif
#endif /* __MY_HEADER_H__ */
Now, if this was included multiple times, the compiler will only include it once since the symbol __MY_HEADER_H__ is defined once, which speeds up compilation times.
Notice the symbol cplusplus in the above example, that is the normal standard way of coping with C++ compiling if you have a C code lying around.
I have included the above to show this (despite not really relevant to the poster's original question).
Hope this helps,
Best regards,
Tom.
PS: Sorry for letting anyone downvote this as I thought it would be useful tidbit for newcomers to C/C++. Leave a comment/criticisms etc as they are most welcome.
You need to include the header from your header, and there's no need to include it in the .c. Includes should go after the #define so they are not unnecessarily included multiple times. For example:
/* myHeader.h */
#ifndef MY_HEADER_H
#define MY_HEADER_H
#include <glib.h>
struct S
{
gchar c;
};
#endif /* MY_HEADER_H */
and
/* myCode.c */
#include "myHeader.h"
void myFunction()
{
struct S s;
/* really exciting code goes here */
}
I use the following construct to be sure, that the needed include file is included before this include. I include all header files in source files only.
#ifndef INCLUDE_FILE_H
#error "#include INCLUDE.h" must appear in source files before "#include THISFILE.h"
#endif
What I normally do is make a single include file that includes all necessary dependencies in the right order. So I might have:
#ifndef _PROJECT_H_
#define _PROJECT_H_
#include <someDependency.h>
#include "my_project_types.h"
#include "my_project_implementation_prototypes.h"
#endif
All in project.h. Now project.h can be included anywhere with no order requirements but I still have the luxury of having my dependencies, types, and API function prototypes in different headers.
Just include all external headers in one common header file in your project, e.g. global.h and include it in all your c files:
It can look like this:
#ifndef GLOBAL_GUARD
#define GLOBAL_GUARD
#include <glib.h>
/*...*/
typedef int YOUR_INT_TYPE;
typedef char YOUR_CHAR_TYPE;
/*...*/
#endif
This file uses include guard to avoid multiple inclusions, illegal multiple definitions, etc.