What's the difference between using extern and #including header files? - c

I am beginning to question the usefulness of "extern" keyword which is used to access variables/functions in other modules(in other files). Aren't we doing the same thing when we are using #include preprocessor to import a header file with variables/functions prototypes or function/variables definitions?

extern is needed because it declares that the symbol exists and is of a certain type, and does not allocate storage for it.
If you do:
int foo;
In a header file that is shared between several source files, you will typically get a "multiple definition" linker error, because each source file that includes the header defines its own copy of foo and the linker cannot tell which one the symbol refers to.
Instead, if you have:
extern int foo;
In the header, each source file that includes it sees only a declaration of a symbol that is defined elsewhere.
One (and only one) source file would contain
int foo;
which creates a single instance of foo for the linker to resolve.
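As a minimal sketch of the pattern described above, the pieces that would normally live in separate files are shown here in one file so that it compiles on its own. The file names in the comments (foo.h, foo.c, user.c) and the helpers bump_foo and demo_shared_foo are hypothetical and only there for illustration.

```c
#include <assert.h>

/* --- would live in foo.h: a declaration only, no storage is allocated --- */
extern int foo;

/* --- would live in foo.c: the one and only definition, which allocates storage --- */
int foo = 0;

/* --- would live in user.c, which sees only the declaration via #include "foo.h" --- */
int bump_foo(void)
{
    foo += 1;          /* refers to the single object defined in foo.c */
    return foo;
}

/* Small self-check: every access goes to the same object. */
void demo_shared_foo(void)
{
    foo = 0;
    assert(bump_foo() == 1);
    assert(bump_foo() == 2);
    assert(foo == 2);
}
```

Calling demo_shared_foo() from a main function exercises the sketch. In a real project the three parts would be compiled separately and the linker would resolve every reference to foo to the definition in foo.c.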

No. #include is a preprocessor directive that says "put all of the text from this other file right here". So everything that is declared or defined in the included file is declared or defined in the current file as well.

The #include preprocessor directive simply copy/pastes the text of the included file into the current position in the current file.
extern marks that a variable or function exists externally to this source file. This is done by the originator ("I am making this data available externally"), and by the recipient ("I am marking that there is external data I need"). A recipient with an unsatisfied extern will cause an Undefined Symbol error.
Which to use? I prefer using #include with the include guard pattern:
#ifndef HEADER_NAME_H
#define HEADER_NAME_H
<write your header code here>
#endif
This pattern allows you to cleanly separate anything you want an outsider to have access to into the header, without worrying about a double-include error. Any time I have to open a .c file to find what externs are available, the lack of a clear interface makes my soul gem crack.
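To see what the guard buys you, here is a small sketch that simulates including the same header twice by pasting its text twice into one file, which is effectively what the preprocessor does. The header name point.h, the macro POINT_H, struct point and point_sum are all made up for this example.

```c
#include <assert.h>

/* First "inclusion" of the hypothetical header point.h */
#ifndef POINT_H
#define POINT_H
struct point { int x; int y; };
int point_sum(struct point p);
#endif

/* Second "inclusion" of the same text: POINT_H is already defined, so the
 * preprocessor skips everything up to #endif. Without the guard, struct point
 * would be defined twice and the file would not compile. */
#ifndef POINT_H
#define POINT_H
struct point { int x; int y; };
int point_sum(struct point p);
#endif

/* Would live in point.c */
int point_sum(struct point p)
{
    return p.x + p.y;
}

/* Small self-check that the single surviving definition is usable. */
void demo_include_guard(void)
{
    struct point p = { 3, 4 };
    assert(point_sum(p) == 7);
}
```

Removing the two guard lines from the second copy is a quick way to see the redefinition error the pattern protects against.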

There are indeed two ways of using functions/variables across translation units (a translation unit is usually a *.c/*.cc file).
One is the forward declaration:
Declare functions/variables using extern in the calling file. extern is actually optional for functions (functions are automatically extern), but not for variables.
Implement the function/variables in the implementing file.
The other is using header files:
Declare functions/variables using extern in a header file (*.h/*.hh). Still, extern is optional for functions, but not for variables. So you don't normally see extern before functions in header files.
In the calling *.c/*.cc file, #include the header, and call the function/variable as needed.
In the implementing *.c/*.cc file, #include the header, and implement the function/variable.
Google C++ style guide has some good discussions on the pros and cons of the two approaches.
Personally, I would prefer the header file approach: the header is the single place where a function signature is declared, and both the calling code and the implementation adhere to that one declaration. This avoids the discrepancies that can creep in with the forward declaration approach.

Related

Why is extern required for global variable on Linux but not Mac when compiling shared object? [duplicate]

My question is about when a function should be referenced with the extern keyword in C.
I am failing to see when this should be used in practice. As I am writing a program all of the functions that I use are made available through the header files I have included. So why would it be useful to extern to get access to something that was not exposed in the header file?
I could be thinking about how extern works incorrectly, and if so please correct me.
Also.. Should you extern something when it is the default declaration without the keyword in a header file?
extern changes the linkage. With the keyword, the function / variable is assumed to be available somewhere else and the resolving is deferred to the linker.
There's a difference between extern on functions and on variables.
For variables it doesn't instantiate the variable itself, i.e. doesn't allocate any memory. This needs to be done somewhere else. Thus it's important if you want to import the variable from somewhere else.
For functions, this only tells the compiler that linkage is extern. As this is the default (you use the keyword static to indicate that a function is not bound using extern linkage) you don't need to use it explicitly.
extern tells the compiler that this data is defined somewhere and will be connected with the linker.
With the help of the responses here and talking to a few friends here is the practical example of a use of extern.
Example 1 - to show a pitfall:
stdio.h:
int errno;
myCFile1.c:
#include <stdio.h>
// Code using errno...
myCFile2.c:
#include <stdio.h>
// Code using errno...
If myCFile1.o and myCFile2.o are linked, each of the .c files has its own copy of errno. This is a problem, as the same errno is supposed to be available in all linked files.
Example 2 - The fix.
stdio.h:
extern int errno;
stdio.c:
int errno;
myCFile1.c:
#include <stdio.h>
// Code using errno...
myCFile2.c:
#include <stdio.h>
// Code using errno...
Now if both myCFile1.o and myCFile2.o are linked by the linker, they will both point to the same errno. Thus extern solves the problem.
It has already been stated that the extern keyword is redundant for functions.
As for variables shared across compilation units, you should declare them in a header file with the extern keyword, then define them in a single source file, without the extern keyword. The single source file should be the one sharing the header file's name, for best practice.
Many years later, I discover this question. After reading every answer and comment, I thought I could clarify a few details... This could be useful for people who get here through Google search.
The question is specifically about using extern functions, so I will ignore the use of extern with global variables.
Let's define 3 function prototypes:
// --------------------------------------
// Filename: "my_project.H"
extern int function_1(void);
static int function_2(void);
int function_3(void);
The header file can be used by the main source code as follows:
// --------------------------------------
// Filename: "my_project.C"
#include "my_project.H"
int main(void) {
    int v1 = function_1();
    int v2 = function_2();
    int v3 = function_3();
    return 0;
}
static int function_2(void) { return 1234; }
In order to compile and link, we must define function_2 in the same source code file where we call that function. The two other functions could be defined in different source code *.C or they may be located in any binary file (*.OBJ, *.LIB, *.DLL), for which we may not have the source code.
Let's include again the header my_project.H in a different *.C file to understand better the difference. In the same project, we add the following file:
// --------------------------------------
// Filename: "my_big_project_splitted.C"
#include "my_project.H"
void old_main_test(void){
int v1 = function_1();
int v2 = function_2();
int v3 = function_3();
}
static int function_2(void) { return 5678; }
int function_1(void) { return 12; }
int function_3(void) { return 34; }
Important features to notice:
When a function is declared as static in a header file, the compiler must find a definition of a function with that name in each module which uses that include file.
A function which is part of the C library can be replaced in only one module by redefining a prototype with static only in that module. For example, replace any call to malloc and free to add memory leak detection feature.
The specifier extern is not really needed for functions. When static is not found, a function is always assumed to be extern.
However, extern is not the default for variables. Normally, any header file that defines variables to be visible across many modules needs to use extern. The only exception would be if a header file is guaranteed to be included from one and only one module.
Many project managers would then require that such variables be placed at the beginning of the module, not inside any header file. Some large projects, such as the video game emulator "Mame", even require that such variables appear only above the first function using them.
In C, extern is implied for function prototypes, as a prototype declares a function which is defined somewhere else. In other words, a function prototype has external linkage by default; using extern is fine, but is redundant.
(If static linkage is required, the function must be declared as static both in its prototype and function header, and these should normally both be in the same .c file).
A very good article that I came across about the extern keyword, along with examples: http://www.geeksforgeeks.org/understanding-extern-keyword-in-c/
Though I do not agree that using extern in function declarations is redundant. This is supposed to be a compiler setting. So I recommend using the extern in the function declarations when it is needed.
If each file in your program is first compiled to an object file, then the object files are linked together, you need extern. It tells the compiler "This function exists, but the code for it is somewhere else. Don't panic."
All declarations of functions and variables in header files should be extern.
Exceptions to this rule are inline functions defined in the header and variables which - although defined in the header - will have to be local to the translation unit (the source file the header gets included into): these should be static.
In source files, extern shouldn't be used for functions and variables defined in the file. Just prefix local definitions with static and do nothing for shared definitions - they'll be external symbols by default.
The only reason to use extern at all in a source file is to declare functions and variables which are defined in other source files and for which no header file is provided.
Declaring function prototypes extern is actually unnecessary. Some people dislike it because it will just waste space and function declarations already have a tendency to overflow line limits. Others like it because this way, functions and variables can be treated the same way.
Functions actually defined in other source files should only be declared in headers. In this case, you should use extern when declaring the prototype in a header.
Most of the time, your functions will be one of the following (more like a best practice):
static (normal functions that aren't visible outside that .c file)
static inline (inlines from .c or .h files)
extern (declaration in headers of the next kind (see below))
[no keyword whatsoever] (normal functions meant to be accessed using extern declarations)
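The four kinds listed above could look roughly like this. Everything is shown in one file so that it compiles on its own; the file names in the comments (util.h, util.c) and all function names are hypothetical.

```c
#include <assert.h>

/* --- would live in util.h --- */

/* extern: declaration of a function defined in util.c.
 * The keyword is optional here; it only makes the intent explicit. */
extern int util_double(int v);

/* static inline: a small helper defined in the header itself.
 * Each translation unit that includes the header gets its own private copy. */
static inline int util_min(int a, int b)
{
    return a < b ? a : b;
}

/* --- would live in util.c --- */

/* static: helper that is not visible outside this .c file */
static int times_two(int v)
{
    return v * 2;
}

/* no keyword: external linkage by default, reachable through the
 * declaration in util.h */
int util_double(int v)
{
    return times_two(v);
}

/* Small self-check of the public pieces. */
void demo_function_kinds(void)
{
    assert(util_double(21) == 42);
    assert(util_min(3, 5) == 3);
}
```

Code in another source file could call util_double and util_min after including the header, but a call to times_two from there would fail to link because the function has internal linkage.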
When you have that function defined on a different dll or lib, so that the compiler defers to the linker to find it. Typical case is when you are calling functions from the OS API.

Should a function prototype always be in its header file?

Let's say we have a few C source files such as file1.c, file2.c and main.c. We have functions as:
file1.c
|---> file1Func1()
|---> file1Func2()
file2.c
|---> file2Func1()
|---> file2Func2()
and the main file uses these functions. Now it would be natural that I create and add respective function prototype in header files file1.h and file2.h, then include these headers in main.c to use the functions.
What if I have a very large project with over thousand source (C) files, should I always create a header (then add function prototype) for every source file. Then include the header to use the functions?
Or using extern for using a function defined elsewhere (in some other source file) and rely on linker to search and fetch the function from the object file during link time?
Note: using the latter approach triggers MISRA warning of no function prototype.
All functions that are part of the interface, that is, functions which are called by another module, should have function prototypes in the header file, preferably together with comments documenting how each function should be used.
Functions that are not part of the interface and only used internally within the file should not have a prototype in the header. For such functions, declare the prototype at the top of the c file, and declare it as static.
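A rough sketch of that split between interface and internals, with both parts in one file so it compiles stand-alone. The names shape.h, shape.c, shape_area and clamp_non_negative are invented for the example.

```c
#include <assert.h>

/* --- would live in shape.h: the public interface --- */

/* Returns width * height, treating negative inputs as 0. */
int shape_area(int width, int height);

/* --- would live in shape.c --- */

/* Private helper: prototype at the top of the .c file, declared static,
 * so it never appears in the header and cannot be called from other files. */
static int clamp_non_negative(int v);

int shape_area(int width, int height)
{
    return clamp_non_negative(width) * clamp_non_negative(height);
}

static int clamp_non_negative(int v)
{
    return v < 0 ? 0 : v;
}

/* Small self-check of the public function. */
void demo_shape_area(void)
{
    assert(shape_area(3, 4) == 12);
    assert(shape_area(-3, 4) == 0);
}
```

The static prototype at the top also lets shape_area call the helper before the helper's definition appears further down the file.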
This is how all (professional) C programs are written. As a side-note, this sound design is also required by MISRA-C.
There should never be a reason for you to use the extern keyword for functions. Note that a function prototype like
void func (void);
is completely equivalent to
extern void func (void);
If you need to use a function, include the relevant header.
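The equivalence of the two prototypes can be checked with a small sketch: both declarations are placed in the same file and the compiler accepts them as compatible declarations of one function. The counter func_calls and the function demo_equivalent_prototypes are hypothetical additions so that there is something to observe.

```c
#include <assert.h>

/* Counts calls so the effect of func() is observable (illustrative only). */
static int func_calls = 0;

void func(void);          /* prototype without extern */
extern void func(void);   /* the same prototype with extern: no conflict */

void func(void)
{
    func_calls++;
}

/* Small self-check: both declarations refer to the one definition above. */
void demo_equivalent_prototypes(void)
{
    func_calls = 0;
    func();
    func();
    assert(func_calls == 2);
}
```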
What if I have a very large project with over thousand source (c) files, should I always create a header (then add function prototype) for every source file. Then include the header to use the functions?
The short answer is "Yes".
The slightly longer answer is "Yes but you may omit functions from header files that are implementation details of other functions in a source file".
Declaring functions in header files and #includeing the header files makes sure that function definitions and function calls stay in sync. Otherwise, it is easy to make mistakes and those mistakes are caught at link time instead of at compile time.
should I always create a header (then add function prototype) for every source file.
The TL;DR; answer is Yes.
My personal opinion (and one that has been written into several company coding standards) is that each C Source file should have its own associated Header file to define the external interface.
Together, the C Source file and its associated Header file define the module - but only the Header file declares the interface.
All global objects (including function prototypes) should be declared in header file; I also advocate that the extern keyword should never(*) be used in a C Source file as this is (IMHO) breaking the declared interface for the module.
{*} OK, never is a strong word, and there may be exceptions... but they should be few and far between.

Do all C functions need to be declared in a header file

Do I need to declare all functions I use in a .c file in a header file, or can I just declare and define right there in the .c file? If so, does a definition in the .c file in this case count as the declaration also?
Do I need to declare all functions I use in a .c file in a header file,
or can I just declare and define right there in the .c file?
You used "use" in the first question and "define" in the next question. There is a difference.
void foo()
{
bar(10);
}
Here, foo is defined and bar is used. You should declare bar. If you don't declare bar, the compiler makes assumptions about its return type.
You can declare bar in the .c file or add the declaration in a .h file and #include the .h file in the .c file. Whether you use the first method or the second method is up to you. If you use the declaration in more than one .c file, it is better to put that in a .h file.
You can define foo without a declaration.
If so, does a definition in the .c file in this case count as the declaration also?
Every function definition counts as a declaration too.
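A short sketch of the difference: a function defined before its first use needs no separate declaration, while one that is used before its definition does. The function baz and the bodies below are made up for illustration; foo and bar follow the naming used above.

```c
#include <assert.h>

/* bar is defined before it is used. The definition also serves as its
 * declaration, so no separate prototype is needed. */
static int bar(int n)
{
    return n + 1;
}

/* baz is used before its definition, so it is declared first. */
static int baz(int n);

int foo(void)
{
    return bar(10) + baz(10);   /* 11 + 20 */
}

static int baz(int n)
{
    return n * 2;
}

/* Small self-check of the combined result. */
void demo_definition_as_declaration(void)
{
    assert(foo() == 31);
}
```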
For the compiler, it does not matter if a declaration occurs in a .h or a .c file, because the compiler sees the preprocessed form.
For the human developer reading and contributing to your code, it is much better (to avoid copy&pasting the same declaration twice) to put the declaration of any function used in more than one translation unit (i.e. .c file) in some #include-d header.
And you can define a function before using it.
BTW, with old compilers you might even avoid declaring a function that you are calling (it defaults to returning int for legacy reasons), but this is poor taste and an obsolete way of coding: implicit function declarations were removed from the language in C99, and most compilers emit a warning or an error in that case.
No, it is not necessary.
The reason of the header files is to separate the interface from the implementation. The header declares "what" a class (or whatever is being implemented) will do, while the .c file defines "how" it will perform those features.
This reduces dependencies so that code that uses the header doesn't necessarily need to know all the details of the implementation and any other classes/headers needed only for that. This will reduce compilation times and also the amount of recompilation needed when something in the implementation changes.
The answer to both questions is yes. You can declare C functions in both a header and a .c file. The same goes for definitions. However, if you define a function in a header file and include that header from more than one source file, you may get multiple-definition errors at link time.
By default functions have external linkage. It means that it is supposed that functions potentially will be used in several compilation units.
However sometimes some auxiliary functions that form implementations of other functions are not designed to be used in numerous compilation units. Such functions declared with keyword static have internal linkage.
Usually they are declared and defined inside some .c module and are not visible in other compilation units.
One occasion that requires functions to be declared in a separate header is when one is creating a library for other developers to use. Some libraries are distributed as closed source and they are provided to you as a library file (*.dll / *.so ...) and a header.
The header file would contain declarations of all publicly accessible functions and definitions of all publicly required structures, enums and datatypes etc.
Without this header file the 3rd party library user would not know how to interface with the library file and thus would not be able to link against it.
But for small, trivial C programs that are not intended for use by other people, no you can just dump everything into a C file and build it. Although you might curse yourself years later when you need to maintain that code :)

why should extern declaration be outside .c file ( as per linux coding style )

As per checkpatch.pl script "extern declaration be outside .c file"
(used to examine if a patch adheres coding style)
Note: this works perfectly fine without compilation warnings
The issue is solved by placing the extern declaration in .h file.
a.c
-----
int x;
...
b.c
----
extern int x;
==>checkpatch complains
a.h
-----
extern int x;
a.c
----
int x;
b.c
----
#include "a.h"
==> does not complain
I want to understand why this is better
My speculation.
Ideally the code is split into files so as to modularize the code (each file is a module)
The interface exported by the module is placed in the header files so that other modules (or .c files) can include them. so if any module wants to expose some variables externally, then one must add an extern declaration in a Header file corresponding to the module.
Again, having a header file corresponding to each module (.c file) seems like too many header files to have.
It would be even better to include the a.h in the a.c file as well. That way the compiler can verify that the declaration and the definition match each other.
a.h
-----
extern int x;
a.c
----
#include "a.h" <<--- add this
int x;
b.c
----
#include "a.h"
The reason for the rule is, as you assume, that we should use the compiler to check what we are doing. It is much better at the tiny details than we are.
If we allow extern declarations all over the place, we get in trouble if we ever want to change x to some other type. How many .c files do we have to scan to find all extern int x? Lots. And if we do, we will likely find some extern char x bugs as well. Oops!
Just having one declaration in a header file, and include it where needed, saves us a lot of trouble. In any real project, x will not be the only element in the header file anyway, so you are not saving on the file count.
I see two reasons:
If you share a variable, it's because it's not in your own file, so you want to make it clear that it's shared by adding the extern to a header file - that way, there is only one place [the include directory] to search for extern declarations.
It avoids someone making an extern declaration, and then someone else making a different (as in using different type or attributes) extern declaration for the same thing. At least if it's in a header file [that is relevant], all files use the same declaration.
If you ever decide to change the type, there are only two places to change. If you were to add a "c.c" file that also uses the same variable, and then decide that int is not good enough and you need long, you'd have to modify all three places, rather than two as you'd have if there was a header file included in each of "a.c", "b.c" and "c.c".
Having a header file for your module is definitely not a bad idea. But it could of course be acceptable, depending on the circumstances, to put the extern into some existing header file.
An alternative, that is quite often a better choice than using an extern, is to have a getter function that fetches your variable for you. That way, the variable can be static in its own source file [no "namespace pollution"], and the type of the variable is also much more well defined - the compiler can detect if you are trying to use it wrongly.
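A minimal sketch of the getter approach, again with all parts in one file so it compiles on its own. The names get_x and b_uses_x are hypothetical, and the setter set_x is an extra assumption that goes beyond the getter mentioned above; leave it out if other files should only read the value.

```c
#include <assert.h>

/* --- would live in a.h: the interface exposes functions, not the variable --- */
int get_x(void);
void set_x(int value);   /* assumption: writers are allowed too */

/* --- would live in a.c --- */

/* static gives x internal linkage: no other file can name it directly. */
static int x = 0;

int get_x(void)
{
    return x;
}

void set_x(int value)
{
    x = value;
}

/* --- would live in b.c, which only includes a.h --- */
int b_uses_x(void)
{
    set_x(get_x() + 5);
    return get_x();
}

/* Small self-check of the accessor pair. */
void demo_getter(void)
{
    set_x(0);
    assert(b_uses_x() == 5);
    assert(get_x() == 5);
}
```

If the type of x ever changes, only a.c and the two prototypes in a.h need to be touched, and every caller is type-checked against those prototypes.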
Edit: I should point out that Linux coding style is the way it is for "good" reasons, but it doesn't mean that code that isn't part of the Linux source code can't break those rules in various ways. I certainly don't write my own code using the formatting of Linux - I like extra { } around single statements, and I (nearly) always put { on a new line, in line with whatever the brace belongs to, and the } in the same column again.
One reason I always place the extern declarations in the .h is to prevent code duplication, especially if there are, or may be, more bits of code using your "a.c" code and having to access the "x". In that case all files would have to have the extern declaration.
Another reason is that the extern declaration is part of the interface of the module and as such I would keep it, together with any other interface information in the header file.
Your speculation is right: for maximal code reuse and consistency, the (public) declarations must be put into header files.
Again, having a header file corresponding to each module (.c file) seems like too many header files to have.
Then get used to it. It's a logical concept and a good practice to adopt.
You have got the reason right as to why extern declarations must be placed in a header file: so that they can be accessed across different translation units easily.
Also, it is not necessary that each .c file should have a corresponding .h file. One .h file can correspond to a decent number of .c files depending upon your module segregation design.
Again, having a header file corresponding to each module (.c file) seems like too many header files to have.
As you have said, the idea of a header file is simple. They contain the public interface that a module wants to export (make available) to other modules (contained in other .c files). This can include structures and types and function declarations. Now, if a module defines a variable which it wants to make available to other modules, it makes sense for it to be included with its other public parts in the header file. This is why externs end up in the header file. They are just a part of the things that the module wants to make public. Then anyone can include this public interface by simply including the header file.
Having a .h file per .c file may seem like much, but it may be the right thing to do. But keep in mind that a module may implement its code in multiple .c files, and choose to export its aggregate public interface in a single .h file. So, it is not really a strict one to one thing. The real abstraction is that of the public interface offered by a module.

