Should variable definition be in header files? - c

My very basic knowledge of C and the compilation process has gone rusty lately. I was trying to figure out the answer to the following question, but I could not connect the basics of the preprocessing, compilation and linking phases. A quick Google search did not help much either. So, I decided to come to the ultimate source of knowledge :)
I know: Variables should not be defined in .h files. It's OK to declare them there.
Why: Because a header file might get included from multiple places, thus defining the variable more than once (the linker gives the error).
Possible work-around: Use header guards in the header file and define the variable there.
Is it really a solution: No. Because header guards act in the preprocessing phase. They tell the compiler that this part has already been included and should not be included again. But our multiple-definition error comes from the linker, long after compilation.
This whole thing has got me confused about how preprocessing and linking work. I thought that preprocessing would simply not include the code if the header guard symbol had been defined. In that case, shouldn't the multiple-definition problem also be solved?
How is it that these preprocessing directives save the compilation process from redefining symbols under header guards, but the linker still gets multiple definitions of the symbol?

One thing that I've used in the past (when global variables were in vogue):
var.h file:
...
#ifdef DEFINE_GLOBALS
#define EXTERN
#else
#define EXTERN extern
#endif
EXTERN int global1;
EXTERN int global2;
...
Then in one .c file (usually the one containing main()):
#define DEFINE_GLOBALS
#include "var.h"
The rest of the source files just include "var.h" normally.
Notice that DEFINE_GLOBALS is not a header guard; it switches the same lines between definitions (in the one file that defines it) and extern declarations (everywhere else). This technique keeps a single copy of the declarations/definitions.
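To make the expansion concrete, here is a minimal sketch of what the one defining file looks like once the header text is pasted in. The contents of var.h are shown inline so the block compiles on its own; the VAR_H guard and the sum_globals helper are additions for illustration and are not part of the snippet above.

```c
/* main.c: the single file that defines DEFINE_GLOBALS before the include */
#define DEFINE_GLOBALS

/* ---- var.h, shown inline where #include "var.h" would normally go ---- */
#ifndef VAR_H            /* include guard: an addition, not in the original */
#define VAR_H

#ifdef DEFINE_GLOBALS
#define EXTERN           /* expands to nothing: the lines below define */
#else
#define EXTERN extern    /* every other file: the lines below only declare */
#endif

EXTERN int global1;
EXTERN int global2;

#endif /* VAR_H */
/* ---- end of var.h ---- */

/* Hypothetical helper showing that the globals are usable here. */
int sum_globals(void)
{
    return global1 + global2;
}
```

In every other source file DEFINE_GLOBALS is left undefined, so the same two lines expand to extern int global1; and extern int global2; and no storage is allocated there.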

A header guard protects you from multiple inclusions in a single source file, not from inclusion in multiple source files. I guess your problem stems from not understanding this concept.
It is not that preprocessor guards save you from this problem at compile time. At compile time, only one source file at a time gets compiled into an object file, and symbol definitions are not resolved. At link time, when the linker tries to resolve the symbol definitions, it sees more than one definition and flags the error.

You have two .c files. They get compiled separately. Each one includes your header file. Once. Each one gets a definition. They conflict at link time.
The conventional solution is:
#ifdef DEFINE_SOMETHING
int something = 0;
#else
extern int something; /* other files only see the declaration */
#endif
Then you #define DEFINE_SOMETHING in only one .c file.

Header guards stop a header file being included multiple times in the same translation unit (i.e. in the same .c source file). They have no effect if you include the file in two or more translation units.
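A small sketch of what the guard does inside one translation unit: the header text is pasted twice, as two #include lines would do, and the second copy is skipped. The header name point.h, the struct and the point_sum function are hypothetical names chosen for illustration.

```c
/* First #include "point.h": POINT_H is not defined yet, so the body is kept. */
#ifndef POINT_H
#define POINT_H
struct point {
    int x;
    int y;
};
#endif /* POINT_H */

/* Second #include "point.h" in the same file: POINT_H is now defined,
 * so the preprocessor drops everything up to the matching #endif.
 * Without the guard, this would be a redefinition of struct point. */
#ifndef POINT_H
#define POINT_H
struct point {
    int x;
    int y;
};
#endif /* POINT_H */

int point_sum(struct point p)
{
    return p.x + p.y;
}
```

The macro POINT_H only exists while this one file is being preprocessed. A second .c file starts with POINT_H undefined again, which is why a variable definition placed inside the guard would still end up in every object file.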

Related

is it possible to have only header file in C without source file

I would like to write a C library with fast access by including just header files without using compiled library. For that I have included my code directly in my header file.
The header file contains:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#ifndef INC_TEST_H_
#define INC_TEST_H_
void test(){
printf("hello\n");
}
#endif
My program doesn't compile because I have multiple reference to function test(). If I add a proper source file alongside my header, it works without error.
Is it possible to use only header file by including code inside in a C app?
Including code in a header is generally a really bad idea.
If you have file1.c and file2.c, and in each of them you include your coded.h, then at the link part of the compilation, there will be 2 test functions with global scope (one in file1.c and the other one in file2.c).
You can use the word "static" in order to say that the function will be restricted so it is only visible in the .c file which includes coded.h, but then again, it's a bad idea.
Last but not least: how do you intend to make a library without a .so/.a file? This is not a library; this is copy/paste code directly in your project.
And when a bug is found in your "library", you will be left with no option other than correcting your code, redistributing it to every project, and recompiling every project, missing the very point of a dynamic library: the ability to "just" fix the library without touching every program using it.
If I understand what you're asking correctly, you want to create a "library" which is strictly source code that gets #included as necessary, rather than compiled separately and linked.
As you have discovered, this is not easy when you're dealing with functions - the compiler complains of multiple definitions (you will have the same problem with object definitions).
You have a couple of options at this point.
You could declare the function static:
static void test( void )
{
...
}
The static keyword limits the function's visibility to the current translation unit, so you don't run into multiple definition errors at link time. It means that each translation unit is creating its own separate "instance" of the function, leading to a bit of code bloat and slightly longer build times. If you can live with that, this is the easiest solution.
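As a rough sketch of the static approach, a header-only file could look like the following. The guard name is taken from the question; clamp_int is a hypothetical second function added only to have something with a return value, and writing static inline (C99 and later) rather than plain static is a common variation, not something the answer prescribes.

```c
/* test.h: header-only sketch; every .c file that includes it
 * gets its own private copy of these functions. */
#ifndef INC_TEST_H_
#define INC_TEST_H_

#include <stdio.h>

/* Internal linkage: no symbol named test is exported to the linker. */
static inline void test(void)
{
    printf("hello\n");
}

/* Hypothetical helper: limit value to the range [low, high]. */
static inline int clamp_int(int value, int low, int high)
{
    if (value < low) {
        return low;
    }
    if (value > high) {
        return high;
    }
    return value;
}

#endif /* INC_TEST_H_ */
```

Because each function has internal linkage, two object files that both include this header no longer clash at link time.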
You could use a macro in place of a function:
#define TEST() (printf( "hello\n" ))
except that macros are not functions and do not behave like functions. While macro-based "libraries" do exist, they are not trivial to implement correctly and require quite a bit of thought. Remember that macro arguments are not evaluated once and passed in; they are substituted textually wherever the parameter appears, which can lead to problems if you pass expressions with side effects. The classic example is:
#define SQUARE(x) ((x)*(x))
...
y = SQUARE(z++);
SQUARE(z++) expands to ((z++)*(z++)), which leads to undefined behavior.
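For comparison, a sketch of the same operation written as a static inline function, which evaluates its argument exactly once. The name square is a hypothetical replacement for the SQUARE macro.

```c
/* Function version of SQUARE: the argument expression is evaluated once,
 * and its value is copied into the parameter x. */
static inline int square(int x)
{
    return x * x;
}
```

With this version, y = square(z++); is well defined: z is incremented once and y receives the square of the old value of z.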
Separate compilation is a Good Thing, and you should not try to avoid it. Doing everything in one source file is not scalable, and leads to maintenance headaches.
My program doesn't compile because I have multiple reference to function test()
That is because the .h file with the function is included and compiled in multiple C source files. As a result, the linker encounters the function with global scope multiple times.
You could have defined the function as static, which means it will have scope only for the current compilation unit, so:
static void test()
{
printf("hello\n");
}

Best practice for using includes in C

I am learning C and I am unsure where to include files. Basically I can do this in .c or in .h files:
Option 1
test.h
int my_func(char **var);
test.c
#include <stdio.h>
#include "test.h"
int my_func(char **var) {printf("%s\n", "foo"); return 0;}
int main() {...}
Option 2
test.h
#include <stdio.h>
int my_func(char **var);
test.c
#include "test.h"
int my_func(char **var) {printf("%s\n", "foo"); return 0;}
int main() {...}
With option 2 I would only need to include test.h in whatever .c file I need the library. Most of the examples I see use option 1.
Are there some general rules when to do what or is this a question of personal preferences?
Don't include headers you don't need.
I'd choose something like "Option 1". Why "something like"? Because I'd create a separate file for main and I'd keep all declarations inside the .h and all definitions inside the corresponding .c.
Of course, both options are valid.
If you want to include only one header in your main, you can just create a header file, containing only includes - that's a common practice. This way, you can include only one header, instead of several.
I tend to prefer Option 1, as cyclic dependencies will come and bite you very quickly in option 2, and reducing the input size is the best way to guarantee faster compile times. Option 2 tends towards including everything everywhere, whether you really need it or not.
That said, it might be best to experiment a little with what works for structuring your projects. Hard and fast rules tend to not apply universally to these kinds of questions.
Both options are correct. The C standard allows both solutions.
All C standard headers must be made such that they can be included several times and in any order:
Standard headers may be included in any order; each may be included
more than once in a given scope, with no effect different from being
included only once
(From Preprocessor #ifndef)
I don't think that there is a universal rule (from a "grammatical" point of view, both your options are correct and will work). What is often done is to include in the .h file the headers you need for the library (as in your option 2), either because you'll need them when working with the library (thus avoiding always including the same set of header files in your .c files, which is more error prone), or because they are mentioned in the .h file itself (e.g., if you use an int32_t as a type in the function prototypes of the .h file, you will of course need to include <stdint.h> in the .h file).
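A minimal sketch of that split, with both files shown in one block so it compiles on its own: the header includes only what its own prototypes mention, and the implementation includes what only the function body needs. The file names and the count_chars function are hypothetical.

```c
/* ---- test.h ---- */
#ifndef TEST_H
#define TEST_H

#include <stdint.h>   /* needed here: int32_t appears in the prototype */

int32_t count_chars(const char *s);

#endif /* TEST_H */

/* ---- test.c (would start with #include "test.h") ---- */
#include <string.h>   /* needed only by the implementation, for strlen */

int32_t count_chars(const char *s)
{
    /* Assumes the string is short enough for its length to fit in int32_t. */
    return (int32_t)strlen(s);
}
```

A caller that includes test.h gets int32_t along with the prototype, but is not forced to pull in <string.h>.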
I prefer to put includes in the .c file.
If your program is getting bigger, you might forget to include something in one header file and not notice, because it is already included by another header you use.
By including them in the .c file you won't lose includes while editing other files.
I prefer Option 1. I want to know what I used in my project, and over time Option 1 works out better than Option 2 in both compile time and efficiency.
There is no rule that says you have to follow a particular fashion. Whether or not you include it in test.c does not matter much, provided you include it in test.h and include test.h in test.c. Hope you are clear with that.
This is because of preprocessor directives like #ifndef, #define, #endif. Together they are called include guards. They are used in the built-in (standard) header files. However, when you include a file written by you, either go with your option 2, or use include guards to be safe.
The include guards work as follows.
#ifndef ANYTHING
#define ANYTHING
..
..
..
#endif
So when you include the file for the first time, ANYTHING is not yet defined, so the #ifndef test is true, ANYTHING gets defined, and the rest of the file is processed. If the same file is included again (by mistake, or indirectly through another header), the #ifndef test is false since ANYTHING is now defined, and so the contents are skipped. This protection is necessary to avoid duplicate definitions (for example of a struct type) from the header, which would give you a compilation error.
Hope that helps
Cheers

why should extern declaration be outside .c file ( as per linux coding style )

As per checkpatch.pl script "extern declaration be outside .c file"
(used to examine if a patch adheres coding style)
Note: this works perfectly fine without compilation warnings
The issue is solved by placing the extern declaration in .h file.
a.c
-----
int x;
...
b.c
----
extern int x;
==>checkpatch complains
a.h
-----
extern int x;
a.c
----
int x;
b.c
----
#include "a.h"
==> does not complain
I want to understand why this is better
My speculation.
Ideally the code is split into files so as to modularize the code (each file is a module)
The interface exported by the module is placed in the header files so that other modules (or .c files) can include them. So if any module wants to expose some variables externally, one must add an extern declaration in a header file corresponding to the module.
Again, having a header file corresponding to each module (.c file) seems like too many header files to have.
It would be even better to include the a.h in the a.c file as well. That way the compiler can verify that the declaration and the definition match each other.
a.h
-----
extern int x;
a.c
----
#include "a.h" <<--- add this
int x;
b.c
----
#include "a.h"
The reason for the rule is, as you assume, that we should use the compiler to check what we are doing. It is much better with the tiny details.
If we allow extern declarations all over the place, we get in trouble if we ever want to change x to some other type. How many .c files do we have to scan to find all extern int x? Lots. And if we do, we will likely find some extern char x bugs as well. Oops!
Just having one declaration in a header file, and include it where needed, saves us a lot of trouble. In any real project, x will not be the only element in the header file anyway, so you are not saving on the file count.
I see two reasons:
If you share a variable, it's because it's not in your own file, so you want to make it clear that it's shared by adding the extern to a header file - that way, there is only one place [the include directory] to search for extern declarations.
It avoids someone making an extern declaration, and then someone else making a different (as in using different type or attributes) extern declaration for the same thing. At least if it's in a header file [that is relevant], all files use the same declaration.
If you ever decide to change the type, there are only two places to change. If you were to add a "c.c" file that also uses the same variable, and then decide that int is not good enough and you need long, you'd have to modify all three places, rather than two as you'd have if there was a header file included in each of "a.c", "b.c" and "c.c".
Having a header file for your module is definitely not a bad idea. But it could of course be acceptable, depending on the circumstances, to put the extern into some existing header file.
An alternative, that is quite often a better choice than using an extern, is to have a getter function that fetches your variable for you. That way, the variable can be static in its own source file [no "namespace pollution"], and the type of the variable is also much better defined: the compiler can detect if you are trying to use it wrongly.
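A sketch of the getter approach, with a.h and a.c shown together in one block. The names get_x and set_x are hypothetical, and the setter is an optional addition for cases where other files also need to modify the value.

```c
/* ---- a.h: the public interface of the module ---- */
#ifndef A_H
#define A_H

int get_x(void);
void set_x(int value);

#endif /* A_H */

/* ---- a.c (would start with #include "a.h") ---- */

/* static: x is invisible to the linker, so no other file can
 * declare or touch it directly. */
static int x;

int get_x(void)
{
    return x;
}

void set_x(int value)
{
    x = value;
}
```

If the type of x ever changes, the prototypes in a.h change with it, and every caller is re-checked by the compiler through that one header.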
Edit: I should point out that Linux coding style is the way it is for "good" reasons, but it doesn't mean that code that isn't part of the Linux source code can't break those rules in various ways. I certainly don't write my own code using the formatting of Linux - I like extra { } around single statements, and I (nearly) always put { on a new line, in line with whatever the brace belongs to, and the } in the same column again.
One reason I always place the extern declarations in the .h is to prevent code duplication, especially if there are, or may be, more bits of code using your "a.c" code and having to access the "x". In that case all files would have to have the extern declaration.
Another reason is that the extern declaration is part of the interface of the module and as such I would keep it, together with any other interface information in the header file.
Your speculation is right: for maximal code reuse and consistency, the (public) declarations must be put into header files.
Again, having a header file corresponding to each module (.c file) seems like too many header files to have.
Then get used to it. It's a logical concept and a good practice to adopt.
You have got the reason right as to why extern declarations must be placed in a header file: so that they can be accessed across different translation units easily.
Also, it is not necessary that each .c file should have a corresponding .h file. One .h file can correspond to a decent number of .c files depending upon your module segregation design.
Again, having a header file corresponding to each module (.c file) seems like too many header files to have.
As you have said, the idea of a header file is simple. It contains the public interface that a module wants to export (make available) to other modules (contained in other .c files). This can include structures and types and function declarations. Now, if a module defines a variable which it wants to make available to other modules, it makes sense for it to be included with its other public parts in the header file. This is why externs end up in the header file. They are just a part of the things that the module wants to make public. Then anyone can include this public interface by simply including the header file.
Having a .h file per .c file may seem like much, but it may be the right thing to do. But keep in mind that a module may implement its code in multiple .c files, and choose to export its aggregate public interface in a single .h file. So, it is not really a strict one to one thing. The real abstraction is that of the public interface offered by a module.

C literal constants : in header or C file?

I'd like to include a bunch of data in a single static C program (say, images, but also other data, embedded in the executable since I'm working on an embedded platform without files).
Thus, I wrote a little img2c tool that creates const data from my data files, producing a file of static const array initializers to be put in flash (using nice C99 features).
My question is: should I put them in a .h file, as I've seen many times (for example, GIMP can save as .h files, not .c files), or in a .c file, referenced in a header with just the const extern declaration for further references, without having to include all the data, pass it all to the compiler, and redeclare it each time I use it?
Preprocessor macros are out of the question, since I'll reference their address, not include the whole data each time.
If you put the data in a header every compilation unit that pulls in that header will get its own copy of the data. Imagine two .c files that each go to a .o. Each .o will have a copy of the data and your final executable can be bigger than it needs to be.
If you put it in a .c and extern it in a header, only the one .o will contain the data and your final executable can be smaller. Also, if you change things the recompile can be quicker if it's just a change to a single .c rather than all the .c files that include your header.
As you noted, you may also run into problems with the linker, as symbols will be defined multiple times, see the answers to Repeated Multiple Definition Errors from including same header in multiple cpps. It's going to be better all around to put an extern in the header and the data in a .c
Header files in C are nothing special; the .h extension won't change how the compiler handles them. It's more of a hint for humans "this file probably doesn't contain any code".
So if you put actual binary data in there, the compiler will create a copy of the array in each file in which you include the header (instead of simply adding a reference to a shared global array).
GIMP creates a header file because it doesn't know how you plan to use the data. The idea is that you'll include this header file exactly once in a .c file which then processes the data in some way. If it wrote a .c file and you made changes to the code, GIMP would have to merge the changes when you ask it to update the data - it would be messy.
As with everything in C, there is some debate as to best practice here. Common practice is to put the actual values in your implementation (.c) and the declarations (extern something something) in the header (.h). That way, you can update the values without having to recompile every file that includes the header.
The answer is almost never "redeclare it each time I use it."
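A minimal sketch of that layout for embedded data, with the header and source shown in one block. The file names, the identifiers logo_data and logo_size, and the four byte values are all placeholders for illustration, not real image data.

```c
/* ---- img_data.h: declarations only, cheap to include anywhere ---- */
#ifndef IMG_DATA_H
#define IMG_DATA_H

#include <stddef.h>
#include <stdint.h>

extern const uint8_t logo_data[];  /* size unknown to includers */
extern const size_t  logo_size;    /* so the length is exported separately */

#endif /* IMG_DATA_H */

/* ---- img_data.c (would start with #include "img_data.h") ---- */

/* The only place the bytes appear; placeholder values. */
const uint8_t logo_data[] = { 0x89, 0x50, 0x4E, 0x47 };

/* sizeof works here because the complete array type is visible. */
const size_t logo_size = sizeof logo_data;
```

Other files see logo_data as an array of unknown length, so sizeof cannot be applied to it there; exporting logo_size from the defining file is one way around that.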
This can be done by making sure that the variable is only defined in a single source file. For this a little preprocessor "programming" is needed.
Header file:
/* Standard include guard */
#ifndef X_H
#define X_H
#include <stdint.h> /* for uint8_t */
#ifdef X_SOURCE
uint8_t data[] = { /* ... */ };
#else
extern uint8_t data[];
#endif
#endif /* End of include guard */
Source file:
#define X_SOURCE
#include "x.h"
/* ... */
All other source files just need to include the file "x.h" and they can reference data.

What's the difference between using extern and #including header files?

I am beginning to question the usefulness of "extern" keyword which is used to access variables/functions in other modules(in other files). Aren't we doing the same thing when we are using #include preprocessor to import a header file with variables/functions prototypes or function/variables definitions?
extern is needed because it declares that the symbol exists and is of a certain type, and does not allocate storage for it.
If you do:
int foo;
In a header file that is shared between several source files, you will get a linker error because each source would have its own copy of foo created and the linker will be unable to resolve the symbol.
Instead, if you have:
extern int foo;
In the header, it would declare, in each source file that includes it, a symbol that is defined elsewhere.
One (and only one) source file would contain
int foo;
which creates a single instance of foo for the linker to resolve.
No. The #include is a preprocessor command that says "put all of the text from this other file right here". So, all of the functions and variables in the included file are defined in the current file.
The #include preprocessor directive simply copy/pastes the text of the included file into the current position in the current file.
extern marks that a variable or function exists externally to this source file. This is done by the originator ("I am making this data available externally"), and by the recipient ("I am marking that there is external data I need"). A recipient with an unsatisfied extern will cause an Undefined Symbol error.
Which to use? I prefer using #include with the include guard pattern:
#ifndef HEADER_NAME_H
#define HEADER_NAME_H
<write your header code here>
#endif
This pattern allows you to cleanly separate anything you want an outsider to have access to into the header, without worrying about a double-include error. Any time I have to open a .c file to find what externs are available, the lack of a clear interface makes my soul gem crack.
There are indeed two ways of using functions/variables across translation units (a translation unit is usually a *.c/*.cc file).
One is the forward declaration:
Declare functions/variables using extern in the calling file. extern is actually optional for functions (functions are automatically extern), but not for variables.
Implement the function/variables in the implementing file.
The other is using header files:
Declare functions/variables using extern in a header file (*.h/*.hh). Still, extern is optional for functions, but not for variables. So you don't normally see extern before functions in header files.
In the calling *.c/*.cc file, #include the header, and call the function/variable as needed.
In the implementing *.c/*.cc file, #include the header, and implement the function/variable.
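The three header-file steps above can be sketched as follows, with all three files shown in one block. The file names, add, call_count and add_twice are hypothetical names for illustration.

```c
/* ---- math_utils.h: step 1, declarations ---- */
#ifndef MATH_UTILS_H
#define MATH_UTILS_H

int add(int a, int b);     /* extern is implied for function declarations */
extern int call_count;     /* extern is required: declaration only */

#endif /* MATH_UTILS_H */

/* ---- math_utils.c: step 3, would #include "math_utils.h" and define ---- */
int call_count = 0;        /* the one and only definition */

int add(int a, int b)
{
    call_count++;
    return a + b;
}

/* ---- caller.c: step 2, would #include "math_utils.h" and use ---- */
int add_twice(int a, int b)
{
    return add(a, b) + add(a, b);
}
```

Because the implementing file includes the same header as the callers, a mismatch between the declaration and the definition is reported by the compiler.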
Google C++ style guide has some good discussions on the pros and cons of the two approaches.
Personally, I would prefer the header file approach, as it is the single place (the header file) a function signature is defined, calling and implementation all adhere to this one piece of definition. Thus, there would be no unnecessary discrepancies that might occur in the forward declaration approach.

Resources