When to use include guards? - c

I know that include guards in header files are meant to prevent something from being defined twice. However, the following code sample compiled without any problem:
foo.c
#include <stdio.h>
#include <string.h>
#include "bar.h"
int main(void) {
    printf("%zu", strlen("Test String"));  /* strlen returns size_t, so use %zu */
    somefunc("Some test string...");
    return 0;
}
bar.h
#ifndef BAR_H_INCLUDED
#define BAR_H_INCLUDED
void somefunc(char str[]);
#endif
bar.c
#include <stdio.h>
#include <string.h>
#include "bar.h"
void somefunc(char str[]) {
    /* strlen returns size_t, so the matching conversion is %zu */
    printf("Some string length function: %zu", strlen(str));
}
The above snippets are compiled with gcc -Wall foo.c bar.c -o foo and there is no error. However, both <stdio.h> and <string.h> were included without an include guard. There is still no error when I strip bar.h down to the single statement void somefunc(char str[]);. Why is there no error?

Firstly, the primary purpose of include guards is to prevent something from being declared twice in the same translation unit. The "in the same translation unit" part is the key here. Your experiment with two different translation units has nothing to do with the purpose of include guards. It does not demonstrate anything. It is not even remotely related.
In order to take advantage of include guards you have to include (explicitly or implicitly) the same header file twice into one implementation file.
Secondly, just because some header file has no include guards and just because you included that header file twice into the same translation unit does not mean that it will necessarily trigger errors. In order to lead to errors the header must contain declarations of a specific "non-repeatable" kind. Not every header contains such offending declarations. Not every declaration is offending in this sense.
Your bar.h (as posted) is actually harmless. Formally, you don't need include guards in your bar.h. It has a single function declaration, which can be repeated many times in one translation unit. So, including this header multiple times will not lead to errors.
But add something like this to your bar.h:
struct SomeStruct
{
int i;
};
and then just include it twice in the same implementation file, and you will end up with an error. This error is what include guards are intended to prevent. The language prohibits repeating full declarations of the same struct type in the same translation unit.
Include guards are typically placed in header files unconditionally. They are, I'm quite sure, present inside <stdio.h> and <string.h> as well. It is unclear why you claim that these headers "were included without an include guard". Did you check inside these files? In any case, again, your experiment with two different translation units does not demonstrate anything relevant anyway.

Duplicate declarations aren't a problem; duplicate (type) definitions are. If the bar.h header contained, for example:
enum FooBar { FOO, BAR, BAZ, QUX };
then including that twice in a single TU (translation unit — source file plus included headers) would give an error in general.
Also, the multiple inclusion scenario isn't what you show. What might cause the trouble is the following, assuming that there are no header guards in the header files:
bar.h
enum FooBar { FOO, BAR, BAZ, QUX };
void somefunc(char str[]);
quack.h
#include "bar.h"
extern enum FooBar foobar_translate(const char *str);
main.c
#include "bar.h"
#include "quack.h"
…
Note that GCC has an option -Wredundant-decls to identify redundant declarations — where the same declaration is present several times (normally from multiple files, but also if the same declaration is present twice in a single file).
Prior to C11, you could not repeat a typedef (at file scope; you always could hide an outer typedef in a block scope). C11 relaxes that constraint:
§6.7 Declarations
¶3 If an identifier has no linkage, there shall be no more than one declaration of the identifier
(in a declarator or type specifier) with the same scope and in the same name space, except
that:
a typedef name may be redefined to denote the same type as it currently does,
provided that type is not a variably modified type;
tags may be redeclared as specified in 6.7.2.3.
However, you still can't define a structure type twice in a single scope of a TU, so notations such as:
typedef struct StructTag { … } StructTag;
must be protected with header guards. You don't have this problem if you use opaque (incomplete) types:
typedef struct StructTag StructTag;
As to why you can include standard headers, that's because the standard requires that you can:
§7.1.2 Standard headers
¶4 Standard headers may be included in any order; each may be included more than once in
a given scope, with no effect different from being included only once, except that the
effect of including <assert.h> depends on the definition of NDEBUG (see 7.2). If
used, a header shall be included outside of any external declaration or definition, and it
shall first be included before the first reference to any of the functions or objects it
declares, or to any of the types or macros it defines. However, if an identifier is declared
or defined in more than one header, the second and subsequent associated headers may be
included after the initial reference to the identifier. The program shall not have any
macros with names lexically identical to keywords currently defined prior to the inclusion
of the header or when any macro defined in the header is expanded.
Using header guards allows you to make your headers meet the same standard that the standard headers meet.
See also other diatribes (answers) on the general subject, including:
Should I use #include in headers
How do I use extern to share variables between source files — the information is a fair way down my humungous answer in a section on header guards.
Repeated typedefs — invalid in C but valid in C++
How to link multiple implementation files in C
Linking against a static library — contains a script chkhdr which I use to check headers for idempotency and self-containment.

The reason there is no error in your code is that your header file is declaring but not defining somefunc(). Multiple declarations of something are fine, as long as they are not definitions - the compiler can accept seeing something declared more than once (as long as the declarations are compatible, of course).
Generally speaking, include guards are needed to avoid circular dependencies between header files, such as
Header A and header B mutually include each other in some situations. An include guard is needed in at least one of the headers to prevent infinite looping in the preprocessor.
Header A includes header B because it depends on a definition (e.g. of an inline function, a typedef) within it, but other header files OR compilation units may include header A, header B, or both in different circumstances. Include guards are needed to prevent multiple definitions in compilation units that include both those headers. At a minimum, the header file containing the definitions needs to have an include guard.
Since header files can include each other in any order, problems like those addressed above can become quite complicated.
Preventing some types of multiple definition is a side-effect of the above, but is not the primary purpose of include guards.
Notwithstanding all the above, the rule of thumb I work by with header files is "use include guards in all header files unless I have a particular reason not to". By doing that, all the potential problems associated with not providing include guards are avoided. The circumstances in which it is necessary to avoid an include guard (such as the header declaring or defining different things, dependent on a macro being defined/undefined in the compilation unit) are relatively rare in practice. And, if you are using techniques which require such things, you should already know that you should not be using include guards in the affected headers.

Why? Because:
int char_to_int(char* value, int *res);
int char_to_int(char* value, int *res);
int char_to_int(char* value, int *res);
int char_to_int(char* value, int *res);
int char_to_int(char* value, int *res);
int char_to_int(char* value, int *res);
int char_to_int(char* value, int *res);
int char_to_int(char* value, int *res);
int char_to_int(char* value, int *res);
int char_to_int(char* value, int *res);
int char_to_int(char* value, int *res);
int char_to_int(char* value, int *res);
int char_to_int(char* value, int *res);
int char_to_int(char* value, int *res);
int char_to_int(char* value, int *res);
int char_to_int(char* value, int *res)
{
    // do something
    return 0;
}
Repeated function prototypes do not cause an error as long as they are all identical and agree with the function definition.

C: Extern variable declaration and include guards [closed]

I've had the same problem as described in these two posts (First and Second) regarding declaration of variables in header files. The solution listed works well for me, but nonetheless I have a basic question on the solution:
Why would an include guard still not solve this issue? I would expect the include guard to prevent my variable from being declared multiple times if I include the same header file multiple times.
Include guards are useful for preventing multiple declarations or type definitions in a single translation unit, i.e. a .c file that is compiled by itself along with all of the headers it includes.
Suppose you have the following headers without include guards:
a.h:
struct test {
int i;
};
struct test t1;
b.h:
#include "a.h"
struct test *get_struct(void);
And the following main file:
main.c:
#include <stdio.h>
#include "a.h"
#include "b.h"
int main()
{
struct test *t = get_struct();
printf("t->i=%d\n", t->i);
return 0;
}
When the preprocessor runs on main.c, the resulting file will look something like this (neglecting the contents of stdio.h):
struct test {
int i;
};
struct test t1;
struct test {
int i;
};
struct test t1;
struct test *get_struct(void);
int main()
{
struct test *t = get_struct();
printf("t->i=%d\n", t->i);
return 0;
}
Because main.c includes a.h and b.h, and because b.h also includes a.h, the contents of a.h appear twice. This causes struct test to be defined twice which is an error. There is no problem however with the variable t1 because each constitutes a tentative definition, and multiple tentative definitions in a translation unit are combined to refer to a single object defined in the resulting main.o.
By adding include guards to a.h:
#ifndef A_H
#define A_H
struct test {
int i;
};
struct test t1;
#endif
The resulting preprocessor output would be:
struct test {
int i;
};
struct test *get_struct(void);
int main()
{
struct test *t = get_struct();
printf("t->i=%d\n", t->i);
return 0;
}
This prevents the duplicate struct definition.
But now let's look at b.c which constitutes a separate translation unit:
b.c:
#include "b.h"
struct test *get_struct(void)
{
return &t1;
}
After the preprocessor runs we have:
struct test {
int i;
};
struct test t1;
struct test *get_struct(void);
struct test *get_struct(void)
{
return &t1;
}
This file will compile fine since there is one definition of struct test and a tentative definition of t1 gives us an object defined in b.o.
Now we link main.o and b.o. The linker sees that both main.o and b.o contain an object called t1, so the linking fails because it was defined multiple times.
Note here that while the include guards prevent a definition from appearing more than once in a single translation unit, they don't prevent it from happening across multiple translation units.
This is why t1 should have an external declaration in a.h:
extern struct test t1;
And a non-extern declaration in one .c file.
Never define data or functions (except static inline) in the header files. Always do it in the .c source files. If you want to make them visible in other compilation units declare them as extern in the header file.
Guards do not protect you if you include the same .h file in many compilation units which are then linked together.
An include guard protects you when the same header is included several times: its contents are processed only once, at the first point of inclusion. The declarations between the guard macros are therefore not repeated and cannot produce errors about duplicate definitions. Such errors do not normally arise from a declaration like
extern type_of_variable variable_name;
which you can repeat as many times as you like without any complaint from the compiler. The problem has more to do with type definitions or static function implementations placed in the header.
But why don't you post a complete, compilable example and show what you are trying to do and why it doesn't work? From your question I cannot guess what you intend to do, or whether something is actually failing in your case. (You refer to other questions that probably have answers, so why not use those answers? What is wrong with them?)
Please post a valid example of what is worrying you, and explain precisely why those other posts don't solve your problem (if you have one at all).
Note that one of the linked questions doesn't show the actual contents of the repeated include or the actual variable definition, and the other was closed 8 years ago and, for some reason, has not been reopened. So you are running the same risk (and I run the risk of being downvoted for an answer that doesn't actually answer your question, because I can't tell from it whether you have an actual problem or not).

Are multiple identical prototypes legal?

The following code does not emit any warnings when compiled with both gcc and clang on Linux x64:
#include <stdio.h>
#include <stdlib.h>
void foo(void);
void foo(void);
void foo(void);
int main(void)
{
return 0;
}
IMO, it's legal according to the following snippets from C99:
All declarations that refer to the same object or function shall have
compatible type; otherwise, the behavior is undefined.
(...)
For two function
types to be compatible, both shall specify compatible return types
(...)
Moreover, the parameter type lists, if both are present, shall agree in the
number of parameters and in use of the ellipsis terminator; corresponding
parameters shall have compatible types.
(...)
Two types have compatible type if their types are the same.
Am I right? I want to make sure it is not UB and that my understanding is correct.
Multiple identical prototypes are legal, and in fact common, because it is typical in modern C for a function definition to comprise a prototype for that function, and for there also to be a prototype for the function in scope from inclusion of a header file. That is, given
foo.h:
void foo(int x);
foo.c:
#include <stdio.h>
#include "foo.h"
void foo(int x) {
    printf("%d\n", x);
}
/* ... */
there are two identical prototypes for foo() in scope in the body of function foo's definition and throughout the rest of the file. This is fine.
It is also ok to have multiple declarations of the same object or function that are not identical, as long as they are compatible. For example, the declaration
void foo();
declares foo as a function taking unspecified parameters and returning nothing (that is its meaning up to C17; in C23 an empty parameter list means the same as (void)). Under the pre-C23 rules this declaration is compatible with the ones already present in foo.c and foo.h, and it could be added to either one or both of those files with zero additional effect.
And this all applies to objects (variables), too, where some applications are quite common. For example, if you want to declare a global variable that is accessed from multiple files, then it is common to put a declaration of that variable in a header file. The C source file containing the definition of that variable -- which is also a declaration -- typically #includes the header, yielding two declarations:
global.h:
extern int global;
global.c:
#include "global.h"
int global = 42;
Or there is the case of forward declaration of compound data types:
struct one;
struct two {
struct one *my_one;
struct two *next;
};
struct one {
struct two *my_two;
};
Note the multiple compatible, but not identical, declarations of struct one. This particular set of data structures cannot be declared at all without multiple declaration of one of the types.

C --> headers & variables

Can header files in C include variables?
I am a beginner in programming; started with C, and I know the importance of precision especially in the first steps of the learning process
Including files is done by the preprocessor before even attempting to compile the code and it simply does text replacement – it puts the contents of the included file in the current unit that is going to be passed to the compiler. The compiler then sees the concatenated output and no #include directives at all.
With that said, technically you can include anything that is valid C code.
The good practice, however, is that only type definitions, #defines, function declarations (not definitions) and data declarations (not definitions) should be in a header. A function declaration is also called a prototype and merely specifies the function signature (its return type, name and parameters). Data declarations look very similar to data definitions, but have an extern storage class specifier and cannot be initialised:
extern int a; // declares "a" but does not define it
extern int a = 0; // defines "a" (initialisation requested), the extern is redundant
int a; // a tentative definition (no initialisation but "a" is zeroed)
Why is defining functions and data in a header file frowned upon? Because at link time, different units that have included the same header files will have the same symbols defined and the linker will see duplicate definitions of some symbols.
Also consider that a header is a kind of a "public" interface for the rest of the project (world?) and not every function that is defined in the source file needs to have a declaration there. It is perfectly fine to have internal types and static functions and data in the source file that never get exposed to the outside world.
Basically, you can declare variables in header files. The point to note is that only declarations belong there, not definitions.
Let me make that clear:
int a=10; // definition
extern int a; // declaration - it can be used in another file if you include this header file.
You can also define macros and declare functions in a header file.
Yes, header files may include variable declarations, but you generally don't want to do that because it will introduce maintenance headaches over time, especially as your code gets larger and more complex. Ideally, functions should share information through parameters and return values, not by using such "global" data items.
There are times when you can't avoid it; I haven't done any embedded programming, but my understanding is that using globals is fairly common in that domain due to space and performance constraints.
Ideally, headers should be limited to the following:
Macro definitions
Type definitions
Function declarations
But suppose you do create a header file with a variable declaration, like so:
/**
* foo.h
*/
int foo;
and you have several source files that all include that header [1]:
/**
* bar.c
*/
#include <stdio.h>
#include "foo.h"
void bar( void )
{
printf( "foo = %d\n", foo );
}
/**
* blurga.c
*/
#include "foo.h"
void blurga( void )
{
foo = 10;
}
/**
* main.c
*/
#include "foo.h"
void bar( void );    /* defined in bar.c */
void blurga( void ); /* defined in blurga.c */
int main( void )
{
foo = 5;
blurga();
bar();
return 0;
}
Each file will contain a declaration for foo at file scope (outside of any function). Now you compile each file separately
gcc -c bar.c
gcc -c blurga.c
gcc -c main.c
giving you three object files - bar.o, blurga.o, and main.o. Each of these object files will have their own unique copy of the foo variable. However, when we build them into a single executable with
gcc -o foo main.o bar.o blurga.o
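the linker is smart enough to realize that those separate declarations of foo are meant to refer to the same object (the identifier foo has external linkage across those translation units). So the foo that main sets to 5 is the same foo that blurga sets to 10, which is the same foo that bar prints out. Strictly speaking, this merging relies on a common extension (treating tentative definitions as "common" symbols) rather than on the C standard; GCC 10 and later default to -fno-common, in which case this link step fails with a multiple-definition error.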
However, if you change the declaration of foo to
static int foo;
in foo.h and rebuild your files, then those separate declarations will not refer to the same object; they will remain three separate and distinct objects, such that the foo that main initializes is not the same foo that blurga sets to 10, which is not the same foo that bar prints out (foo has internal linkage within each translation unit).
If you must use a global variable between several translation units, my preferred style is to declare the variable in the header file as extern [2]
/**
* foo.h
*/
extern int foo;
and then define it in a corresponding .c file
/**
* foo.c
*/
int foo;
so only a single object file creates an instance of foo and it's crystal clear that you intend for other translation units to make use of it. The declaration in the header file isn't necessary for the variable to be shared (the foo identifier has external linkage by simple virtue of being declared in foo.c outside of any function and without the static keyword), but without it nobody else can be sure if you meant for it to be visible or if you just got sloppy.
Edit
Note that headers don't have to be included at the top of a file; you can be perverse and put an #include directive within a function body
void bar( void )
{
#include "foo.h"
// do stuff with foo
}
such that int foo; will be local to the function, although that will likely earn you a beating from your fellow programmers. I got to maintain code where somebody did that, and after 25 years it still gives me nightmares.
1. Please don't write code like this; it's only to illustrate the concept of linkage.
2. The extern keyword tells the compiler that the object the identifier refers to is defined somewhere else.

calling a function from a .h file [duplicate]

file1.c => includes file1.h
file1.h => has a struct:
typedef struct {
unsigned char *start;
unsigned int startInt;
}debugPrint;
file1.c => creates a struct object:
debugPrint dp;
file1.c => an int is given into struct:
dp.startInt = 10;
file1.c => has a function:
void function1(debugPrint dp) {
printf("%d", dp.startInt);
}
file2.h => has a function call to file1.c function which is declared before the call:
void function1(void);
function1();
Questions:
Is it OK that file2.h calls a function from file1.c?
How can I pass the dp.startInt value to file2.h so that the value 10 that was set in dp.startInt in file1.c can be used in the function call in file2.h?
The function needs to be called from file2.h because this file handles the dynamic variable exchange between an HTML page and file2.h: data from the file2.h function call via file1.c is sent to the HTML page. I won't go further into how the variable is passed to the HTML page, since I don't know how that is done. It is a mechanism from the openPicus web server example.
If you know a good solution for this, I would appreciate it. I'm not very familiar with this kind of code, which is also part of the problem here :)
Since I think this description is not good enough, here are the files:
file1.c:
#include <stdio.h>
#include <conio.h>
#include <stdlib.h>
#include "test1.h"
// Define printStruct
void printStruct (debugPrint dp) {
printf("%u", dp.startInt);
}
int main ()
{
dp.startInt = 10;
getch();
}
file1.h:
typedef struct {
// For the motorEncoder value
unsigned char filename[20];
char ownMsg[10];
unsigned char *start;
unsigned char *timeInSeconds;
unsigned char *distanceInCm;
unsigned char *numberOfShots;
unsigned char *shutterCompensation;
unsigned char *direction;
unsigned char *cancel;
unsigned char *dutyCycle;
unsigned int cancelInt;
unsigned int startInt;
unsigned int dutyCycleInt;
unsigned int directionInt;
}debugPrint;
// Create struct object called dp
debugPrint dp;
// declare printStruct
void printStruct (debugPrint dp);
file2.h: (this file is needed to pass the dynamic values) I didn't put in any includes since I'm not sure how I should include the .h files or where I should include them from.
// Call printStruct function
printStruct(dp);
part of file2.h actual code: (AND YES file2.h A HEADER FILE). FOR ME THIS SEEMS LIKE THE FUNCTIONS ARE FIRST DECLARED AND THEN ONE OF THEM IS USED IN THE H FILE => the HTTPPrint() function from where a function called HTTPPrint_stepCounter(); is called. That function is defined then in file1.c and it just prints some dynamic data to a http page. And as said this is how openPicus has done it and i am just trying to modify it.
void HTTPPrint(DWORD callbackID);
void HTTPPrint_led(WORD);
void HTTPPrint_stepCounter(void);
void HTTPPrint(DWORD callbackID)
{
switch(callbackID)
{
case 0x00000017:
HTTPPrint_led(0);
break;
case 0x00000059:
HTTPPrint_stepCounter();
break;
default:
// Output notification for undefined values
TCPPutROMArray(sktHTTP, (ROM BYTE*)"!DEF", 4);
}
return;
}
void HTTPPrint_(void)
{
TCPPut(sktHTTP, '~');
return;
}
Some tips for someone new to the C language:
There's an important difference between definition and declaration.
Definition is what actually creates the function or variable. Each function must be defined exactly once. Either in a *.c source file, or in a library.
Declaration creates an entry in the symbol table, that says the function or variable exists... somewhere... and here's its data type. Declarations can be duplicated without any effect.
We put function definitions in *.c source files. (And also in libraries, but that's an advanced build topic...)
We put public or extern function declarations in *.h header files.
We put shared extern variable declarations in *.h header files, so that other source units can share the same variable.
We put shared typedef structure declarations in *.h header files, so that other source units can share the same data type.
We do not put variable declarations in *.h header files if they aren't extern, or if they are initialized. The initial value belongs in the *.c file.
Function definitions usually don't belong in a *.h header file, because it's possible in a large project that the header file could be included (read by the compiler) more than once. That would cause a compiler error (or a linker error, if the inclusions are in different source files), because then there would be more than one definition of that function. Even if it's literally a repeat of the same source code, there can be only one.
The quote about file2.h having a function call to file1.c function is not correct, function1(); could be either a declaration or a function call depending on context:
// declaration of a function named foo
void foo(void);
//
// declaration of a function named bar (accepted only in C89, where the
// missing return type defaults to int): equivalent to declaring int bar();
bar();
//
// definition of a function named foo
void foo(void)
{
// call (or invoke) the function named bar
bar();
}
Another small point, about arrays: it's pretty strange to declare an array of one element debugPrint dp[1], since that declaration creates an object that will be referred to as dp[0]. This makes me think you may be trying to avoid the use of pointers... it would be more straightforward to just declare debugPrint dp and then the object is referred to as dp. Arrays make sense if you have more than one related object of the same type, but for just one object, it's a pretty unusual usage.
C is a very flexible programming language that gives free access to lots of low-level tricks. Both a blessing and a curse... For someone just getting started with the language, it's important to read other people's code examples as much as you can, to help learn how things are usually done. There are lots of extremely clever ways to use the language (e.g. Duff's Device) but in most cases, you're better off sticking with the most straightforward and customary way of solving the problem.
See also: What is the difference between a definition and a declaration?
You claim that file2.h contains:
void function(void);
function1();
But these lines refer to two different functions.
This problem is now fixed; both names are function1
If function1(); appears outside any function, it is a (very sloppy) function declaration, not a function call. If it is inside some function, what is that function definition doing inside the header file? It would need to be an inline function to have much legitimacy.
The problem below is now fixed; the types are consistent.
Additionally, you say: an integer is given into struct: dp[1].startInt = 10;. The compiler complains that you shouldn't assign integers to pointers (since startInt is declared as a pointer, not an int). You need to get your code to compile without such complaints.
... There are two versions of the structure defined, one at the top of the question where startInt is an unsigned int *startInt; and one later on where the declaration is unsigned int startInt. Please make your question self-consistent! ...
This problem has been fixed now; dp is a simple structure.
Also note that you created debugPrint dp[1]; so your initialization is trampling out of bounds; the maximum valid index for the array is 0.
If code in file2.c needs to access the internals of the structure type declared in file1.h, the header file1.h should be included in file2.c. You can declare your dp array in the header too. A header should include other headers only if the functions it defines expose types defined in the other headers. For example, if the structure defined in file1.h included a FILE *db_fp; element, then file1.h should #include <stdio.h> to ensure that the code in file1.h would compile regardless of what else the code using file1.h includes.

Why is use of an array defined in File1 working in File2 (only declared there), even without "extern"?

Here I have two files, externdemo1.c and externdemo2.c. In the first file, I have declared and initialized a character array arr at file scope. But I have declared it in the second file, externdemo2.c, without the extern keyword and made use of it there in the function display(). Here are my confusions arising from it. Please answer these three:
//File No.1--externdemo1.c
#include<stdio.h>
#include "externdemo2.c"
extern int display();
char arr[3]={'3','4','7'};
//extern char arr[3]={'3','4','7'};
//extern int main()
int main()
{
printf("%d",display());
}
//File No.2--externdemo2.c
char arr[3];
int display()
{
return sizeof(arr);
}
1) Why does the program compile fine even though I have declared arr without the extern keyword in externdemo2.c? I had read that the default linkage of functions is external, but I am not sure if that's so even for variables. I only know that global variables have extern storage class.
2) What is the rigorous difference between extern storage class and extern linkage? I badly need a clarification about this. In the first file, where I have defined the array arr, I haven't used the keyword extern, but I know that it has extern storage class by default. But in the second file, isn't there any default extern, storage class or linkage, for the global variable arr, i.e., in externdemo2.c?
3) Check the commented-out line in the first file, externdemo1.c. Just to test it, I had used the line extern char arr[3]={'3','4','7'};. But it gives the error 'arr' initialized and declared 'extern'. What does this error mean? I have also included a commented-out line extern int main(), but it works fine without error or warning. So why can we use extern for a function even though a function is extern by default, but not for a variable, like arr here?
Please take some time to bail me out over this. It will clear most of my lingering doubts about the whole extern thing. It will be an immense help if you can answer all 3 bits 1), 2) and 3). Especially 3) is eating my brains out.
Main questions
Basically, because you've included the source of externdemo2.c in the file externdemo1.c.
This is the big question. Because there is no initializer, the line char arr[3]; in externdemo2.c generates a tentative definition of the array arr. When the actual definition with initialization is encountered, the tentative definition is no longer tentative — but neither is it a duplicate definition.
Regarding extern storage class vs extern linkage... Linkage refers to whether a symbol can be seen from outside the source file in which it is defined. A symbol with extern linkage can be accessed by name by other source files in which it is appropriately declared. To the extent it is defined, extern storage class means 'stored outside of the scope of a function', so independent of any function. The variable defined with extern storage class might or might not have extern linkage.
Because it is not defined with the keyword static, the array arr has extern linkage; it is a global variable.
With the commented-out line uncommented, you have two definitions of one array, which is not allowed.
I observe that you must be compiling just externdemo1.c to create a program — the compiler is including the code from externdemo2.c because it is directly included. You can create an object file from externdemo2.c. However, you cannot create a program by linking the object files from both externdemo1.c and externdemo2.c because that would lead to multiple definitions of the function display().
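The tentative-definition and linkage points above can be seen in a single source file. The following is a minimal sketch, not the asker's program: arr matches the question, while file_local, arr_size() and read_file_local() are hypothetical names introduced only for illustration.

```c
#include <stddef.h>

char arr[3];                      /* tentative definition: no initializer */
char arr[3] = { '3', '4', '7' };  /* actual definition; not a duplicate */

/* Internal linkage: static hides this name from other source files.
   (file_local is a hypothetical name used only for illustration.) */
static int file_local = 42;

/* Functions have external linkage by default, just like arr. */
size_t arr_size(void)
{
    return sizeof(arr);           /* the array type is complete: 3 bytes */
}

int read_file_local(void)
{
    return file_local;
}
```

Compiled as one translation unit, this should be accepted without a duplicate-definition error, because only one of the two arr lines carries an initializer. Adding a second initialized definition of arr would be rejected.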
Auxiliary questions
I have placed both files in the [same directory]. If I don't include the second file in the first, then when I compile the first file it gives the error undefined reference to display. Since I have used extern for that function in the first file, isn't the linker supposed to link to it even if I don't include the second file? Or the linker looks for it only in default folders?
There are a couple of confusions here. Let's try dealing with them one at a time.
Linking
The linker (usually launched by the compiler) will link the object files and libraries that are specified on its command line. If you want two object files, call them externdemo1.obj and externdemo2.obj, linked together, you must tell the linker (via the build system in the IDE) that it needs to process both object files — as well as any libraries that it doesn't pick up by default. (The Standard C library, plus the platform-specific extensions, are normally picked up automatically, unless you go out of your way to stop that happening.)
The linker is not obliged to spend any time looking for stray object files that might satisfy references; indeed, it is expected to link only those object files and libraries that it is told to link and not add others at its whim. There are some caveats about libraries (the linker might add some libraries not mentioned on the command line if one of the libraries it is told to link with has references built into it to other libraries), but the linker doesn't add extra object files to the mix.
C++ with template instantiation might be argued to be a bit different, but it is actually following much the same rules.
Source code
You should have a header, externdemo.h, that contains:
#ifndef EXTERNDEMO_H_INCLUDED
#define EXTERNDEMO_H_INCLUDED
extern int display(void);
extern char arr[3]; // Or extern char arr[]; -- but NOT extern char *arr;
#endif /* EXTERNDEMO_H_INCLUDED */
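The comment in the header warns against declaring the array as extern char *arr;. The reason, as far as I understand it, is that an array and a pointer are different kinds of object, so a pointer declaration would not match the array definition. The sketch below only illustrates that difference inside one file; ptr, size_of_array() and size_of_pointer() are hypothetical names, not part of the original program.

```c
#include <stddef.h>

char arr[3] = { '3', '4', '7' };  /* an array object holding 3 chars */
char *ptr = arr;                  /* a separate pointer object that points at arr */

size_t size_of_array(void)
{
    return sizeof(arr);           /* size of the whole array: 3 */
}

size_t size_of_pointer(void)
{
    return sizeof(ptr);           /* size of a pointer; platform dependent */
}
```

On most platforms the two sizes differ, which is one visible sign that the declarations are not interchangeable.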
You should then modify the source files to include the header:
//File No.1--externdemo1.c
#include <stdio.h>
#include "externdemo.h"
char arr[3] = { '3', '4', '7' };
int main(void)
{
printf("%d\n", display());
return 0;
}
and:
//File No.2--externdemo2.c
#include "externdemo.h"
int display(void)
{
return sizeof(arr);
}
The only tricky issue here is 'does externdemo2.c really know the size of arr?' The answer is 'Yes' (at least using GCC 4.7.1 on Mac OS X 10.8.3). However, if the extern declaration in the header did not include the size (extern char arr[];), you would get compilation errors such as:
externdemo2.c: In function ‘display’:
externdemo2.c:7:18: error: invalid application of ‘sizeof’ to incomplete type ‘char[]’
externdemo2.c:8:1: warning: control reaches end of non-void function [-Wreturn-type]
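If the header really did use the unsized form extern char arr[];, one possible workaround is to export the element count next to the array. This is a sketch under that assumption: arr_len and display_len() are hypothetical names that do not appear in the original code, and everything is shown in one file only to keep the example self-contained.

```c
#include <stddef.h>

/* What the header might declare (arr_len is a hypothetical addition). */
extern char arr[];            /* incomplete type: sizeof(arr) is an error here */
extern const size_t arr_len;  /* element count exported alongside the array */

/* A user that sees only the declarations above. */
size_t display_len(void)
{
    return arr_len;           /* sizeof(arr) cannot be used at this point */
}

/* The defining file knows the complete type, so sizeof works there. */
char arr[] = { '3', '4', '7' };
const size_t arr_len = sizeof(arr) / sizeof(arr[0]);
```

Keeping the size in the extern declaration, as the header above does, is simpler when the size is fixed; the companion variable is mainly useful when the size is decided only by the initializer.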
Your program looks a bit off. To me, the #include "externdemo2.c" line appears invalid.
Following is the correction I have made and it works.
//File No.1--externdemo1.c
#include <stdio.h>
extern char arr[3];
extern int display();
int main()
{
printf("%d", arr[0]);
printf("%d",display());
}
//File No.2--externdemo2.c
char arr[3]={'3','4','7'};
int display()
{
return sizeof(arr);
}
Please see the links below for a better understanding:
Effects of the extern keyword on C functions
How do I use extern to share variables between source files?
Using #include as shown makes the compiler treat both files as a single file. You can check the preprocessed output with the -E flag, as in:
gcc -E externdemo1.c

Resources