Is extern keyword in C redundant? - c

I have found that I could achieve the desired results without using extern (though I do agree that it gives reader some kind of an hint about the variable). In some cases using extern gave undesired results.
xyz.h
int i;
file1.c
#include "xyz.h"
....
i=10;
....
file2.c
#include "xyz.h"
main()
{
printf("i=%d\n",i);
}
Of course, it was a large project, broke it down for simple understanding. With extern keyword, I couldnt get desired results. In fact, I got linker error for the variable i with "extern" approach.
Code with "extern" approach,
file1.c
int i;
main()
{
i=10;
}
file2.c
extern int i;
foo()
{
printf("i=%d\n",i);
}
This gave linker error. I just wanted to know why it worked in the first case and also the practical case where we cannot do it without using the keyword "extern". Thanks.

Formally, your first program is invalid. Defining a variable in header file and then including this header file into multiple translation units will ultimately result in multiple definitions of the same entity with external linkage. This is a constraint violation in C.
6.9 External definitions
5 An external definition is an external declaration that is also a
definition of a function (other than an inline definition) or an
object. If an identifier declared with external linkage is used in an
expression (other than as part of the operand of a sizeof operator
whose result is an integer constant), somewhere in the entire program
there shall be exactly one external definition for the identifier;
otherwise, there shall be no more than one.
The definition of i in your first example is a tentative definition (as it has been mentioned in the comments), but it turns into a regular full fledged external definition of i at the end of each translation unit that includes the header file. So, the "tentativeness" of that definition does not change anything from the "whole program" point of view. It is not really germane to the matter at hand (aside for a little remark below).
What makes your first example to compile without error is a popular compiler extension, which is even mentioned as such in the language standard.
J.5 Common extensions
J.5.11 Multiple external definitions
1 There may be more than one
external definition for the identifier of an object, with or without
the explicit use of the keyword extern; if the definitions disagree,
or more than one is initialized, the behavior is undefined (6.9.2).
(It is quite possible that what originally led to that compiler extension in C is some implementational peculiarities of tentative definition support, but at abstract language level tentative definitions have nothing to do with this.)
Your second program is valid with regard to i (BTW, implicit int is no longer supported in C). I don't see how you could get any linker errors from it.

There are at least 2 cases where extern is meaningful and not "redundant":
For objects (not functions) at file scope, it declares the object with external linkage without providing a tentative definition; tentative definitions turn into full definitions at the end of a translation unit, and having the same identifier defined with external linkage in multiple translation units is not permitted.
At block scope (in a function), extern allows you to declare and access an object or function with external linkage without bringing the identifier into file scope (so it's not visible outside the block with the declaration). This is useful if the name could conflict with something else at file scope. For example:
static int a;
int foo(void)
{
return a;
}
int bar(void)
{
extern int a;
return a;
}
Without the extern keyword in bar, int a; would yield a local variable (automatic storage).

Related

Can a variable be both static and extern [duplicate]

Why the following doesn't compile?
...
extern int i;
static int i;
...
but if you reverse the order, it compiles fine.
...
static int i;
extern int i;
...
What is going on here?
This is specifically given as an example in the C++ standard when it's discussing the intricacies of declaring external or internal linkage. It's in section 7.1.1.7, which has this exert:
static int b ; // b has internal linkage
extern int b ; // b still has internal linkage
extern int d ; // d has external linkage
static int d ; // error: inconsistent linkage
Section 3.5.6 discusses how extern should behave in this case.
What's happening is this: static int i (in this case) is a definition, where the static indicates that i has internal linkage. When extern occurs after the static the compiler sees that the symbol already exists and accepts that it already has internal linkage and carries on. Which is why your second example compiles.
The extern on the other hand is a declaration, it implicitly states that the symbol has external linkage but doesn't actually create anything. Since there's no i in your first example the compiler registers i as having external linkage but when it gets to your static it finds the incompatible statement that it has internal linkage and gives an error.
In other words it's because declarations are 'softer' than definitions. For example, you could declare the same thing multiple times without error, but you can only define it once.
Whether this is the same in C, I do not know (but netcoder's answer below informs us that the C standard contains the same requirement).
For C, quoting the standard, in C11 6.2.2: Linkage of identifiers:
3) If the declaration of a file scope identifier for an object or a function contains the storage-class specifier static, the identifier has internal linkage.
4) For an identifier declared with the storage-class specifier extern in a scope in which a prior declaration of that identifier is visible, if the prior declaration specifies internal or external linkage, the linkage of the identifier at the later declaration is the same as the
linkage specified at the prior declaration. If no prior declaration is visible, or if the prior declaration specifies no linkage, then the identifier has external linkage.
(emphasis-mine)
That explains the second example (i will have internal linkage). As for the first one, I'm pretty sure it's undefined behavior:
7) If, within a translation unit, the same identifier appears with both internal and external linkage, the behavior is undefined.
...because extern appears before the identifier is declared with internal linkage, 6.2.2/4 does not apply. As such, i has both internal and external linkage, so it's UB.
If the compiler issues a diagnostic, well lucky you I guess. It could compile both without errors and still be compliant to the standard.
C++:
7.1.1 Storage class specifiers [dcl.stc]
7) A name declared in a namespace scope without a storage-class-specifier has external linkage unless it has
internal linkage because of a previous declaration and provided it is not declared const. Objects declared
const and not explicitly declared extern have internal linkage.
So, the first one attempts to first gives i external linkage, and internal afterwards.
The second one gives it internal linkage first, and the second line doesn't attempt to give it external linkage because it was previously declared as internal.
8) The linkages implied by successive declarations for a given entity shall agree. That is, within a given scope,
each declaration declaring the same variable name or the same overloading of a function name shall imply
the same linkage. Each function in a given set of overloaded functions can have a different linkage, however.
[ Example:
[...]
static int b; // b has internal linkage
extern int b; // b still has internal linkage
[...]
extern int d; // d has external linkage
static int d; // error: inconsistent linkage
[...]
In Microsoft Visual Studio, both versions compile just fine.
On Gnu C++ you get an error.
I'm not sure which compiler is "correct". Either way, having both lines doesn't make much sense.
extern int i means that the integer i is defined in some other module (object file or library). This is a declaration. The compiler will not allocate storage the i in this object, but it will recognize the variable when you are using it somewhere else in the program.
int i tells the compiler to allocate storage for i. This is a definition. If other C++ (or C) files have int i, the linker will complain, that int i is defined twice.
static int i is similar to the above, with the extra functionality that i is local. It cannot be accessed from other module, even if they declare extern int i. People are using the keyword static (in this context) to keep i localize.
Hence having i both declared as being defined somewhere else, AND defined as static within the module seems like an error. Visual Studio is silent about it, and g++ is silent only in a specific order, but either way you just shouldn't have both lines in the same source code.

Shall an variable with external linkage be defined with keyword external?

From https://www.quora.com/What-are-the-types-of-linkages-in-C-programming
External linkage, means that the variable could be defined somewhere
else outside the file you are working on, which means you can define
it inside any other translation unit rather your current one (you
will have to use the keyword extern when defining it in the other
source code).
Internal linkage, means that the variable must be defined in your
translation unit scope, which means it should either be defined in any
of the included libraries, or in the same file scope.
None linkage, points to the default functions and braces scopes, such
as defining an auto variable inside a function, this will make the
variable only accessable within that function's scope.
Note that:
Any global object is externally linked by default, you can disable that by using the keyword static.
Any constant global object is internally linked by default, you can
disable that by using the keyword extern.
Suppose a global variable is defined in file1 and I want to use it in file2.
In file1, shall the global variable be defined with keyword
extern?
The quote above seem to contradict itself:
"you will have to use the keyword extern when defining it in the other source code" seems to say it should.
"Any global object is externally linked by default" seems to say that it doesn't need to.
In file2, shall I declare global variable with keyword extern?
tldr: You should declare the variable, with extern, in a header file which is included by both file1 and file2. You should then define the variable in file1 only. To define the variable, you declare it without extern (and also without static).
You sound like you have gotten the extern keyword and the standardese term "external linkage" mixed up.
A variable with "external linkage" is accessible from any function in the program, as long as a declaration of the variable is visible to that function. In contrast, a variable with "internal linkage" is accessible from, at most, one "translation unit" (a single source file and all the files it includes), and a variable with "no linkage" is only visible within a single function. (I don't remember off the top of my head whether a static variable declared within a function is considered to have internal linkage or no linkage.)
The extern keyword, applied to a variable declaration, has two effects. First, it guarantees that that variable will be given external linkage, even if the declaration is inside a function (don't do that). Second, and much more importantly, it makes the declaration not be a definition. What that means is, if you have these two files
/* file1.c */
extern int foo;
/* file2.c */
extern int foo;
int main(void) { return foo; }
their combination is not a valid program. You will get an error from the linker, probably reading something like "undefined reference to foo".
To make this a valid program, you must remove the extern from one of the two declarations of foo. That declaration then becomes a definition, and the program will link. (Also, that declaration can then be given an initializer, if you want it to have a value other than 0 at startup.)
If you remove the extern from both of the definitions of foo, the result is, IIRC, implementation-defined. Some C compilers will merge them into a single global variable, and others will issue a link-time error. Sometimes the behavior depends on whether or not the variables have initializers.
It is OK to write
/* file1.c */
extern int foo;
/* file2.c */
extern int foo;
int foo;
This allows you to put the extern declaration in a header file that both .c files include, which reduces the risk of the declarations coming to have inconsistent types (if this happens, the program will exhibit undefined behavior).
c11 draft, 6.2.2, 4th section, states:
For an identifier declared with the storage-class specifier extern in a scope in which a
prior declaration of that identifier is visible,31) if the prior declaration specifies internal or
external linkage, the linkage of the identifier at the later declaration is the same as the
linkage specified at the prior declaration. If no prior declaration is visible, or if the prior
declaration specifies no linkage, then the identifier has external linkage.
i don't think there will be any difference with or without extern keyword. some experiment also indicates that:
following 2 compiled:
extern int a;
extern int a;
int a = 20;
and
extern int a;
int a;
int a = 20;
but not this one:
extern int a;
int a = 10;
int a = 20;
however, i have some impression that once i read about the different behavior when dynamic linkage get involved in windows. I cannot find it though. it would be nice if anyone can confirm.
The text you quoted has a lot of errors. I would recommend not relying on information from that site.
(you will have to use the keyword extern when defining it in the other source code).
Not true, as file-scope definitions have external linkage unless the static keyword is used (or no specifier is used and they are re-declaring something already declared with the static keyword).
It is possible to include a redundant extern in the definition, however there must also be an initializer (otherwise it would be a declaration and not a definition).
Internal linkage, means that the variable must be defined in your translation unit scope, which means it should either be defined in any of the included libraries, or in the same file scope.
Presumably this means "included headers", not "included libraries".
Any constant global object is internally linked by default, you can disable that by using the keyword extern.
This is wrong, "global objects" have external linkage unless declared with static. The author might be mixing up C with C++ (in the latter, const global objects have internal linkage unless otherwise specified).
Suppose a global variable is defined in file1 and I want to use it in file2.
// file1.h (or any other header)
extern object_t obj;
// file1.c
#include "file1.h"
object_t obj; // or: extern object_t obj = { 1, 2, 3 };
// file2.c
#include "file1.h"

if global variable have default external linkage then why can't we access it directly in another file?

I have gone through following questions:
Global variable in C are static or not?
Are the global variables extern by default or it is equivalent to declaring variable with extern in global?
Above links describe that if we define global variable in one file and haven't specified extern keyword they will be accessible in another source file because of translation unit.
Now I have file1.c in that have defined following global variable and function:
int testVariable;
void testFunction()
{
printf ("Value of testVariable %d \n", testVariable);
}
In file2.c have following code
void main()
{
testVariable = 40;
testFunction();
}
Now I am getting error: 'testVariable' undeclared (first use in this function) -- why?
Note: both files are used in same program using makefile.
As per my understanding both function and global variable have default external linkage. So function we can use directly by it's name in another file but variable can't why?
Can any one have idea?
EDIT:
From the below answer i get idea that like in case of function old compiler will guess and add an implicit declaration but in case of variable it can't. Also C99 removed implicit declaration but still I am getting warning in C99 mode like:
warning: implicit declaration of function ‘testFunction’.
Now have gone through below link:
implicit int and implicit declaration of functions with gcc compiler
It said that compiler take it as diagnostic purpose and not give error. So compiler can process forward.
But why in case of variable it can't process further? Even in case of function if compiler proceed and if actual definition not there then at linking time it will fail. So what's benefit to move forward??
There are two things in play here: The first is that there is a difference between a definition and a declaration. The other thing is the concept of translation units.
A definition is what defines the variable, it's the actual place the variable exists, where the compiler reserves space for the variable.
A declaration is needed for the compiler to know that a symbol exists somewhere in the program. Without a declaration the compiler will not know that a symbol exists.
A translation unit is basically and very simplified the source file plus all its included header files. An object file is a single translation unit, and the linker takes all translation units to create the final program.
Now, a program can only have a single definition, for example a global variable may only exist in a single translation unit, or you will get multiple definition errors when linking. A declaration on the other hand can exist in any number of translation units, the compiler will use it to tell the linker that the translation references a definition in another (unknown at time of compilation) translation unit.
So what happens here is that you have a definition and a declaration in file1.c. This source file is used as input for one translation unit and the compiler generates a single object file for it, say file1.o. In the other source file, file2.c, there is no definition nor any declaration of the global variable testVariable, so the compiler doesn't know it exists and will give you an error for it. You need to declare it, for example by doing
extern int testVariable; // This is a declaration of the variable
It's a little more complicated for the function, because in older versions of the C standard one didn't have to declare functions being used, the compiler would guess and add an implicit declaration. If the definition and the implicit declaration doesn't match it will lead to undefined behavior, which is why implicit function declarations was removed in the C99 standard. So you should really declare the function too:
void testFunction(void); // Declare a function prototype
Note that the extern keyword is not needed here, because the compiler can automatically tell that it's a function prototype declaration.
The complete file2.c should then look like
extern int testVariable; // This is a declaration of the variable
void testFunction(void); // Declare a function prototype
void main()
{
testVariable = 40;
testFunction();
}
When compiler copes with file2.c it knows nothing about existence of testVariable and about its type. And as result it can't generate code to interact with such object. And purpose of line like
extern int testVariable;
is to let compiler to know that somewhere such object exists and has type of int.
With functions we have no such problem because of next rule - if function is not defined - compiler assumes that it is defined somewhere like
int testFunction() { ... }
So you can pass any number of any arguments to it and try to obtain int return value. But if real function signature differs - you'll get an undefined behafiour at runtime. Because of this weakness such approach is considered as bad practice, and you should declare proper function prototype before any call to that func.

Declaration and definition in C programming with extern

This may seem simple to one's eye but, this question is itching me in many ways.
my question is about declaration and defenition on variables in c.
there are actually many explanation in internet regarding this one and there is not just one solution to this issue as many view points are placed in this issue. i want to know the clear existance of this issue.
int a;
just take this is this a declaration or definition?, this one when i use printf, it has 0 as value and address as 2335860. but if this declaration then how come memory is allocated for this.
int a;
int a;
when i do this it says previous declaration of 'a' was here and redeclaration of 'a' with no linkage.
some sources say redeclaration is permitted in c and some say dont what is the truth?
int a; just take this is this a declaration or definition?
int a; if written in global scope is a tentative definition. Which means if no other definitions are available in current compilation unit, treat this as definition or else this is a declaration.
From 6.9.2 External object definitions in C11 specs:
A declaration of an identifier for an object that has file scope
without an initializer, and without a storage-class specifier or with
the storage-class specifier static, constitutes a tentative
definition. If a translation unit contains one or more tentative
definitions for an identifier, and the translation unit contains no
external definition for that identifier, then the behavior is exactly
as if the translation unit contains a file scope declaration of that
identifier, with the composite type as of the end of the translation
unit, with an initializer equal to 0.
int i4; // tentative definition, external linkage
static int i5; // tentative definition, internal linkage
So you are effectively doing multiple declarations but getting the address and value because of tentative definition rule.
some sources say redeclaration is permitted in c and some say dont
what is the truth?
Redeclaration is permitted in C. But redefinition is not.
Related question: What is the difference between a definition and a declaration?
there are actually many explanation in internet
Prefer a good book instead of internet to get the hold of language. You can choose a good book from:
The Definitive C Book Guide and List
int a is a definition and can be used in place of declaration. A variable can have many declarations but must have only one definition. In case of
int a;
int a;
there are two definition of a in the same scope. Providing linkage to one of them will make your compiler happy
int a;
extern int a;

Why do we need the 'extern' keyword in C if file scope declarations have external linkage by default?

AFAIK, any declaration of a variable or a function in file scope has external linkage by default. static mean "it has internal linkage", extern -- "it maybe defined elsewhere", not "it has external linkage".
If so, why we need extern keyword? In other words, what is difference between int foo; and extern int foo; (file scope)?
The extern keyword is used primarily for variable declarations. When you forward-declare a function, the keyword is optional.
The keyword lets the compiler distinguish a forward declaration of a global variable from a definition of a variable:
extern double xyz; // Declares xyz without defining it
If you keep this declaration by itself and then use xyz in your code, you would trigger an "undefined symbol" error during the linking phase.
double xyz; // Declares and defines xyz
If you keep this declaration in a header file and use it from several C/C++ files, you would trigger a "multiple definitions" error during the linking phase.
The solution is to use extern in the header, and not use extern in exactly one C or C++ file.
As an illustration, compile the following program: (using cc -c program.c , or the equivalent)
extern char bogus[0x12345678] ;
Now remove the "extern" keyword, and compile again:
char bogus[0x12345678] ="1";
Run objdump (or the equivalent) on the two objects.
You will find that without the extern keyword space is actually allocated.
With the extern keyword the whole "bogus" thing is only a reference. You are saying to the compiler: "there must be a char bogus[xxx] somewhere, fix it up!"
Without the extern keyword you say: "I need space for a variable char bogus[xxx], give me that space!"
The confusing thing is that the actual allocation of memory for an object is postponed until link time: the compiler just adds a record to the object, informing the linker that an object should (or should not) be allocated. In all cases, the compiler at leasts will add the name (and size) of the object, so the linker/loader can fix it up.
C99 standard
I am going to repeat what others said but by quoting and interpreting the C99 N1256 draft.
First I confirm your assertion that external linkage is the default for file scope 6.2.2/5 "Linkages of identifiers":
If the declaration of an identifier for an object has file scope and no storage-class specifier, its linkage is external.
The confusion point is that extern does not only alter the linkage, but also weather an object declaration is a definition or not. This matters because 6.9/5 "External definitions" says there can only be one external definition:
An external definition is an external declaration that is also a definition of a function (other than an inline definition) or an object. If an identifier declared with external linkage is used in an expression (other than as part of the operand of a sizeof operator whose result is an integer constant), somewhere in the entire program there shall be exactly one external definition for the identifier; otherwise, there shall be no more than
one.
where "external definition" is defined by the grammar snippet:
translation-unit:
external-declaration
so it means a "file scope" top-level declaration.
Then 6.9.2/2 "External object definitions" says (object means "data of a variable"):
A declaration of an identifier for an object that has file scope without an initializer, and without a storage-class specifier or with the storage-class specifier static, constitutes a tentative definition. If a translation unit contains one or more tentative definitions for an identifier, and the translation unit contains no external definition for that identifier, then the behavior is exactly as if the translation unit contains a file scope declaration of that identifier, with the composite type as of the end of the translation unit, with an initializer equal to 0.
So:
extern int i;
is not a definition, because it does have a storage-class specifier: extern.
However:
int i;
does not have a storage-class specifier, so it is a tentative definition. And if there are no more external declarations for i, then we can add the initializer equal 0 = 0 implicitly:
int i = 0;
So if we had multiple int i; in different files, the linker should in theory blow up with multiple definitions.
GCC 4.8 does not comply however, and as an extension allows multiple int i; across different files as mentioned at: https://stackoverflow.com/a/3692486/895245 .
This is implemented in ELF with a common symbol, and this extension is so common that it is mentioned in the standard at J.5.11/5 Common extensions > Multiple external definitions:
There may be more than one external definition for the identifier of an object, with or without the explicit use of the keyword extern; if the definitions disagree, or more than one is initialized, the behavior is undefined (6.9.2).
Another place where extern has an effect is in block-scope declarations, see: Can local and register variables be declared extern?
If there is an initializer for the object declaration, extern has no effect:
extern int i = 0;
equals
int i = 0;
Both are definitions.
For functions, extern seems to have no effect: Effects of the extern keyword on C functions as there is no analogous concept of tentative definition.
You can only define a variable once.
If multiple files use the same variable then the variable must be redundantly declared in each file. If you do a simple "int foo;" you'll get a duplicate definition error. Use "extern" to avoid a duplicate definition error. Extern is like saying to the compiler "hey, this variable exists but don't create it. it's defined somewhere else".
The build process in C is not "smart". It won't search through all the files to see if a variable exists. You must explicitly say that the variable exists in the current file, but at the same time avoid creating it twice.
Even in the same file, the build process is not very smart. It goes top to bottom and it won't recognize a function name if it is defined below the point of use, so you must declare it higher up.

Resources