Linkage of extern block scope variable in C

C standard says:
For an identifier declared with the storage-class specifier extern in
a scope in which a prior declaration of that identifier is visible,
if the prior declaration specifies internal or external linkage, the
linkage of the identifier at the later declaration is the same as the
linkage specified at the prior declaration. If no prior declaration is
visible, or if the prior declaration specifies no linkage, then the
identifier has external linkage.
What is not clear is whether the prior declaration being considered must have the same type (note: the C++ standard explicitly says "entity with the same name and type"). For example:
static int a; // internal linkage
void f()
{
float a; // no linkage, instead of 'int a' we have 'float a'
{
extern int a; // still external linkage? or internal in this case?
a = 0; // still unresolved external?
}
}
I tried to test it with different compilers, but they do not seem to agree with each other on linkage.

C uses a flat namespace for all its globals. Unlike C++, where implementations may encode the type of global variables into their symbol names so that the linker can tell them apart (look up name mangling for more info on this), C puts this requirement onto programmers.
It is an error to re-declare a variable with a different type when changing the linkage inside the same translation unit.
I will use your example with a small addition:
static int a; // internal linkage
static int b; // internal linkage
void f()
{
float a = 123.25; // this variable shadows static int a
int b = 321; // this variable shadows static int b
{ // Open a new scope, so the line below is not an illegal re-declaration
// The declarations below "un-shadow" static a and b
extern int a; // redeclares "a" from the top, "a" remains internal
extern int b; // redeclares "b" from the top, "b" remains internal
a = 42; // not an unresolved external, it's the top "a"
b = 52; // not an unresolved external, it's the top "b"
printf("%d %d\n", a, b); // static int a, static int b
}
printf("%f %d\n", a, b); // local float a, int b
}
This example prints
42 52
123.250000 321
When you change the type across multiple translation units, a C++ implementation whose name mangling encodes the variable's type will catch it at link time, while C will link fine but produce undefined behavior.

I think I have an answer. I will first write about linkage in general.
C standard says:
In the set of translation units each declaration of a particular
identifier with external linkage denotes the same entity (object or function).
Within one translation unit, each declaration of an identifier with
internal linkage denotes the same entity.
C++ standard says:
When a name has external linkage, the entity it denotes can be
referred to by names from scopes of other translation units or from
other scopes of the same translation unit. When a name has internal
linkage, the entity it denotes can be referred to by names from other
scopes in the same translation unit.
This has two implications:
In the set of translation units we cannot have multiple distinct external entities with the same name (save for overloaded functions in C++), so the types of each declaration that denotes that single external entity should agree. We can check whether types agree within one translation unit; this is done at compile time. We cannot check whether types agree between different translation units, either at compile time or at link time.
Technically, in C++ we can violate the "in the set of translation units we cannot have multiple distinct external entities with the same name" rule without function overloading. Since C++ has name mangling that encodes type information, it is possible to have multiple external entities with the same name and different types. For example:
file-one.cpp:
int a; // C decorated name: _a
// C++ decorated name (VC++): ?a@@3HA
//------------------------------------------------
file-two.cpp:
float a; // C decorated name: _a
// C++ decorated name (VC++): ?a@@3MA
Whereas in C this will really be one external entity: code in the first unit will treat it as int and code in the second unit will treat it as float.
In one translation unit we cannot have multiple distinct internal entities with the same name (save for overloaded functions in C++), so the types of each declaration that denotes that single internal entity should agree. Whether the types agree within a translation unit is checked at compile time.
Now we will move closer to the question.
C++ standard says:
The name of a function declared in block scope and the name of a
variable declared by a block scope extern declaration have linkage. If
there is a visible declaration of an entity with linkage having the
same name and type, ignoring entities declared outside the innermost
enclosing namespace scope, the block scope declaration declares that
same entity and receives the linkage of the previous declaration. If
there is more than one such matching entity, the program is
ill-formed. Otherwise, if no matching entity is found, the block scope
entity receives external linkage.
// C++
int a; // external linkage
void f()
{
extern float a; // external linkage
}
Here we do not have a previous declaration of an entity with the same name (a) and type (float), so the linkage of extern float a is external. Since we already have int a with external linkage in this translation unit and the name is the same, the types should agree. In this case they don't, hence we have a compile-time error.
// C++
static int a; // internal linkage
void f()
{
extern float a; // external linkage
}
Here we do not have a previous declaration of an entity with the same name (a) and type (float), so the linkage of extern float a is external. It means that we have to define float a in another translation unit. Note that we have the same identifier with external and internal linkage within one translation unit (I don't know why C considers this undefined behavior, since we can have an internal and an external entity with the same name in different translation units).
// C++ (example from standard)
static int a; // internal linkage
void f()
{
int a; // no linkage
{
extern int a; // external linkage
}
}
Here the previous declaration int a has no linkage, so extern int a has external linkage. It means that we have to define int a in another translation unit.
C standard says:
For an identifier declared with the storage-class specifier extern in
a scope in which a prior declaration of that identifier is visible,
if the prior declaration specifies internal or external linkage, the
linkage of the identifier at the later declaration is the same as the
linkage specified at the prior declaration. If no prior declaration is
visible, or if the prior declaration specifies no linkage, then the
identifier has external linkage.
So we can see that in C only the name is considered (not the type).
// C
int a; // external linkage
void f()
{
extern float a; // external linkage
}
Here the previous declaration of identifier a has external linkage, so the linkage of extern float a is the same (external). Since we already have int a with external linkage in this translation unit and the name is the same, the types should agree. In this case they don't, hence we have a compile-time error.
// C
static int a; // internal linkage
void f()
{
extern float a; // internal linkage
}
Here the previous declaration of identifier a has internal linkage, so the linkage of extern float a is the same (internal). Since we already have static int a with internal linkage in this translation unit and the name is the same, the types should agree. In this case they don't, hence we have a compile-time error. In C++, by contrast, this code is fine (I think the type match requirement was added with function overloading in mind).
// C
static int a; // internal linkage
void f()
{
int a; // no linkage
{
extern int a; // external linkage
}
}
Here the previous declaration of identifier a has no linkage, so extern int a has external linkage. It means that we have to define int a in another translation unit. However, GCC rejects this code with the error "variable previously declared 'static' redeclared 'extern'", probably because we have undefined behavior according to the C standard.
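For contrast, the well-defined case can be sketched as follows. When no block-scope declaration hides the file-scope static, the prior visible declaration has internal linkage, so the block-scope extern inherits it and names the same object. The function name read_a_through_extern and the initial value 7 are made up for this illustration.

```c
/* File-scope object with internal linkage (the value 7 is arbitrary). */
static int a = 7;

/* Hypothetical helper: reads `a` through a block-scope extern declaration. */
int read_a_through_extern(void)
{
    /* The visible prior declaration of `a` has internal linkage, so this
       declaration inherits internal linkage and names the same object. */
    extern int a;
    return a;
}
```

Calling read_a_through_extern() from main should yield the value stored in the static object, with no definition needed in any other translation unit.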

Related

Why does an explicit "extern" not allocate storage for an object?

I've been diving deeper into the C standard, and I'm confused about the way it talks about linkage and tentative definitions.
First, in this part of the standard it is stated that
extern (keyword) means static duration and external linkage (unless already declared internal)
static storage duration. The storage duration is the entire execution of the program, and the value
stored in the object is initialized only once, prior to main function. All objects declared static and
all objects with either internal or external linkage that aren't declared _Thread_local (since C11)
have this storage duration.
external linkage. The identifier can be referred to from any other translation units in the entire
program. All non-static functions, all extern variables (unless earlier declared static), and all file-
scope non-static variables have this linkage.
So far we have that variables declared at file scope have static storage duration and external linkage by default. Also, objects with static storage duration are initialized to zero before the program starts.
But, after reading this part (tentative definitions) and this part (declarations) I can't find where it says that objects with an explicit "extern" keyword are not allocated storage.
Please be careful about the difference between the "extern" keyword itself and the term "external declarations".
"External declarations" are defined as
At the top level of a translation unit (that is, a source file with all the #includes after the preprocessor), every C program is a sequence of declarations, which declare functions and objects with external linkage. These declarations are known as external declarations because they appear outside of any function.
regardless of the presence or absence of an explicit "extern" keyword.
I suppose that my concrete question is where in the standard does it say that file scope objects, that have an implicit external linkage by default, are not allocated storage if they are declared with an explicit "extern".
I know this is the case because if one declares the same identifier in multiple translation units all but one must have "extern" so as not to get a redefinition error.
First, while cppreference.com has useful information it is not the C standard. The C11 standard can be found here.
This comes down to the difference between a declaration and a definition.
For an object, a declaration basically states that an object with a given type exists somewhere, while a definition is what actually allocates space for the object.
These terms are specified in section 6.7p5 of the C standard:
A declaration specifies the interpretation and attributes of a set of
identifiers. A definition of an identifier is a declaration for
that identifier that:
for an object, causes storage to be reserved for that object;
for a function, includes the function body;
for an enumeration constant, is the (only) declaration of the identifier;
for a typedef name, is the first (or only) declaration of the identifier.
When the extern keyword is applied and there is no initializer, the result is a declaration only, and a declaration does not allocate storage for an object. Section 6.9.2p1-2 spells this out:
1 If the declaration of an identifier for an object has file scope and an initializer, the declaration is an external
definition for the identifier.
2 A declaration of an identifier for an object that has file
scope without an initializer, and without a storage-class specifier
or with the storage-class specifier static, constitutes a tentative
definition. If a translation unit contains one or more tentative
definitions for an identifier, and the translation unit contains
no external definition for that identifier, then the behavior
is exactly as if the translation unit contains a file scope
declaration of that identifier, with the composite type as of the
end of the translation unit, with an initializer equal to 0.
A declaration with extern and no initializer does not fit the above definition of a tentative definition or an external definition.
Section 6.9.2p4 gives examples of declarations and definitions:
int i1 = 1; //definition, external linkage
static int i2 = 2; //definition, internal linkage
extern int i3 = 3; //definition, external linkage
int i4; //tentative definition, external linkage
static int i5; //tentative definition, internal linkage
int i1; //valid tentative definition, refers to previous
int i2; //6.2.2 renders undefined, linkage disagreement
int i3; //valid tentative definition, refers to previous
int i4; //valid tentative definition, refers to previous
int i5; //6.2.2 renders undefined, linkage disagreement
extern int i1; //refers to previous, whose linkage is external
extern int i2; //refers to previous, whose linkage is internal
extern int i3; //refers to previous, whose linkage is external
extern int i4; //refers to previous, whose linkage is external
extern int i5; //refers to previous, whose linkage is internal
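A minimal single-file sketch of the tentative-definition rule quoted above, assuming a hypothetical object named counter and a hypothetical accessor get_counter: with no initializer anywhere in the translation unit, the tentative definitions collapse into one definition initialized to 0, while the extern line reserves nothing by itself.

```c
int counter;          /* tentative definition */
int counter;          /* second tentative definition of the same object */
extern int counter;   /* declaration only; refers to the object above */

/* Hypothetical accessor used only to observe the value. */
int get_counter(void)
{
    /* No external definition exists in this file, so the object behaves
       as if it had been defined with an initializer equal to 0. */
    return counter;
}
```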
The C Standard (6.9.2 External object definitions) states that
1 If the declaration of an identifier for an object has file scope and
an initializer, the declaration is an external definition for the
identifier.
So if you write at file scope
extern int x = 1;
then this declaration with the storage-class specifier extern will be at the same time a definition of the object x.
Otherwise if an object is declared at file scope without an initializer but with the storage-class specifier extern then the compiler assumes that the object is defined in some other translation unit or in the same translation unit but somewhere else.
For example (here a variable with internal linkage is declared at file scope)
#include <stdio.h>
static int x = 10;
extern int x;
int main(void)
{
printf( "x = %d\n", x );
return 0;
}
If an object is declared at file scope without an initializer and without the storage-class specifier extern (that is, with no storage-class specifier or with static), then the declaration is a tentative definition.

What will be the value stored in a variable when it is initialised with extern keyword when it is already declared as a global variable?

When we declare a global variable, it is initialised to its default value. But when we initialise a variable using the extern keyword, why does the variable retain the value with which it is initialised using the extern keyword?
For instance, in the code below why is the output 9 and not a compile time error? Since there is no external linkage of variable x from any other source file, so x has two copies and we are initialising the variable twice so this should be an error. Please clarify this; I am a bit confused in the flow of this code.
#include <stdio.h>
extern int x=9;
int x;
int main()
{
printf("%d",x);
return 0;
}
extern int x = 9; means the same as int x = 9;. The extern keyword has no effect for a definition that already has external linkage and an initializer.
int x; is called a tentative definition.
This is described well by C11 6.9.2/2:
A declaration of an identifier for an object that has file scope without an initializer, and
without a storage-class specifier or with the storage-class specifier static, constitutes a
tentative definition. If a translation unit contains one or more tentative definitions for an
identifier, and the translation unit contains no external definition for that identifier, then
the behavior is exactly as if the translation unit contains a file scope declaration of that
identifier, with the composite type as of the end of the translation unit, with an initializer
equal to 0.
This translation unit does contain an external definition for x, so the tentative definition has no effect. It doesn't matter whether the external definition is before or after the tentative definition.
"external definition" means a non-tentative definition at file scope -- not to be confused with extern or "external linkage", although in your example x does happen to have external linkage.
So your code is exactly the same as:
int x = 9;
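The order-independence mentioned above can be sketched like this. The sketch writes the definition as plain int x = 9; (which the answer describes as equivalent to the extern form) and adds a hypothetical accessor get_x; the tentative definitions on either side do not reset the value.

```c
int x;        /* tentative definition before the real one */
int x = 9;    /* external definition: this supplies the initial value */
int x;        /* tentative definition after it: no effect */

/* Hypothetical accessor used only to observe the value. */
int get_x(void)
{
    return x;
}
```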

Why does this code not generate a redeclaration Error?

Here is an extern and a static variable with same name. The output prints the static variable a=10. Why is there no syntax error and how would I access extern a if needed?
#include<stdio.h>
extern int a;
static int a=10;
main()
{
printf("%d\n",a);
}
The C standard allows the opposite, extern after static:
6.2.2 Linkages of identifiers....
3 If the declaration of a file scope identifier for an object or a function contains the storage-class
specifier static, the identifier has internal linkage.
4 For an identifier declared with the storage-class specifier extern in a scope in which a
prior declaration of that identifier is visible, if the prior declaration specifies internal or
external linkage, the linkage of the identifier at the later declaration is the same as the
linkage specified at the prior declaration. If no prior declaration is visible, or if the prior
declaration specifies no linkage, then the identifier has external linkage.
At the same time it states:
7 If, within a translation unit, the same identifier appears with both internal and external
linkage, the behavior is undefined.
BTW, the C++ standard makes it explicit:
7.1.1 Storage class specifiers....
static int b; // b has internal linkage
extern int b; // b still has internal linkage
....
extern int d; // d has external linkage
static int d; // error: inconsistent linkage
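The same ordering rule applies to functions in C. As a sketch (the names twice and quadruple are hypothetical): a function declaration with no storage-class specifier has its linkage determined as if it were declared extern, so after a static declaration the later definition keeps internal linkage instead of conflicting with it.

```c
static int twice(int n);   /* internal linkage */

/* No storage-class specifier: linkage is determined as if `extern` were
   written, so it is inherited from the prior declaration (internal). */
int twice(int n)
{
    return 2 * n;
}

/* Hypothetical caller with external linkage. */
int quadruple(int n)
{
    return twice(twice(n));
}
```

Reversing the order (a plain or extern declaration first, static afterwards) is the case that the quoted paragraph 7 leaves undefined.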

External, internal and no linkage or why this does not work?

According to C standard:
In the set of translation units and libraries that constitutes an entire program, each
declaration of a particular identifier with external linkage denotes the same object or
function. Within one translation unit, each declaration of an identifier with internal
linkage denotes the same object or function. Each declaration of an identifier with no
linkage denotes a unique entity.
In my example we have three separate declarations, with each identifier having a different linkage. So why doesn't this work?
static int a; //a_Internal
int main(void) {
int a; //a_Local
{
extern int a; //a_External
}
return 0;
}
Error:
In function 'main':
Line 9: error: variable previously declared 'static' redeclared 'extern'
Why does compiler insist that I'm redeclaring instead of trying to access external object in another file?
Valid C++ example for reference:
static void f();
static int i = 0; // #1
void g() {
extern void f(); // internal linkage
int i; // #2 i has no linkage
{
extern void f(); // internal linkage
extern int i; // #3 external linkage
}
}
Both Clang and VC seem to be okay with my C example; only some versions of GCC (not all) produce the aforementioned error.
§6.2.2, 7 says:
If, within a translation unit, the same identifier appears with both
internal and external linkage, the behavior is undefined.
So, your program has undefined behaviour.
§6.2.2, 4 says that
extern int a; //a_External
has external linkage because the prior declaration visible in the scope int a; //a_Local has no linkage. But
static int a; //a_Internal
declares a with internal linkage. Hence, it's undefined per §6.2.2, 7.
The compiler is giving this error because inside the a_External scope, a_Internal is still accessible, thus you are redeclaring a_Internal from static to extern in a_External because of the name collision of a. This problem can be solved by using different variable names, for example:
static int a1; //a_Internal
int main(void) {
int a2; //a_Local
{
extern int a3; //a_External
}
return 0;
}
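If renaming is not an option, another way to reach the file-scope object while a local of the same name is in scope is to go through a helper function, which avoids the block-scope extern entirely. This is only a sketch; file_scope_a and demo are hypothetical names, and the initial values are arbitrary.

```c
static int a = 1;                 /* a_Internal */

/* Hypothetical helper: defined before any local `a` exists, so `a` here
   is the file-scope object. */
static int *file_scope_a(void)
{
    return &a;
}

int demo(void)
{
    int a = 2;                    /* a_Local, hides the file-scope a */
    {
        *file_scope_a() = 42;     /* writes a_Internal, not a_Local */
    }
    return a + *file_scope_a();   /* 2 + 42 */
}
```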
C standard says:
In the set of translation units each declaration of a particular
identifier with external linkage denotes the same entity (object or
function). Within one translation unit, each declaration of an
identifier with internal linkage denotes the same entity.
In the set of translation units we cannot have multiple distinct external entities with the same name, so the types of each declaration that denotes that single external entity should agree. We can check whether types agree within one translation unit; this is done at compile time. We cannot check whether types agree between different translation units, either at compile time or at link time.
For an identifier declared with the storage-class specifier extern in
a scope in which a prior declaration of that identifier is visible,
if the prior declaration specifies internal or external linkage, the
linkage of the identifier at the later declaration is the same as the
linkage specified at the prior declaration. If no prior declaration is
visible, or if the prior declaration specifies no linkage, then the
identifier has external linkage.
static int a; //a_Internal
int main(void) {
int a; //No linkage
{
extern int a; //a_External
}
return 0;
}
Here the previous declaration of identifier a has no linkage, so extern int a has external linkage. It means that we have to define int a in another translation unit. However, GCC rejects this code with the error "variable previously declared 'static' redeclared 'extern'", probably because we have undefined behavior according to the C standard.

External variable declaration and definition

a) The definition of an external variable is the same as that of a local variable, i.e., int i=2; (only outside all functions).
But why does extern int i=2; also work as a definition? Isn't extern used only for variable declarations in other files?
b) file 1
#include<stdio.h>
int i=3;
int main()
{
printf("%d",i);
fn();
}
file2
int i; // although the declaration should be: extern int i; So, why is this working?
void fn()
{
printf("%d",i);
}
OUTPUT: 3 in both cases
For historical reasons, the rules for determination of linkage and when a declaration provides a definition are a bit of a mess.
For your particular example, at file scope
extern int i = 2;
and
int i = 2;
are equivalent external definitions, i.e. extern is optional if you provide an initializer.
However, if you do not provide an initializer, extern is not optional:
int i;
is a tentative definition with external linkage, which becomes an external definition equivalent to
int i = 0;
if the translation unit doesn't contain another definition with explicit initializer.
This is different from
extern int i;
which is never a definition. If there already is another declaration of the same identifier visible, then the variable will get its linkage from that; if this is the first declaration, the variable will have external linkage.
This means that in your second example, both file1 and file2 provide an external definition of i, which is undefined behaviour, and the linker is free to choose the definition it likes best (it may also try to make demons fly out of your nose). There's a common extension to C (see C99 Annex J.5.11 and this question) which makes this particular case well-defined.
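A sketch of the strictly conforming arrangement, collapsed into a single file so it can be compiled as one unit: the part that would live in file2 only declares i with extern, and the part that would live in file1 provides the one external definition. The function fn_value is a hypothetical stand-in for fn() that returns the value instead of printing it.

```c
/* What file2 would contain: a declaration only, no storage reserved. */
extern int i;

/* Hypothetical stand-in for fn(): returns i instead of printing it. */
int fn_value(void)
{
    return i;
}

/* What file1 would contain: the single external definition. */
int i = 3;
```

Split across two real files, the extern declaration and fn_value would go in one file and int i = 3; in the other, leaving exactly one definition of i in the program.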
In C, an extern with an initialization results in a variable being allocated. That is, the declaration will be considered a defining declaration. This is in contrast with the more common use of extern.
The C standard says:
6.9.2 External object definitions
.....
If the declaration of an identifier for an object has file scope and an initializer, the
declaration is an external definition for the identifier.
As for the second part of your question, your declaration int i at file scope has external linkage. If you want to give it internal linkage you need to declare it static int i. The C standard says:
6.2.2 Linkages of identifiers
......
If the declaration of an identifier for an object has file scope and no storage-class specifier, its linkage is external.

Resources