Declaration vs definition: is GCC wrong? - c

According to ISO9899:2017 § 6.7-5:
A declaration specifies the interpretation and attributes of a set of
identifiers. A definition of an identifier is a declaration for that
identifier that: — for an object, causes storage to be reserved for
that object;
I guess it’s the exact same with all versions of the C standard.
When I try to compile the following code with GCC:
int main(void)
{ extern int myVariable; // Declaration of myVariable.
int myVariable; // Definition of myVariable.
int myVariable; // Definition of myVariable.
}
I get the following error:
error: redeclaration of 'myVariable' with no linkage
Unless I’m mistaken, shouldn't the error rather be a redefinition?

First you have this:
extern int myVariable;
The extern keyword makes this a declaration for a variable with external linkage. Any such declaration will always refer to the same object.
Then you have:
int myVariable;
This identifier has block scope (i.e. is defined inside of a function) and therefore has no linkage. Because it has no linkage, such a declaration is also a definition.
This is also an error (although you didn't show it) because you have conflicting declarations in the same scope for the same identifier, one with external linkage and one with no linkage.
Then in the same scope you have:
int myVariable;
This is an error because you have two objects with the same name and no linkage in the same scope, and therefore multiple definitions.
Section 6.2.2p2 of the C standard describes linkages in more detail:
In the set of translation units and libraries that constitutes an
entire program, each declaration of a particular identifier with
external linkage denotes the same object or function. Within one
translation unit, each declaration of an identifier with internal
linkage denotes the same object or function. Each declaration of an
identifier with no linkage denotes a unique entity.

Related

Why does an explicit "extern" not allocate storage for an object?

I've been diving deeper into the C standard, and I'm confused about the way it talks about linkage and tentative definitions.
First, in this part of the standard it is stated that
extern (keyword) means static duration and external linkage (unless already declared internal)
static storage duration. The storage duration is the entire execution of the program, and the value
stored in the object is initialized only once, prior to main function. All objects declared static and
all objects with either internal or external linkage that aren't declared _Thread_local (since C11)
have this storage duration.
external linkage. The identifier can be referred to from any other translation units in the entire
program. All non-static functions, all extern variables (unless earlier declared static), and all file-
scope non-static variables have this linkage.
So far, we have that variables declared at file scope have static storage duration and external linkage by default. Also, objects with static storage duration and no explicit initializer are initialized to zero before the program starts.
But, after reading this part (tentative definitions) and this part (declarations) I can't find where it says that objects with an explicit "extern" keyword are not allocated storage.
Please be careful about the difference between the "extern" keyword itself and the term "external declarations".
"External declarations" are defined as
At the top level of a translation unit (that is, a source file with all the #includes after the preprocessor), every C program is a sequence of declarations, which declare functions and objects with external linkage. These declarations are known as external declarations because they appear outside of any function.
regardless of the presence or absence of an explicit "extern" keyword.
I suppose that my concrete question is where in the standard does it say that file scope objects, that have an implicit external linkage by default, are not allocated storage if they are declared with an explicit "extern".
I know this is the case because if one declares the same identifier in multiple translation units all but one must have "extern" so as not to get a redefinition error.
First, while cppreference.com has useful information it is not the C standard. The C11 standard can be found here.
This comes down to the difference between a declaration and a definition.
For an object, a declaration basically states that an object with a given type exists somewhere, while a definition is what actually allocates space for the object.
These terms are specified in section 6.7p5 of the C standard:
A declaration specifies the interpretation and attributes of a set of
identifiers. A definition of an identifier is a declaration for
that identifier that:
for an object, causes storage to be reserved for that object;
for a function, includes the function body;
for an enumeration constant, is the (only) declaration of the identifier;
for a typedef name, is the first (or only) declaration of the identifier.
A file-scope declaration that uses the extern keyword and has no initializer is only a declaration, not a definition, and a declaration does not allocate storage for an object. Section 6.9.2p1-2 spells this out:
1 If the declaration of an identifier for an object has file scope and an initializer, the declaration is an external
definition for the identifier.
2 A declaration of an identifier for an object that has file
scope without an initializer, and without a storage-class specifier
or with the storage-class specifier static, constitutes a tentative
definition. If a translation unit contains one or more tentative
definitions for an identifier, and the translation unit contains
no external definition for that identifier, then the behavior
is exactly as if the translation unit contains a file scope
declaration of that identifier, with the composite type as of the
end of the translation unit, with an initializer equal to 0.
A declaration with extern and no initializer does not fit the above definition of a tentative definition or an external definition.
Section 6.9.2p4 gives examples of declarations and definitions:
int i1 = 1; //definition, external linkage
static int i2 = 2; //definition, internal linkage
extern int i3 = 3; //definition, external linkage
int i4; //tentative definition, external linkage
static int i5; //tentative definition, internal linkage
int i1; //valid tentative definition, refers to previous
int i2; //6.2.2 renders undefined, linkage disagreement
int i3; //valid tentative definition, refers to previous
int i4; //valid tentative definition, refers to previous
int i5; //6.2.2 renders undefined, linkage disagreement
extern int i1; //refers to previous, whose linkage is external
extern int i2; //refers to previous, whose linkage is internal
extern int i3; //refers to previous, whose linkage is external
extern int i4; //refers to previous, whose linkage is external
extern int i5; //refers to previous, whose linkage is internal
In the C Standard (6.9.2 External object definitions) it is written that
1 If the declaration of an identifier for an object has file scope and
an initializer, the declaration is an external definition for the
identifier.
So if you write at file scope
extern int x = 1;
then this declaration with the storage-class specifier extern will be at the same time a definition of the object x.
Otherwise if an object is declared at file scope without an initializer but with the storage-class specifier extern then the compiler assumes that the object is defined in some other translation unit or in the same translation unit but somewhere else.
For example (here a variable with internal linkage is declared at file scope)
#include <stdio.h>
static int x = 10;
extern int x;
int main(void)
{
printf( "x = %d\n", x );
return 0;
}
If an object is declared at file scope without an initializer and without the storage-class specifier extern, then the declaration is a tentative definition.

Why do enumeration constants have no linkage?

I'm trying to understand linkage of enumeration constants and could not find a clear answer in the Standard N1570. 6.2.2(p6):
The following identifiers have no linkage: an identifier declared to
be anything other than an object or a function; an identifier declared
to be a function parameter; a block scope identifier for an object
declared without the storage-class specifier extern.
So I need to understand that constants are not objects. Object is defined as 3.15:
region of data storage in the execution environment, the contents of
which can represent values
Also 6.2.2(p4) (emphasis mine):
For an identifier declared with the storage-class specifier extern in
a scope in which a prior declaration of that identifier is visible,
if the prior declaration specifies internal or external linkage, the
linkage of the identifier at the later declaration is the same as the
linkage specified at the prior declaration. If no prior declaration is
visible, or if the prior declaration specifies no linkage, then the
identifier has external linkage.
Anyway 6.4.4.3(p2):
An identifier declared as an enumeration constant has type int.
Combining all that I don't understand why
enum test {
a = 1
};
extern int a; //compile-error. UB?
does not compile? I expected a to have external linkage.
Is the behavior well-defined? Can you provide a reference to the Standard explaining that?
An identifier declared as an enumeration constant has type int
That doesn't mean it is a variable of type int, but
extern int a;
says there is a variable of type int named a, which conflicts with the enumeration constant.
Why do enumeration constants have no linkage?
For the same reason that the constant 123 (which also has type int) has no linkage.
In 6.2.2p4, the standard intends to discuss linkage only for identifiers of objects and functions, but it fails to make this clear.
Enumeration constants are mere values, not objects or functions, and their identifiers never have any linkage.
Observe the declaration extern int a; declares a as an identifier for an int object. An int object is a different thing from an int value, so an enumeration constant named a cannot be the same thing as an int object named a. So the declaration of extern int a; is invalid even before linkage is considered.
Linkage does not matter here. In the same compilation unit you are trying to have two identical identifiers. Imagine if the code compiled:
enum test {
a = 1
};
extern int a;
int b = a; // which `a`? a as the external variable or `a` as a constant? How to decide.

Linkage of extern block scope variable, C

C standard says:
For an identifier declared with the storage-class specifier extern in
a scope in which a prior declaration of that identifier is visible,
if the prior declaration specifies internal or external linkage, the
linkage of the identifier at the later declaration is the same as the
linkage specified at the prior declaration. If no prior declaration is
visible, or if the prior declaration specifies no linkage, then the
identifier has external linkage.
What is not clear is whether previous identifier to be considered must have the same type (note: C++ standard explicitly says "entity with the same name and type"). For example:
static int a; // internal linkage
void f()
{
float a; // no linkage, instead of 'int a' we have 'float a'
{
extern int a; // still external linkage? or internal in this case?
a = 0; // still unresolved external?
}
}
I tried to test it with different compilers, but they do not seem to agree with each other on linkage.
C uses flat name space for all its globals. Unlike C++, which requires the linker to pay attention to the type of your global variables (look up name mangling for more info on this), C puts this requirement onto programmers.
It is an error to re-declare a variable with a different type when changing the linkage inside the same translation unit.
I will use your example with a small addition
#include <stdio.h>
static int a; // internal linkage
static int b; // internal linkage
void f()
{
float a = 123.25; // this variable shadows static int a
int b = 321; // this variable shadows static int b
{ // Open a new scope, so the line below is not an illegal re-declaration
// The declarations below "un-shadow" static a and b
extern int a; // redeclares "a" from the top, "a" remains internal
extern int b; // redeclares "b" from the top, "b" remains internal
a = 42; // not an unresolved external, it's the top "a"
b = 52; // not an unresolved external, it's the top "b"
printf("%d %d\n", a, b); // static int a, static int b
}
printf("%f %d\n", a, b); // local float a, int b
}
This example prints
42 52
123.250000 321
When you change the type across multiple translation units, C++ will catch it at the time of linking, while C will link fine, but produce undefined behavior.
I think I have an answer. I will write about linkage in general.
C standard says:
In the set of translation units each declaration of a particular
identifier with external linkage denotes the same entity (object or function).
Within one translation unit, each declaration of an identifier with
internal linkage denotes the same entity.
C++ standard says:
When a name has external linkage, the entity it denotes can be
referred to by names from scopes of other translation units or from
other scopes of the same translation unit. When a name has internal
linkage, the entity it denotes can be referred to by names from other
scopes in the same translation unit.
This has two implications:
In the set of translation units we cannot have multiple distinct external entities with the same name (save for overloaded functions in C++), so the types of each declaration that denotes that single external entity should agree. We can check whether types agree within one translation unit; this is done at compile time. We cannot check whether types agree between different translation units, either at compile time or at link time.
Technically, in C++ we can violate the rule that the set of translation units cannot have multiple distinct external entities with the same name, even without function overloading. Since C++ has name mangling that encodes type information, it is possible to have multiple external entities with the same name and different types. For example:
file-one.cpp:
int a; // C decorated name: _a
// C++ decorated name (VC++): ?a@@3HA
//------------------------------------------------
file-two.cpp:
float a; // C decorated name: _a
// C++ decorated name (VC++): ?a@@3MA
Whereas in C this will really be one external entity, code in first unit will treat it as int and code in second unit will treat it as float.
In one translation unit we cannot have multiple distinct internal entities with the same name (save for overloaded functions in C++), so the types of each declaration that denotes that single internal entity should agree. We check if types agree within translation unit, this is done at compile-time.
Now we will move closer to the question.
C++ standard says:
The name of a function declared in block scope and the name of a
variable declared by a block scope extern declaration have linkage. If
there is a visible declaration of an entity with linkage having the
same name and type, ignoring entities declared outside the innermost
enclosing namespace scope, the block scope declaration declares that
same entity and receives the linkage of the previous declaration. If
there is more than one such matching entity, the program is
ill-formed. Otherwise, if no matching entity is found, the block scope
entity receives external linkage.
// C++
int a; // external linkage
void f()
{
extern float a; // external linkage
}
Here we do not have previous declaration of entity with the same name (a) and type (float) so the linkage of extern float a is external. Since we already have int a with external linkage in this translation unit and name is the same, types should agree. In this case they don't, hence we have compile-time error.
// C++
static int a; // internal linkage
void f()
{
extern float a; // external linkage
}
Here we do not have previous declaration of entity with the same name (a) and type (float) so the linkage of extern float a is external. It means that we have to define float a in another translation unit. Note that we have the same identifier with external and internal linkage within one translation unit (I don't know why C considers this undefined behavior since we can have internal and external entity with the same name in different translation units).
// C++ (example from standard)
static int a; // internal linkage
void f()
{
int a; // no linkage
{
extern int a; // external linkage
}
}
Here the previous declaration int a has no linkage, so extern int a has external linkage. It means that we have to define int a in another translation unit.
C standard says:
For an identifier declared with the storage-class specifier extern in
a scope in which a prior declaration of that identifier is visible,
if the prior declaration specifies internal or external linkage, the
linkage of the identifier at the later declaration is the same as the
linkage specified at the prior declaration. If no prior declaration is
visible, or if the prior declaration specifies no linkage, then the
identifier has external linkage.
So we can see that in C only name is considered (without type).
// C
int a; // external linkage
void f()
{
extern float a; // external linkage
}
Here the previous declaration of identifier a has external linkage, so the linkage of extern float a is the same (external). Since we already have int a with external linkage in this translation unit and name is the same, types should agree. In this case they don't, hence we have compile-time error.
// C
static int a; // internal linkage
void f()
{
extern float a; // internal linkage
}
Here the previous declaration of identifier a has internal linkage, so the linkage of extern float a is the same (internal). Since we already have static int a with internal linkage in this translation unit and name is the same, types should agree. In this case they don't, hence we have compile-time error. Whereas in C++ this code is fine (I think type match requirement was added with function overloading in mind).
// C
static int a; // internal linkage
void f()
{
int a; // no linkage
{
extern int a; // external linkage
}
}
Here the previous declaration of identifier a has no linkage, so extern int a has external linkage. It means that we have to define int a in another translation unit. However GCC decided to reject this code with variable previously declared 'static' redeclared 'extern' error, probably because we have undefined behavior according to C standard.

Clarification on Scope and Redefinition [duplicate]

Referring to the code below:
#include <stdio.h>
int a;
int a;
int main()
{
int b;
int b;
return 0;
}
Why does the compiler (GCC) complain of redeclaration for only variable 'b' and not 'a'?
redef.c: In function 'main':
redef.c:19: error: redeclaration of 'b' with no linkage
redef.c:18: error: previous declaration of 'b' was here
It's because a has external linkage and the standard states (C11, 6.2.2/2):
An identifier declared in different scopes or in the same scope more than once can be made to refer to the same object or function by a process called linkage. There are three kinds of linkage: external, internal, and none.
In the set of translation units and libraries that constitutes an entire program, each declaration of a particular identifier with external linkage denotes the same object or function. Within one translation unit, each declaration of an identifier with internal linkage denotes the same object or function. Each declaration of an identifier with no linkage denotes a unique entity.
So, because a has external linkage, both of those declarations refer to the same underlying variable. Because b has no linkage, the declarations refer to unique variables and therefore conflict with each other.
Quoting the C99 standard §6.9.2 ¶2
A declaration of an identifier for an object that has file scope
without an initializer, and without a storage-class specifier or with
the storage-class specifier static, constitutes a tentative
definition. If a translation unit contains one or more tentative
definitions for an identifier, and the translation unit contains no
external definition for that identifier, then the behavior is exactly
as if the translation unit contains a file scope declaration of that
identifier, with the composite type as of the end of the translation
unit, with an initializer equal to 0.
Therefore, both the statements
int a;
int a;
constitute tentative definitions. According to the above quoted part, the behaviour is as if
the two statements were replaced by
int a = 0;
However, b defined inside main is an automatic variable, i.e., it has automatic storage duration and no linkage. There cannot be two definitions of such a variable in the same scope.
Each of the two int a; declarations is a tentative definition. From 6.9.2p2:
A declaration of an identifier for an object that has file scope without an initializer, and
without a storage-class specifier or with the storage-class specifier static, constitutes a
tentative definition. If a translation unit contains one or more tentative definitions for an
identifier, and the translation unit contains no external definition for that identifier, then
the behavior is exactly as if the translation unit contains a file scope declaration of that
identifier, with the composite type as of the end of the translation unit, with an initializer
equal to 0.
Tentative definitions are only permitted at file scope.
The reason that int b; int b; is illegal is because of 6.7p3:
If an identifier has no linkage, there shall be no more than one declaration of the identifier
(in a declarator or type specifier) with the same scope and in the same name space
Identifiers for objects declared within a function without static or extern have no linkage; this is described in 6.2.2p6:
The following identifiers have no linkage: an identifier declared to be anything other than
an object or a function; an identifier declared to be a function parameter; a block scope
identifier for an object declared without the storage-class specifier extern.

Why is stdlib.h full of extern function prototypes and gcc discrepancy about this

I am aware about C linking rules presented in the following excerpts from C standard:
1/ An identifier declared in different scopes or in the same scope
more than once can be made to refer to the same object or function by
a process called linkage. There are three kinds of linkage: external,
internal, and none.
2/ In the set of translation units and libraries that constitutes an
entire program, each declaration of a particular identifier with
external linkage denotes the same object or function. Within one
translation unit, each declaration of an identifier with internal
linkage denotes the same object or function. Each declaration of an
identifier with no linkage denotes a unique entity.
3/ If the declaration of a file scope identifier for an object or a
function contains the storage-class specifier static, the identifier
has internal linkage.
4/ For an identifier declared with the storage-class specifier extern
in a scope in which a prior declaration of that identifier is visible,
if the prior declaration specifies internal or external linkage, the
linkage of the identifier at the later declaration is the same as the
linkage specified at the prior declaration. If no prior declaration is
visible, or if the prior declaration specifies no linkage, then the
identifier has external linkage.
5/ If the declaration of an identifier for a function has no
storage-class specifier, its linkage is determined exactly as if it
were declared with the storage-class specifier extern. If the
declaration of an identifier for an object has file scope and no
storage-class specifier, its linkage is external.
6/ The following identifiers have no linkage: an identifier declared
to be anything other than an object or a function; an identifier
declared to be a function parameter; a block scope identifier for an
object declared without the storage-class specifier extern.
7/ If, within a translation unit, the same identifier appears with
both internal and external linkage, the behavior is undefined.
I understand that extern keyword is optional before functions declarations because they are external by default but there are some functions prototypes preceded by extern in stdlib.h such as:
extern void qsort (void *__base, size_t __nmemb, size_t __size,
__compar_fn_t __compar) __nonnull ((1, 4));
Also, why does gcc handle the situations described in point 7 differently for functions and variables? In this example both the function foo and the variable d are declared first with static and then without it, but only the variable declaration raises an error:
static int foo(void);
int foo(void); /* legal */
static double d;
double d; /* illegal */
One can freely place or omit extern before a function declaration, so it should not be surprising to find it somewhere. Regarding the second question:
C11 draft (n1570.pdf) has example in page 159 related to tentative definitions:
static int i5; // tentative definition, internal linkage
// ...
int i5; // 6.2.2 renders undefined, linkage disagreement
extern int i5; // refers to previous, internal linkage
6.2.2 is what you have posted. It does not work in this case because there are two tentative definitions with different linkages, which violates p.7. On the other hand, it works with the extern specifier (and with the foo functions from your example, since p.5 treats a function declaration without a storage-class specifier as if it were declared extern), because p.4 applies: the later declaration takes the linkage specified by the first declaration. In other words, the case with variables does not work because they are objects and the tentative definition rules are involved. At least the standard contains an explicit example that clearly shows what the committee intended.

Resources