I thought extern was for sharing variables between compilation units. Why does the code below work, and how exactly does it work? Is this good practice?
#include <stdio.h>
int x = 50;
int main(void){
int x = 10;
printf("Value of local x is %d\n", x);
{
extern int x;
printf("Value of global x is %d\n", x);
}
return 0;
}
Prints out :
Value of local x is 10
Value of global x is 50
When you use the extern keyword, the linker finds a symbol with a matching name in object files / libraries / archives. Symbols are, simply speaking, functions and global variables (local variables are just some space on the stack), thus the linker can do its magic here.
About it being a good practice - global variables in general are not considered a good practice since they cause spaghetti code and 'pollute' the symbols pool.
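A minimal sketch of the same effect without printf, assuming a hypothetical variable name counter and a hypothetical function name global_minus_local (neither appears in the question): the block-scope extern declaration has external linkage, so it denotes the file-scope object rather than the local one.

```c
int counter = 50;                 /* file scope: external linkage */

/* Hypothetical helper: returns (file-scope counter) - (local counter). */
int global_minus_local(void)
{
    int counter = 10;             /* block scope, no linkage; hides the global */
    int local_value = counter;    /* reads the local: 10 */
    {
        extern int counter;       /* external linkage: the file-scope object */
        return counter - local_value;
    }
}
```

Calling global_minus_local() gives 40 here. No second compilation unit is involved; the declaration simply re-exposes the hidden file-scope name inside the inner block.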
You might (or might not) be interested to know that GCC (4.9.1) and clang (Apple LLVM version 6.0 (clang-600.0.57) (based on LLVM 3.5svn)) have divergent views on the acceptability of the following code, which is a minor adaptation of the code in the question:
#include <stdio.h>
static int x = 50; // static instead of no storage class specifier
int main(void)
{
int x = 10;
printf("Value of local x is %d\n", x);
{
extern int x;
printf("Value of global x is %d\n", x);
}
return 0;
}
I called the source file ext.c.
$ clang -O3 -g -std=c11 -Wall -Wextra -Werror ext.c -o ext
$ gcc -O3 -g -std=c11 -Wall -Wextra -Werror ext.c -o ext
ext.c: In function ‘main’:
ext.c:9:20: error: variable previously declared ‘static’ redeclared ‘extern’
extern int x;
^
ext.c: At top level:
ext.c:2:12: error: ‘x’ defined but not used [-Werror=unused-variable]
static int x = 50;
^
cc1: all warnings being treated as errors
$
The problem is to determine which compiler is correct because they can't both be right unless the program is exhibiting undefined behaviour — which, if you bother to read to the end, will turn out to be the case.
The relevant section of the C11 standard is:
6.2.2 Linkages of identifiers
¶1 An identifier declared in different scopes or in the same scope more than once can be
made to refer to the same object or function by a process called linkage.29) There are
three kinds of linkage: external, internal, and none.
¶2 In the set of translation units and libraries that constitutes an entire program, each
declaration of a particular identifier with external linkage denotes the same object or
function. Within one translation unit, each declaration of an identifier with internal
linkage denotes the same object or function. Each declaration of an identifier with no
linkage denotes a unique entity.
¶3 If the declaration of a file scope identifier for an object or a function contains the storage class
specifier static, the identifier has internal linkage.30)
This means that the first or outermost declaration (definition) of x in the code above has internal linkage.
¶4 For an identifier declared with the storage-class specifier extern in a scope in which a prior declaration of that identifier is visible,31) if the prior declaration specifies internal or external linkage, the linkage of the identifier at the later declaration is the same as the linkage specified at the prior declaration. If no prior declaration is visible, or if the prior declaration specifies no linkage, then the identifier has external linkage.
This paragraph needs detailed deconstruction below.
¶5 If the declaration of an identifier for a function has no storage-class specifier, its linkage is determined exactly as if it were declared with the storage-class specifier extern. If the declaration of an identifier for an object has file scope and no storage-class specifier, its linkage is external.
In the original code in the question, the second sentence says that the first declaration (definition) of x has external linkage.
¶6 The following identifiers have no linkage: an identifier declared to be anything other than
an object or a function; an identifier declared to be a function parameter; a block scope
identifier for an object declared without the storage-class specifier extern.
The x declared (defined) at the start of the function is 'a block scope identifier …' and therefore has no linkage.
¶7 If, within a translation unit, the same identifier appears with both internal and external
linkage, the behavior is undefined.
29) There is no linkage between different identifiers.
30) A function declaration can contain the storage-class specifier static only if it is at file scope; see
6.7.1.
31) As specified in 6.2.1, the later declaration might hide the prior declaration.
Dissecting paragraph 4
Paragraph 4 is the key one here. Restating it and annotating it:
¶4 For an identifier declared with the storage-class specifier extern in a scope in which a prior declaration of that identifier is visible,31)
The third or innermost declaration of x is declared in a scope in which a prior declaration of that identifier is visible: the int x = 10; declaration is visible (the static int x = 50; declaration is not visible, having been hidden by the block-scope declaration). The footnote refers to §6.2.1 Scopes of identifiers, but I don't think that says anything surprising (however, I'll quote the relevant paragraphs, ¶2 and ¶4, if you think that's necessary).
if the prior declaration specifies internal or external linkage, the linkage of the identifier at the later declaration is the same as the linkage specified at the prior declaration.
This does not apply; the prior declaration specifies neither internal nor external linkage.
If no prior declaration is visible, or if the prior declaration specifies no linkage,
There is a prior declaration that's visible, and that declaration specifies no linkage.
then the identifier has external linkage.
So, the innermost x has external linkage, the outermost x has internal linkage, and as a consequence, paragraph 7 says the resulting behaviour is undefined. That means that both compilers are correct; if the behaviour is undefined, any behaviour is correct — and different compilers are allowed to have divergent views on what is correct, and GCC and clang exhibit divergent views. On the whole, GCC's "it is a problem that should be reported" view is safer for the programmer.
In the original code, the outermost x has external linkage, the innermost x also has external linkage, and as a consequence paragraph 7 does not apply, and the innermost declaration of x refers to the outermost declaration (and definition) of x.
Apart from showing that interpreting the standard is hard work, this whole answer (diatribe) also shows that using multiple compilers (if possible on different platforms) is a good idea. It gives you the maximum chance of finding problems. Depending on a single compiler leaves you vulnerable to missing problems that another compiler might spot.
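If the file-scope variable has to stay static, one way to reach it from a scope where it is hidden, without redeclaring it extern and so without running into ¶7, is to go through a pointer set up at file scope. This is only a sketch; file_scope_x and local_plus_file_scope are made-up names, not part of the question.

```c
static int x = 50;                       /* internal linkage */
static int *const file_scope_x = &x;     /* set up before any local x hides it */

/* Hypothetical helper: combines the local x with the hidden file-scope x. */
int local_plus_file_scope(void)
{
    int x = 10;                          /* hides the file-scope x */
    return x + *file_scope_x;            /* 10 + 50 */
}
```

Nothing here gives x both internal and external linkage, so the ¶7 problem does not arise.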
Related
What should the third x refer to in:
#include <stdio.h>
static char x = '1';
int main(void)
{
char x = '2';
{
extern char x;
printf("%c\n", x);
}
}
This arose in this answer, and:
In Apple LLVM 9.1.0 clang-902-0.39.2, the x of extern char x refers to the first x, and “1” is printed.
GCC 8.2 does not accept this source text, complaining: “error: variable previously declared 'static' redeclared 'extern'”.
C 2018 6.2.2 4 says:
For an identifier declared with the storage-class specifier extern in a scope in which a prior declaration of that identifier is visible, if the prior declaration specifies internal or external linkage, the linkage of the identifier at the later declaration is the same as the linkage specified at the prior declaration. If no prior declaration is visible, or if the prior declaration specifies no linkage, then the identifier has external linkage.
Since there are two prior declarations of x, the condition of each of the following “if” clauses is true, the first for the first prior declaration, and the second for the second prior declaration:
… if the prior declaration specifies internal or external linkage, the linkage of the identifier at the later declaration is the same as the linkage specified at the prior declaration.
… if the prior declaration specifies no linkage, then the identifier has external linkage.
Clang’s behavior here is consistent with using the first clause, so that the third x has internal linkage and refers to the same object as the first x. GCC’s behavior here is consistent with using the second clause, so that the third x has external linkage and conflicts with the first x, which has internal linkage.
Does the C standard give us a way to resolve which of these should be the case?
The third declaration, extern char x, should declare x with external linkage, based on C 2018 6.2.2 4, which says:
For an identifier declared with the storage-class specifier extern in a scope in which a prior declaration of that identifier is visible, if the prior declaration specifies internal or external linkage, the linkage of the identifier at the later declaration is the same as the linkage specified at the prior declaration. If no prior declaration is visible, or if the prior declaration specifies no linkage, then the identifier has external linkage.
At the declaration extern char x, the first declaration of x is not visible, as it has been hidden by the second declaration. Therefore, it does not qualify for “a prior declaration of that identifier is visible.” The second declaration of x is visible, so it is a “prior declaration” for the purposes of the above paragraph.
Then the last sentence should control: The prior declaration specifies no linkage (6.2.2 6, a block-scope identifier without extern has no linkage), so the third x has external linkage.
Then 6.2.2 7 is violated because the first x has internal linkage and the third x has external linkage:
If, within a translation unit, the same identifier appears with both internal and external linkage, the behavior is undefined.
Since no syntax rule or constraint is violated, the C implementation is not required by the standard to report a diagnostic. Since the behavior is undefined, it may do anything, including accept this code and make the third x refer to the same object as the first x. Therefore, neither Clang's nor GCC’s behavior violates the standard in this regard. However, since 6.2.2 7 is violated, a diagnostic may be preferred, and its absence could be considered a defect of Clang.
(Credit to Paul Ogilvie and T.C. for informing my thinking on this with their comments.)
The following compiles fine, using static only on the declaration of the function:
#include <stdio.h>
static int a();
int a(){
return 5;
}
int main(){
printf("%d\n", a());
return 0;
}
As a side note, the same behaviour occurs with inline functions, i.e., only the declaration needs to carry the keyword.
However the following fails, doing the same but on a variable:
#include <stdio.h>
static int a;
int a = 5;
int main(){
printf("%d\n", a);
return 0;
}
Getting the error:
non-static declaration of 'a' follows static declaration.
What is with the difference?
This quote from the C Standard shows the difference (6.2.2 Linkages of identifiers):
5 If the declaration of an identifier for a function has no
storage-class specifier, its linkage is determined exactly as if it
were declared with the storage-class specifier extern. If the
declaration of an identifier for an object has file scope and no
storage-class specifier, its linkage is external.
So a function declared without a storage-class specifier is treated as if it had been declared with extern. That does not by itself give it external linkage; by contrast, a file-scope object identifier declared without a storage-class specifier always has external linkage.
Now according to the following quote
4 For an identifier declared with the storage-class specifier extern
in a scope in which a prior declaration of that identifier is
visible,31) if the prior declaration specifies internal or external
linkage, the linkage of the identifier at the later declaration is the
same as the linkage specified at the prior declaration. If no prior
declaration is visible, or if the prior declaration specifies no
linkage, then the identifier has external linkage
So the function has the internal linkage due to its initial declaration with the storage specifier static.
As for the identifier of a variable then
7 If, within a translation unit, the same identifier appears with both
internal and external linkage, the behavior is undefined.
To summarize the quotes above: if a function is declared without a storage-class specifier, its linkage is determined as if extern had been specified, so it takes the linkage of a visible prior declaration (if such a declaration exists). A file-scope object identifier declared without a storage-class specifier has external linkage, and if there is a prior declaration of that identifier with internal linkage, the behavior is undefined.
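As a sketch of how the failing variable example could be repaired, assuming the intent was internal linkage throughout: repeat static on the object's definition, while the function definition may omit it. The names a_func, a_obj and sum_both are illustrative only.

```c
static int a_func(void);          /* internal linkage */
int a_func(void)                  /* no specifier: treated as extern, so it
                                     inherits internal linkage (6.2.2 p5, p4) */
{
    return 5;
}

static int a_obj;                 /* tentative definition, internal linkage */
static int a_obj = 5;             /* static repeated: still internal linkage */

/* Hypothetical helper that uses both identifiers. */
int sum_both(void)
{
    return a_func() + a_obj;      /* 5 + 5 */
}
```

Dropping static from the second a_obj line would bring back the linkage disagreement discussed above.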
I'm trying to understand linkage of enumeration constants and could not find a clear answer in the Standard N1570. 6.2.2(p6):
The following identifiers have no linkage: an identifier declared to
be anything other than an object or a function; an identifier declared
to be a function parameter; a block scope identifier for an object
declared without the storage-class specifier extern.
So I need to establish that constants are not objects. 3.15 defines an object as:
region of data storage in the execution environment, the contents of
which can represent values
Also 6.2.2(p4) (emphasis mine):
For an identifier declared with the storage-class specifier extern in
a scope in which a prior declaration of that identifier is visible,31)
if the prior declaration specifies internal or external linkage, the
linkage of the identifier at the later declaration is the same as the
linkage specified at the prior declaration. If no prior declaration is
visible, or if the prior declaration specifies no linkage, then the
identifier has external linkage.
Anyway 6.4.4.3(p2):
An identifier declared as an enumeration constant has type int.
Combining all that I don't understand why
enum test {
a = 1
};
extern int a; //compile-error. UB?
does not compile? I expected a to have external linkage.
Is the behavior well-defined? Can you provide a reference to the Standard explaining that?
An identifier declared as an enumeration constant has type int
that doesn't mean it is a variable of type int
but
extern int a;
says there is a variable of type int named a, this is a conflict with the enumeration constant
Why does an enumeration constant have no linkage?
For the same reason the constant 123 (also of type int) has no linkage either.
In 6.2.2 4, the standard intends to discuss linkage only for identifiers of objects and functions, but it fails to make this clear.
Enumeration constants are mere values, not objects or functions, and their identifiers never have any linkage.
Observe the declaration extern int a; declares a as an identifier for an int object. An int object is a different thing from an int value, so an enumeration constant named a cannot be the same thing as an int object named a. So the declaration of extern int a; is invalid even before linkage is considered.
Linkage does not matter here. In the same compilation unit you are trying to have two identical identifiers. Imagine if the code compiled:
enum test {
a = 1
};
extern int a;
int b = a; // which `a`? a as the external variable or `a` as a constant? How to decide.
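By contrast, the two meanings can coexist when they live in different scopes, because the inner declaration simply hides the enumeration constant. A small sketch, assuming a hypothetical function name enum_then_object:

```c
enum test { a = 1 };

/* Hypothetical helper: uses a first as a constant, then as an object. */
int enum_then_object(void)
{
    int slots[a + 2];             /* a is an integer constant expression here */
    int total = (int)(sizeof slots / sizeof slots[0]);   /* 3 elements */
    {
        int a = 7;                /* inner scope: an object named a hides the
                                     enumeration constant */
        total += a;               /* adds 7 */
    }
    return total;
}
```

Inside the inner block there is no ambiguity about which a is meant, which is exactly what the file-scope extern int a; cannot offer.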
I am wondering if the C snippet below, in which the definition of f fails to repeat that f is of static linkage, is correct:
static int f(int);
int f(int x) { return x; }
Clang does not emit any warning for it. I read clause 6.7.1 of the C11 standard without finding the answer to my question.
It is possible to imagine more questions along the same vein, for instance t1.c and t2.c below, and it would be nice if an answer was general enough to apply to some of these, but I am only really concerned about the first example above.
~ $ cat t1.c
static int f(int);
int f(int);
int f(int x) { return x; }
~ $ clang -c -std=c99 -pedantic t1.c
~ $ nm t1.o
warning: /Applications/Xcode.app/…/bin/nm: no name list
~ $ cat t2.c
int f(int);
static int f(int);
int f(int x) { return x; }
~ $ clang -c -std=c99 -pedantic t2.c
t2.c:3:12: error: static declaration of 'f' follows non-static declaration
static int f(int);
^
t2.c:1:5: note: previous declaration is here
int f(int);
^
1 error generated.
The rules for linkage are a little confusing, and they differ for functions and objects. In short, the rules are as follows:
The first declaration determines the linkage.
static means internal linkage.
extern means linkage as already declared, if none is declared, external.
If neither of them is given, it’s the same as extern for functions, and external linkage for object identifiers (with a definition in the same translation unit).
So, this is valid:
static int f(int); // Linkage of f is internal.
int f(int); // Same as next line.
extern int f(int); // Linkage as declared before, thus internal.
int f(int x) { return x; }
This, on the other hand, is undefined behaviour (cf. C11 (n1570) 6.2.2 p7):
int f(int); // Same as if extern was given, no declaration visible,
// so linkage is external.
static int f(int); // UB, already declared with external linkage.
int f(int x) { return x; } // Would be fine if either of the above
// declarations was removed.
Most of this is covered in C11 6.2.2. From the N1570 draft:
(3) If the declaration of a file scope identifier for an object or a function contains the storage-class specifier static, the identifier has internal linkage. 30)
(4) For an identifier declared with the storage-class specifier extern in a scope in which a prior declaration of that identifier is visible31), if the prior declaration specifies internal or external linkage, the linkage of the identifier at the later declaration is the same as the linkage specified at the prior declaration. If no prior declaration is visible, or if the prior declaration specifies no linkage, then the identifier has external linkage.
(5) If the declaration of an identifier for a function has no storage-class specifier, its linkage is determined exactly as if it were declared with the storage-class specifier extern. If the declaration of an identifier for an object has file scope and no storage-class specifier, its linkage is external.
30) A function declaration can contain the storage-class specifier static only if it is at file scope; see 6.7.1.
31) As specified in 6.2.1, the later declaration might hide the prior declaration.
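According to C11 6.2.2 ¶7, the t2.c ordering (a non-static declaration followed by a static one) is undefined behaviour:
If, within a translation unit, the same identifier appears with both
internal and external linkage, the behavior is undefined.
A function name is also an identifier, and a function declared without a storage-class specifier is treated as if it were declared extern: it has external linkage unless a visible prior declaration already gave it internal linkage, which is why the static-first orderings are fine.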
C11, 6.2.1 Scopes of identifiers
1 An identifier can denote an object; a function; a tag or a member of a structure, union, or enumeration; a
typedef name; a label name; a macro name; or a macro parameter. The
same identifier can denote different entities at different points in
the program. A member of an enumeration is called an enumeration
constant. Macro names and macro parameters are not considered further
here, because prior to the semantic phase of program translation any
occurrences of macro names in the source file are replaced by the
preprocessing token sequences that constitute their macro definitions.
I am aware of the C linkage rules presented in the following excerpts from the C standard:
1/ An identifier declared in different scopes or in the same scope
more than once can be made to refer to the same object or function by
a process called linkage. There are three kinds of linkage: external,
internal, and none.
2/ In the set of translation units and libraries that constitutes an
entire program, each declaration of a particular identifier with
external linkage denotes the same object or function. Within one
translation unit, each declaration of an identifier with internal
linkage denotes the same object or function. Each declaration of an
identifier with no linkage denotes a unique entity.
3/ If the declaration of a file scope identifier for an object or a
function contains the storage-class specifier static, the identifier
has internal linkage.
4/ For an identifier declared with the storage-class specifier extern
in a scope in which a prior declaration of that identifier is visible,
if the prior declaration specifies internal or external linkage, the
linkage of the identifier at the later declaration is the same as the
linkage specified at the prior declaration. If no prior declaration is
visible, or if the prior declaration specifies no linkage, then the
identifier has external linkage.
5/ If the declaration of an identifier for a function has no
storage-class specifier, its linkage is determined exactly as if it
were declared with the storage-class specifier extern. If the
declaration of an identifier for an object has file scope and no
storage-class specifier, its linkage is external.
6/ The following identifiers have no linkage: an identifier declared
to be anything other than an object or a function; an identifier
declared to be a function parameter; a block scope identifier for an
object declared without the storage-class specifier extern.
7/ If, within a translation unit, the same identifier appears with
both internal and external linkage, the behavior is undefined.
I understand that the extern keyword is optional before function declarations because they are external by default, but some function prototypes in stdlib.h are preceded by extern, such as:
extern void qsort (void *__base, size_t __nmemb, size_t __size,
__compar_fn_t __compar) __nonnull ((1, 4));
Also, why does gcc handle the situation described in point 7 differently for functions and variables? In this example both the function foo and the variable d are declared both with and without static, but only the variable declaration raises an error:
static int foo(void);
int foo(void); /* legal */
static double d;
double d; /* illegal */
One can freely place or not place extern before a function declaration, so it should not be surprising to find it somewhere. Regarding the second question:
The C11 draft (n1570.pdf) has an example on page 159 related to tentative definitions:
static int i5; // tentative definition, internal linkage
// ...
int i5; // 6.2.2 renders undefined, linkage disagreement
extern int i5; // refers to previous, internal linkage
6.2.2 is what you have posted. So, it does not work in this case because there are two tentative definitions with different linkages, which violates ¶7. On the other hand, it works with the extern specifier (and with functions such as foo in your example) because ¶4 applies: the later declaration takes the linkage established by the first declaration. In other words, the case with variables does not work because they are objects and the tentative definition rules are involved. At least the standard contains an explicit example which clearly explains what the committee wanted to say.
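The legal orderings can be put side by side in one translation unit. This is a sketch using the names from the question plus a hypothetical total() function; the initializer 1.5 and the return value 2 are made up for illustration.

```c
static int foo(void);
int foo(void);                    /* legal: as if extern, stays internal */

static double d;                  /* tentative definition, internal linkage */
extern double d;                  /* legal: refers to previous, internal */
/* double d;                         would be the linkage disagreement */

static double d = 1.5;            /* actual definition, still internal */
int foo(void) { return 2; }

/* Hypothetical helper that uses both identifiers. */
double total(void)
{
    return d + foo();             /* 1.5 + 2 */
}
```

Uncommenting the plain double d; line reintroduces the case the standard's i5 example marks as undefined.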