How to define extern variable along with declaration? - c

Wiki says:
The extern keyword means "declare without defining". In other words, it is a way to explicitly declare a variable, or to force a declaration without a definition. It is also possible to explicitly define a variable, i.e. to force a definition. It is done by assigning an initialization value to a variable.
That means, an extern declaration that initializes the variable serves as a definition for that variable. So,
/* Just for testing purpose only */
#include <stdio.h>
extern int y = 0;
int main(){
printf("%d\n", y);
return 0;
}
should be valid (it compiles as C++11). But when compiled as C with the options -Wall -Wextra -pedantic -std=c99 in GCC 4.7.2, it produces a warning:
[Warning] 'y' initialized and declared 'extern' [enabled by default]
which it should not. AFAIK,
extern int y = 0;
is effectively the same as
int y = 0;
What's going wrong here?

All three versions of the standard — ISO/IEC 9899:1990, ISO/IEC 9899:1999 and ISO/IEC 9899:2011 — contain an example in the section with the title External object definitions (§6.7.2 of C90, and §6.9.2 of C99 and C11) which shows:
EXAMPLE 1
int i1 = 1; // definition, external linkage
static int i2 = 2; // definition, internal linkage
extern int i3 = 3; // definition, external linkage
int i4; // tentative definition, external linkage
static int i5; // tentative definition, internal linkage
The example continues, but the extern int i3 = 3; line clearly shows that the standard indicates that it should be allowed. Note, however, that examples in the standard are technically not 'normative' (see the foreword in the standard); they are not a definitive statement of what is and is not allowed.
That said, most people most of the time do not use extern and an initializer.

This code is perfectly valid.
But any compiler is free to issue additional (informative or not) diagnostics:
(C99, 5.1.1.3p1 fn 8) "Of course, an implementation is free to produce any number of diagnostics as long as a valid program is still correctly translated."
What a compiler cannot do is fail to emit a diagnostic when there is a constraint violation or a syntax violation.
EDIT:
As devnull noted in the comments on the question, Joseph Myers of the GCC team explains in a bug report questioning this diagnostic:
"This is a coding style warning - the code is valid, but extremely
unidiomatic for C since "extern" is generally expected to mean that the
declaration is not providing a definition of the object."

Related

Why is an implicit extern declaration invalid if there is a prior static declaration?

Consider the following example program:
#include <stdio.h>
static int n = 123;
extern int n;
int main(void) { printf("n is %d\n", n); return 0; }
It compiles successfully with gcc -std=c99 -pedantic myprog.c. n has internal linkage according to C99 §6.2.2 Linkages of identifiers, paragraph 4:
For an identifier declared with the storage-class specifier extern in a scope in which a prior declaration of that identifier is visible, if the prior declaration specifies internal or external linkage, the linkage of the identifier at the later declaration is the same as the linkage specified at the prior declaration. If no prior declaration is visible, or if the prior declaration specifies no linkage, then the identifier has external linkage.
Now remove extern:
#include <stdio.h>
static int n = 123;
int n;
int main(void) { printf("n is %d\n", n); return 0; }
This program does not compile. GCC gives this error:
myprog.c:4:5: error: non-static declaration of ‘n’ follows static declaration
4 | int n;
| ^
myprog.c:3:12: note: previous definition of ‘n’ was here
3 | static int n = 123;
|
Why does this error occur? I thought that the int n; in the second program is supposed to be equivalent to extern int n;. From the C99 standard, § 6.2.2 Linkages of identifiers, part 5:
If the declaration of an identifier for an object has file scope and no storage-class specifier, its linkage is external.
You've already found the relevant parts in the standard. Paragraph 4 says that if you declare the identifier extern and a declaration with internal linkage is already visible, then the linkage of your extern declaration is internal too, so it refers to the same variable, just as if you had written static int n; twice.
These rules are somewhat murky, and oddball features like this are obsolescent. I don't know the historical reasons why they are there in the first place.
6.11.2 Linkages of identifiers
Declaring an identifier with internal linkage at file scope without the static storage class specifier is an obsolescent feature.
In the latter case you specify no storage-class specifier, so int n; is a tentative definition. As per the part of paragraph 5 you quoted, it gets external linkage and then collides with the internal-linkage identifier of the same name in the same translation unit, hence the compiler error.

How is the value of global variable changing in my code?

I have declared a global variable again after the main function, but it still affects the main function. I know that C allows a global variable to be declared again when the first declaration doesn't initialize the variable (this does not work in C++). If I only assign the value after the main function (a=10;), it still works with two warnings in C but gives an error in C++.
I have debugged the code, but it never reaches the line int a=10;.
#include <stdio.h>
#include <string.h>
int a;
int main()
{
printf("%d",a);
return 0;
}
/*a=10 works fine with following warnings in c.
warning: data definition has no type or storage class
warning: type defaults to 'int' in declaration of 'a' [-Wimplicit-int]|
but c++ gives the following error
error: 'a' does not name a type|
*/
int a=10;
The output is 10.
Several things:
The first int a; is a tentative definition; the second int a = 10; is a defining declaration.
a is declared at file scope, so it will have static storage duration - this means that storage for it will be set aside and initialized at program startup (before main executes), even though the defining declaration occurs later in the source code.
Older versions of C allow implicit int: a declaration that omits the type specifier, such as a=10; at file scope, is assumed to have type int. C++ does not support implicit int, so you get an error.
Here
int a; /* global declaration */
the compiler treats the above statement as just a declaration, not a definition. It looks for a definition of a and finds one later in the same translation unit, below main():
int a=10;
Hence the output 10.
To make it explicit that the first line is only a declaration, declare a with the extern storage class, e.g.
extern int a;
From C Standard#6.9.2p2
2 A declaration of an identifier for an object that has file scope without an initializer, and without a storage-class specifier or with the storage-class specifier static, constitutes a tentative definition.....
So, this
int a;
is tentative definition of identifier a.
Couple of points about tentative definition:
If there are no definitions in the same translation unit, then the tentative definition acts as an actual definition with the initializer = 0.
If an actual external definition is found earlier or later in the same translation unit, then the tentative definition just acts as a declaration.
In your program, compiler found the definition of a in the same translation unit:
int a=10;
Hence, you are getting the output 10 when compiling with C compiler.
Now, regarding the error when compiling with C++ compiler:
If you have this statement in your program:
a=10;
This gives an error when compiled with a C++ compiler because the required type specifier is missing. The code does compile with a C compiler because, in older versions of C (C89/C90), a missing type specifier defaults to int. You get a warning when compiling as C99 or C11 because implicit int is no longer supported there.
If you have this statement in your program:
int a=10;
C++ has no concept of tentative definitions, so int a; is a definition in C++. Hence, because of the One Definition Rule, the C++ compiler gives the error: redefinition of 'a'.
As far as I know, today's C++ compilers accept neither this code:
int a;
int main()
{
printf("%d",a);
return 0;
}
int a=10;
nor
int a;
int main()
{
printf("%d",a);
return 0;
}
a=10;
The first is rejected because C++ detects a second definition of the variable.
The second is rejected because C++ does not allow an assignment statement outside a function.
The error "'a' does not name a type" comes from that second case: C++ expects the first word of a file-scope declaration to be a type (for example int, long or char), and a variable name is given instead.

C - Using extern to access global variable. Case study

I thought externs were to share variables between compilation units. Why does the below code work ? and how does it work exactly ? Is this good practice ?
#include <stdio.h>
int x = 50;
int main(void){
int x = 10;
printf("Value of local x is %d\n", x);
{
extern int x;
printf("Value of global x is %d\n", x);
}
return 0;
}
Prints out :
Value of local x is 10
Value of global x is 50
When you use the extern keyword, the linker finds a symbol with a matching name in object files, libraries or archives. Symbols are, simply speaking, functions and global variables (local variables are just some space on the stack), so the linker can do its magic here.
As for whether it is good practice: global variables in general are not considered good practice, since they encourage spaghetti code and 'pollute' the symbol pool.
You might (or might not) be interested to know that GCC (4.9.1) and clang (Apple LLVM version 6.0 (clang-600.0.57) (based on LLVM 3.5svn)) have divergent views on the acceptability of the following code, which is a minor adaptation of the code in the question:
#include <stdio.h>
static int x = 50; // static instead of no storage class specifier
int main(void)
{
int x = 10;
printf("Value of local x is %d\n", x);
{
extern int x;
printf("Value of global x is %d\n", x);
}
return 0;
}
I called the source file ext.c.
$ clang -O3 -g -std=c11 -Wall -Wextra -Werror ext.c -o ext
$ gcc -O3 -g -std=c11 -Wall -Wextra -Werror ext.c -o ext
ext.c: In function ‘main’:
ext.c:9:20: error: variable previously declared ‘static’ redeclared ‘extern’
extern int x;
^
ext.c: At top level:
ext.c:2:12: error: ‘x’ defined but not used [-Werror=unused-variable]
static int x = 50;
^
cc1: all warnings being treated as errors
$
The problem is to determine which compiler is correct because they can't both be right unless the program is exhibiting undefined behaviour — which, if you bother to read to the end, will turn out to be the case.
The relevant section of the C11 standard is:
6.2.2 Linkages of identifiers
¶1 An identifier declared in different scopes or in the same scope more than once can be
made to refer to the same object or function by a process called linkage.29) There are
three kinds of linkage: external, internal, and none.
¶2 In the set of translation units and libraries that constitutes an entire program, each
declaration of a particular identifier with external linkage denotes the same object or
function. Within one translation unit, each declaration of an identifier with internal
linkage denotes the same object or function. Each declaration of an identifier with no
linkage denotes a unique entity.
¶3 If the declaration of a file scope identifier for an object or a function contains the storage class
specifier static, the identifier has internal linkage.30)
This means that the first or outermost declaration (definition) of x in the code above has internal linkage.
¶4 For an identifier declared with the storage-class specifier extern in a scope in which a prior declaration of that identifier is visible,31) if the prior declaration specifies internal or external linkage, the linkage of the identifier at the later declaration is the same as the linkage specified at the prior declaration. If no prior declaration is visible, or if the prior declaration specifies no linkage, then the identifier has external linkage.
This paragraph needs detailed deconstruction below.
¶5 If the declaration of an identifier for a function has no storage-class specifier, its linkage is determined exactly as if it were declared with the storage-class specifier extern. If the declaration of an identifier for an object has file scope and no storage-class specifier, its linkage is external.
In the original code in the question, the second sentence says that the first declaration (definition) of x has external linkage.
¶6 The following identifiers have no linkage: an identifier declared to be anything other than
an object or a function; an identifier declared to be a function parameter; a block scope
identifier for an object declared without the storage-class specifier extern.
The x declared (defined) at the start of the function is 'a block scope identifier …' and therefore has no linkage.
¶7 If, within a translation unit, the same identifier appears with both internal and external
linkage, the behavior is undefined.
29) There is no linkage between different identifiers.
30) A function declaration can contain the storage-class specifier static only if it is at file scope; see
6.7.1.
31) As specified in 6.2.1, the later declaration might hide the prior declaration.
Dissecting paragraph 4
Paragraph 4 is the key one here. Restating it and annotating it:
¶4 For an identifier declared with the storage-class specifier extern in a scope in which a prior declaration of that identifier is visible,31)
The third or innermost declaration of x is declared in a scope in which a prior declaration of that identifier is visible: the int x = 10; declaration is visible (the static int x = 50; declaration is invisible, having been shadowed by the visible declaration). The footnote refers to §6.2.1 Scopes of identifiers, but I don't think that says anything surprising (however, I'll quote the relevant paragraphs, ¶2 and ¶4, if you think that's necessary).
if the prior declaration specifies internal or external linkage, the linkage of the identifier at the later declaration is the same as the linkage specified at the prior declaration.
This does not apply; the prior declaration specifies neither internal nor external linkage.
If no prior declaration is visible, or if the prior declaration specifies no linkage,
There is a prior declaration that's visible, and that declaration specifies no linkage.
then the identifier has external linkage.
So, the innermost x has external linkage, the outermost x has internal linkage, and as a consequence, paragraph 7 says the resulting behaviour is undefined. That means that both compilers are correct; if the behaviour is undefined, any behaviour is correct — and different compilers are allowed to have divergent views on what is correct, and GCC and clang exhibit divergent views. On the whole, GCC's "it is a problem that should be reported" view is safer for the programmer.
In the original code, the outermost x has external linkage, the innermost x also has external linkage, and as a consequence paragraph 7 does not apply, and the innermost declaration of x refers to the outermost declaration (and definition) of x.
Apart from showing that interpreting the standard is hard work, this whole answer (diatribe) also shows that using multiple compilers (if possible on different platforms) is a good idea. It gives you the maximum chance of finding problems. Depending on a single compiler leaves you vulnerable to missing problems that another compiler might spot.

C the same global variable defined in different files

I am reading this code from here (in Chinese). It is a piece of code that tests global variables in C. The variable a is defined in the file t.h, which is included twice. The file foo.c defines a struct b with some values and a function foo, and declares main. The file main.c defines two uninitialized variables and the main function.
/* t.h */
#ifndef _H_
#define _H_
int a;
#endif
/* foo.c */
#include <stdio.h>
#include "t.h"
struct {
char a;
int b;
} b = { 2, 4 };
int main();
void foo()
{
printf("foo:\t(&a)=0x%08x\n\t(&b)=0x%08x\n"
"\tsizeof(b)=%d\n\tb.a=%d\n\tb.b=%d\n\tmain:0x%08x\n",
&a, &b, sizeof b, b.a, b.b, main);
}
/* main.c */
#include <stdio.h>
#include "t.h"
int b;
int c;
int main()
{
foo();
printf("main:\t(&a)=0x%08x\n\t(&b)=0x%08x\n"
"\t(&c)=0x%08x\n\tsize(b)=%d\n\tb=%d\n\tc=%d\n",
&a, &b, &c, sizeof b, b, c);
return 0;
}
After using Ubuntu GCC 4.4.3 compiling, the result is like this below:
foo: (&a)=0x0804a024
(&b)=0x0804a014
sizeof(b)=8
b.a=2
b.b=4
main:0x080483e4
main: (&a)=0x0804a024
(&b)=0x0804a014
(&c)=0x0804a028
size(b)=4
b=2
c=0
Variables a and b have the same addresses in the two functions, but the size of b has changed. I can't understand how this works!
You are violating C's "one definition rule", and the result is undefined behavior. The "one definition rule" is not formally stated in the standard as such. We are looking at objects in different source files (aka translation units), so we are concerned with "external definitions". The "one external definition" semantic is spelled out (C11 6.9 p5):
An external definition is an external declaration that is also a definition of a function (other than an inline definition) or an object. If an identifier declared with external linkage is used in an expression (other than as part of the operand of a sizeof or _Alignof operator whose result is an integer constant), somewhere in the entire program there shall be exactly one external definition for the identifier; otherwise, there shall be no more than one.
Which basically means you are only allowed to define an object at most once. (The otherwise clause allows you to not define an external object at all if it is never used anywhere in the program.)
Note that you have two external definitions for b. One is the structure that you initialize in foo.c, and the other is the tentative definition in main.c, (C11 6.9.2 p1-2):
If the declaration of an identifier for an object has file scope and an initializer, the
declaration is an external definition for the identifier.
A declaration of an identifier for an object that has file scope without an initializer, and without a storage-class specifier or with the storage-class specifier static, constitutes a tentative definition. If a translation unit contains one or more tentative definitions for an identifier, and the translation unit contains no external definition for that identifier, then the behavior is exactly as if the translation unit contains a file scope declaration of that identifier, with the composite type as of the end of the translation unit, with an initializer equal to 0.
So you have multiple definitions of b. However, there is another error, in that you have defined b with different types. First note that multiple declarations of the same object with external linkage are allowed. However, when the same name is used in two different source files, that name refers to the same object (C11 6.2.2 p2):
In the set of translation units and libraries that constitutes an entire program, each
declaration of a particular identifier with external linkage denotes the same object or
function.
C puts a strict limitation on declarations to the same object (C11 6.2.7 p2):
All declarations that refer to the same object or function shall have compatible type;
otherwise, the behavior is undefined.
Since the types for b in each of your source files do not actually match, the behavior is undefined. (What constitutes a compatible type is described in detail in all of C11 6.2.7, but it basically boils down to being that the types have to match.)
So you have two failings for b:
Multiple definitions.
Multiple declarations with incompatible types.
Technically, your declaration of int a in both of your source files also violates the "one definition rule". Note that a has external linkage (C11 6.2.2 p5):
If the declaration of an identifier for an object has file scope and no storage-class specifier, its linkage is external.
But, from the quote from C11 6.9.2 earlier, those int a tentative definitions are external definitions, and you are only allowed one of those from the quote from C11 6.9 at the top.
The usual disclaimers apply for undefined behavior. Anything can happen, and that would include the behavior you observed.
A common extension to C is to allow multiple external definitions, and is described in the C standard in the informative Annex J.5 (C11 J.5.11):
There may be more than one external definition for the identifier of an object, with or
without the explicit use of the keyword extern; if the definitions disagree, or more than one is initialized, the behavior is undefined (6.9.2).
(Emphasis is mine.) Since the definitions for a agree, there is no harm there, but the definitions for b do not agree. This extension explains why your compiler does not complain about the presence of multiple definitions. From the quote of C11 6.2.2, the linker will attempt to reconcile the multiple references to the same object.
Linkers typically use one of two models for reconciling multiple definitions of the same symbol in multiple translation units. These are the "Common Model" and the "Ref/Def Model". In the "Common Model", multiple objects with the same name are folded into a single object in a union style manner so that the object takes on the size of the largest definition. In the "Ref/Def Model", each external name must have exactly one definition.
The GNU toolchain historically used the "Common Model" by default (GCC 10 and later default to -fno-common), together with a "Relaxed Ref/Def Model", in which it enforces a strict one definition rule within a single translation unit but does not complain about violations across multiple translation units.
The "Common Model" can be suppressed in the GNU compiler by using the -fno-common option. When I tested this on my system, it caused "Strict Ref/Def Model" behavior for code similar to yours:
$ cat a.c
#include <stdio.h>
int a;
struct { char a; int b; } b = { 2, 4 };
void foo () { printf("%zu\n", sizeof(b)); }
$ cat b.c
#include <stdio.h>
extern void foo();
int a, b;
int main () { printf("%zu\n", sizeof(b)); foo(); }
$ gcc -fno-common a.c b.c
/tmp/ccd4fSOL.o:(.bss+0x0): multiple definition of `a'
/tmp/ccMoQ72v.o:(.bss+0x0): first defined here
/tmp/ccd4fSOL.o:(.bss+0x4): multiple definition of `b'
/tmp/ccMoQ72v.o:(.data+0x0): first defined here
/usr/bin/ld: Warning: size of symbol `b' changed from 8 in /tmp/ccMoQ72v.o to 4 in /tmp/ccd4fSOL.o
collect2: ld returned 1 exit status
$
I personally feel the last warning issued by the linker should always be provided regardless of the resolution model for multiple object definitions, but that is neither here nor there.
References:
Unfortunately, I can't give you the link to my copy of the C11 Standard
What are extern variables in C?
The "Beginner's Guide to Linkers"
SAS Documentation on External Variable Models
Formally, it is illegal to define the same variable (or function) with external linkage more than once. So, from the formal point of view the behavior of your program is undefined.
Practically, allowing multiple definitions of the same variable with external linkage is a popular compiler extension (a common extension, mentioned as such in the language specification). However, in order to be used properly, each definition shall declare it with the same type. And no more than one definition shall include initializer.
Your case does not match the common extension description. Your code compiles as a side effect of that common extension, but its behavior is still undefined.
The piece of code seems to break the one-definition rule on purpose. It will invoke undefined behavior, don't do that.
About the global variable a: don't put the definition of a global variable in a header file, since the header will be included in multiple .c files, which leads to multiple definitions. Just put the declaration in the header and put the definition in one of the .c files.
In t.h:
extern int a;
In foo.c
int a;
About the global variable b: don't define it multiple times, use static to limit the variable in a file.
In foo.c:
static struct {
char a;
int b;
} b = { 2, 4 };
In main.c
static int b;
b has the same address because the linker decided to resolve the conflict for you.
sizeof shows different values because sizeof is evaluated at compile time. At this stage, the compiler only knows about one b (the one defined in the current file).
At the time foo is being compiled, the b that is in scope is the struct initialized with { 2, 4 }, which occupies 8 bytes when sizeof(int) is 4.
When main is compiled, b has just been redeclared as an int, so a size of 4 makes sense. Also, there are probably padding bytes added to the struct after a so that the next member (the int) is aligned on a 4-byte boundary.
a and b have the same addresses because they occur at the same points in the file. The fact that b has a different size does not affect where the variable begins. If you added a variable c between a and b in one of the files, the addresses of the two bs would differ.

Doubt related to extern keyword usage

AFAIK, the extern keyword should be used for declarations, and no value can be associated with a variable declared with the extern keyword. But suppose I write a statement like
extern int i = 10;
Should the compiler flag an error for this? I have seen some compilers being tolerant and ignoring it. Why is this so? What does the C standard say about this?
EDIT: @All, thanks for the answers. I still have a doubt, though. Suppose I have the definition of this variable without the extern keyword in another file, say a.c, and I add this statement in b.c. Is it still OK for the compiler not to flag an error? Does it count as a redefinition?
That's valid syntax, there is even an essentially identical example in the C99 standard. (See §6.9.2-4.)
It's true that the examples are not normative but I believe it was intended to be legal syntax. The compiler will often output a warning, because it doesn't really accomplish anything.
4 EXAMPLE 1
int i1 = 1; // definition, external linkage
static int i2 = 2; // definition, internal linkage
extern int i3 = 3; // definition, external linkage
int i4; // tentative definition, external linkage
static int i5; // tentative definition, internal linkage
int i1; // valid tentative definition, refers to previous
int i2; // 6.2.2 renders undefined, linkage disagreement
int i3; // valid tentative definition, refers to previous
int i4; // valid tentative definition, refers to previous
int i5; // 6.2.2 renders undefined, linkage disagreement
extern int i1; // refers to previous, whose linkage is external
extern int i2; // refers to previous, whose linkage is internal
extern int i3; // refers to previous, whose linkage is external
extern int i4; // refers to previous, whose linkage is external
extern int i5; // refers to previous, whose linkage is internal
The following code ;
extern int i ;
declares a variable i, but does not instantiate it. If it is not also defined in the same compilation unit, the linker will attempt to resolve it from the object files and libraries that comprise the final executable.
However your example:
extern int i = 10 ;
initialises the object, and therefore must also instantiate it. In this case the extern keyword is redundant because the object is initialised in the same compilation unit (in fact in the same statement). It is equivalent to:
extern int i ; // redundant
int i = 10 ;
Although in this last example the extern keyword is redundant, it is exactly equivalent to what you have when a global variable is declared in a header file, and instantiated in a source file that also includes that header (as it should, to allow the compiler to perform type checking).
You can test this as follows:
extern int i ;
int main()
{
i = 10 ;
}
The above will cause a linker error for unresolved variable i. Whereas:
extern int i = 10 ;
int main()
{
i = 10 ;
}
will link without problem.
The extern keyword indicates that the given variable is allocated in a different module. It has nothing to do with access to that variable. It's perfectly legal to assign to an extern variable.
The purpose of the extern keyword is to give the entity external linkage. Whether it is used in a declaration or in a definition makes no difference. There's absolutely no error in the code you posted.
If you prefer to think about it in terms of "export vs. import", then extern keyword applied to a non-defining declaration means that we are importing an entity defined in some other translation unit. When extern keyword applied to a definition, it means that we are exporting this entity to be used by other translation units. (Although it is worth noting that "export vs. import" is not exactly a standard way of thinking about the concept of C linkage.)
The reason you won't see the keyword used in definitions very often is because in C file-scope definitions have external linkage by default. So writing
extern int i = 10;
is valid, but redundant, since it is equivalent to plain
int i = 10;
Yet, from time to time in the actual code you might see people using this keyword with function declarations and definitions, even though it is superfluous there as well
extern void foo(int i); /* `extern` is superfluous */
...
extern void foo(int i) /* `extern` is superfluous */
{
/* whatever */
}
