Consider the code:
int main(void)
{
int a;
}
As far as I know, int a; is a definition, as it causes storage to be reserved. Citing the C standard (N1570 Committee Draft — April 12, 2011):
6.7/5 Semantics
A declaration specifies the interpretation and attributes of a set of identifiers. A definition of an identifier is a declaration for that identifier that:
— for an object, causes storage to be reserved for that object;
...
Here comes the question: the compiler may optimize away the storage, since we are not using the variable. Is int a; then a declaration? And what if we call printf("%p", (void *)&a) in main(void)? The compiler certainly has to allocate storage now, so does the distinction between declaration and definition depend on whether you later use the identifier?
You have read the text you quoted from 6.7/5 the wrong way around: it says that definitions cause storage to be allocated.
The text which specifies that int a; is a definition is elsewhere.
C is defined in terms of an abstract machine. There is storage allocated in the abstract machine. Whether or not any memory is allocated on your PC is unrelated.
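As a rough sketch of the abstract-machine point (the function names are hypothetical and not from the question), `int a;` is a definition in both functions below, whether or not the object is ever used. A compiler may choose to reserve no memory in the first case, but that is an implementation detail and does not change what the declaration is.

```c
/* `a` is defined here even though its value is never used. An
   optimizing compiler may reserve no memory for it at all. */
int unused_local(void)
{
    int a;
    (void)sizeof a;   /* refers to `a` without reading it */
    return 0;
}

/* Here the address of `a` is taken and used, so the object must be
   addressable during the call. The declaration itself is unchanged. */
int write_through_address(void)
{
    int a;
    int *p = &a;
    *p = 42;          /* store through the pointer */
    return a;         /* reads back the stored value */
}
```

Calling `write_through_address()` returns 42, while `unused_local()` simply returns 0.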
Is int a; then a declaration?
Yes.
In fact, every definition is also a declaration. A variable can have only one definition, but could have multiple declarations.
int a;
This is a definition.
Memory is allocated for the variable a.
extern int a;
This is a declaration.
Memory is not allocated because it is not defined.
Once a variable is defined, you can take its address, which is entirely legal.
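A minimal sketch of "one definition, many declarations", written at file scope so that the repeated declarations are allowed. The helper names `read_a` and `address_of_a` and the value 5 are assumptions made for illustration.

```c
/* Non-defining declarations: they may be repeated freely. */
extern int a;
extern int a;

/* The single definition. It has an initializer, so it is a regular
   definition rather than a tentative one. */
int a = 5;

/* Reads the one object that all the declarations above refer to. */
int read_a(void)
{
    return a;
}

/* Once the variable is defined, taking its address is legal. */
int *address_of_a(void)
{
    return &a;
}
```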
A declaration introduces an identifier and describes its type, be it a type, object, or function. A declaration is what the compiler needs to accept references to that identifier. These are declarations:
extern int bar;
extern int g(int, int);
A definition actually instantiates/implements this identifier. It's what the linker needs in order to link references to those entities. These are definitions corresponding to the above declarations:
int bar;
int g(int lhs, int rhs) {return lhs*rhs;}
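The split between what the compiler needs and what the linker needs can be sketched in a single file. The declarations of bar and g come first, a hypothetical function `product_with_bar` uses them, and the definitions only appear afterwards. The value 4 for bar is an arbitrary choice for this sketch.

```c
/* Declarations: enough for the compiler to accept the references below. */
extern int bar;
extern int g(int, int);

/* Uses bar and g before either definition has been seen. */
int product_with_bar(int n)
{
    return g(bar, n);
}

/* Definitions: these are what the references are finally linked to. */
int bar = 4;
int g(int lhs, int rhs) { return lhs * rhs; }
```

With these definitions, `product_with_bar(3)` evaluates `g(4, 3)` and returns 12.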
So far I have understood the following:
A variable declaration is the declaration of a type and name of a variable without allocating memory space for it.
A variable definition means that the variable is declared and memory space is allocated for it.
So it has nothing to do with the initialization of the variable, whether you speak of a definition or a declaration.
In C, a declaration is always a definition, e.g. if one writes int i;.
But there is one exception. If you write extern int i; no memory space is allocated, only the variable is declared.
So int i; is always declaration and definition at the same time. But extern int i; is just declaration.
Is it true that in C you can declare a variable as often as you want, but you can only define the variable once?
I ask because I've tried the following and the compiler results confuse me. I use gcc and don't set the -std flag.
Neither this program:
int i;
int i;
void main(void){
i = 2;
}
nor this program:
int i=0;
int i;
void main(void){
i = 2;
}
leads to problems. The compiler compiles both without error. I would have expected, since I didn't use the "extern" keyword here, that the compiler would say something like "error: multiple definition".
But it doesn't give an error message. Is it possible that the compiler automatically writes an "extern" before all global defined "int i;" if I don't initialize them at the same time?
Isn't it then superfluous for the programmer to ever use the extern keyword for variables since the compiler will do that automatically anyway?
I think my considerations are confirmed by the following behavior. The following programs return errors:
int i;
i=0;
void main(void){
i = 2;
}
leads to:
"warning: data definition has no type or storage class
i=0;
warning: type defaults to 'int' in declaration of 'i' [-Wimplicit-int]"
and
float i;
i=0;
void main(void){
i = 2;
}
leads to:
"warning: data definition has no type or storage class
i=0;
warning: type defaults to 'int' in declaration of 'i' [-Wimplicit-int]
error: conflicting types for 'i'
note: previous declaration of 'i' was here
float i;"
So to me it again looks like there is an implicit "extern" before the first int i; (or float i;, respectively) because they are not assigned a value. As a result, no storage space is allocated for i.
But there is no other file in which storage space is allocated for i. Therefore there is no definition for i and the compiler therefore thinks in the 2nd line that i should be defined here.
Therefore there are no problems with the 1st program because the automatic type assignment fits, but with the 2nd program it no longer fits, which is why an error is returned.
The following program also throws an error:
void main(void){
int i;
int i;
}
If I write the declaration (and thus also the definition) in a scope, the compiler returns the following error message.
"error: redeclaration of 'i' with no linkage int i;
note: previous declaration of 'i' was here int i;"
I can only explain it again with the fact that the compiler does not automatically set an "extern" before a variable that is not a global variable and therefore there are 2 definitions here.
But then I ask myself why is it called redeclaration and not redefinition or multiple definition?
It would be very nice if someone could confirm my assumptions or enlighten me on how to understand it correctly. Many Thanks!
A variable declaration is the declaration of a type and name of a variable without allocating memory space for it.
Even if memory is reserved for an object, it is a declaration. We do not exclude definitions from declarations; there are declarations that are definitions and declarations that are not definitions.
The declaration int x = 3; causes memory to be reserved for x, but it also makes the name and type of x known, so it declares x.
So int i; is always declaration and definition at the same time.
Not quite. Inside a function, int i; is a definition. Outside of a function, int i; is a tentative definition. This is a special category that was necessary due to the history of C development. The language was not designed all at once with foresight about how it would be used. Different implementors tried different things. When a standard for the C language was developed, the committee working on it had to accommodate diverse existing uses.
When there is a tentative definition, the program can still supply a regular definition later in the translation unit. (The translation unit is the source file being compiled along with all the files included in it.) If the program does not supply a regular definition by the end of the translation unit, then the tentative definition becomes a regular definition as if it had an initializer of zero, as in int i = 0;.
Some C implementations treat multiple tentative definitions of an identifier in different translation units as referring to the same object. Some C implementations treat them as errors. Both behaviors are allowed by the C standard.
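The rules for tentative definitions within one translation unit can be sketched as follows. The variable names `t` and `u` and the reader functions are hypothetical.

```c
/* Two tentative definitions and no regular one: at the end of the
   translation unit this behaves as if `int t = 0;` had been written. */
int t;
int t;

/* Tentative definitions may appear before and after one regular
   definition of the same identifier. */
int u;
int u = 7;
int u;

int read_t(void) { return t; }
int read_u(void) { return u; }
```

Here `read_t()` returns 0 and `read_u()` returns 7. Adding a second initialized definition such as `int u = 8;` would be an error, because only one regular definition is allowed.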
Is it true that in C you can declare a variable as often as you want, but you can only define the variable once?
Not always. Variables with no linkage (declared inside a function without static or extern) may not be declared more than once. (An identical declaration can appear inside a nested block, but this declares a new variable with the same name.)
Repeated declarations must have compatible types, and there are additional rules about which repeated declarations are allowed. For example, an identifier may not be declared with static after it has been declared with extern.
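A small sketch of the nested-block case mentioned above. The names `shadow_demo` and `inner_value` are assumptions for illustration; `inner_value` only exists to make the inner variable observable.

```c
int inner_value;   /* records what the inner `i` held */

int shadow_demo(void)
{
    int i = 1;            /* no linkage: a second `int i;` in this same
                             block would be rejected as a redeclaration */
    {
        int i = 2;        /* allowed: a new variable hiding the outer `i` */
        inner_value = i;
    }
    return i;             /* the outer `i` is unchanged */
}
```

After a call, `shadow_demo()` has returned 1 and `inner_value` is 2, which shows that the two declarations named different objects.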
The compiler compiles both without error.
As described above, int i; outside a function is a tentative definition. Initially, it acts only as a non-definition declaration. So it may be repeated, and it may be replaced by a regular definition.
So to me again it looks like there is an implicit "extern" before the first int i;
No, there is not. int i; is a tentative definition, and it has nothing to do with the error messages you are getting. The error messages “data definition has no type or storage class” and “type defaults to 'int' in declaration of 'i'” are from the i=0;. This is a statement, not a declaration, but the C grammar does not provide for statements outside of functions. Outside of functions, the compiler is looking for only declarations. So it expects to see a type, as in int i=0;. The first message tells you the compiler does not see a type or a storage class. The second message tells you that, since it did not see a type, it is assuming int. This is a relic of old behavior in C where the int type would be taken as a default, so it could be left off. (Do not use that in new C code.)
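One way to write what the failing programs were presumably aiming for is sketched below: the initial value is given by an initializer that is part of the declaration, and the assignment moves into a function. The function name `assign_i` is hypothetical.

```c
/* At file scope only declarations are allowed, so the starting value
   is supplied by an initializer on the declaration itself. */
float i = 0;

/* Assignment is a statement, so it has to live inside a function. */
float assign_i(void)
{
    i = 2;
    return i;
}
```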
The following program also throws an error:
Inside a function, int i; is a definition, so two of them cause multiple definitions of i.
Do you absolutely need to use extern with an incomplete type, for instance int a[];, for it to use an array definition in a linked file? My logic is that it doesn't reserve memory so it's not a definition but a declaration (like a function prototype, which also doesn't require extern for the compiler to leave it to the linker implicitly). I would test it myself but I can't currently.
Since you ask do you absolutely need to use extern with a declaration of an identifier with a nominally incomplete type, the answer is technically “no,” for two reasons:
The C standard is voluntary. Nothing requires you to obey it.
If you are using the C standard, and you declare int a[]; externally (outside any function) in one translation unit and int a[5] = { 3, -7, 24, 5, 7 }; inside another translation unit, and you use a in the program, the behavior is not defined by the C standard. That is, the C standard “allows” you to do it but does not define the result.
I will come back to explain why the latter is not defined. First, let’s see why the answer to the question you actually wanted to ask is “yes.”
If instead you ask do you need to use extern to get a defined result, and presumably the result that you want, then the answer is “yes.” If you declare int a[]; in a translation unit, it is a tentative definition per C 2018 6.9.2 2:
A declaration of an identifier for an object that has file scope without an initializer, and without a storage-class specifier or with the storage-class specifier static, constitutes a tentative definition. If a translation unit contains one or more tentative definitions for an identifier, and the translation unit contains no external definition for that identifier, then the behavior is exactly as if the translation unit contains a file scope declaration of that identifier, with the composite type as of the end of the translation unit, with an initializer equal to 0.
This means that, if you declare int a[]; externally and do not otherwise define it in the same translation unit (the source file with all the included files), it is as if you wrote int a[] = { 0 };, which defines it to be an array of one element. So it is effectively a definition, not just a declaration.
To prevent it from being a tentative definition and becoming a definition, you need to declare it as extern int a[];.
If you do not, then you will have this definition in one source file and the definition in the other file you are linking. Then, if you use a in the program, C 2018 6.9 5 is violated:
… If an identifier declared with external linkage is used in an expression (other than as part of the operand of a sizeof or _Alignof operator whose result is an integer constant), somewhere in the entire program there shall be exactly one external definition for the identifier; otherwise, there shall be no more than one.
This “shall” is part of the semantics of external definitions, not part of the constraints, so it is governed by 4 2:
If a “shall” or “shall not” requirement that appears outside of a constraint or runtime-constraint is violated, the behavior is undefined…
This explains my second bullet point above.
Adding to that, though, this is one instance where the “undefined behavior” of the C standard is actually in some common use. In common Unix tools, a tentative definition in one translation unit is resolved as desired with a non-tentative definition in other translation units. So “no” is also the answer to your intended question provided you are using tools that support this.
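The well-defined arrangement can be sketched in one file, with comments marking which part would normally live in which translation unit. This is only a stand-in for a real two-file build, and the function names are hypothetical.

```c
/* As it might appear in the translation unit that only uses the array:
   `extern` keeps this from being a tentative definition. */
extern int a[];

int third_element(void)
{
    return a[2];   /* indexing works although the type is incomplete here */
}

/* As it might appear in the other translation unit: the one definition. */
int a[5] = { 3, -7, 24, 5, 7 };

/* After the definition the type is complete, so sizeof is available. */
int element_count(void)
{
    return (int)(sizeof a / sizeof a[0]);
}
```

In a real two-file program only the defining file could use `sizeof a`, because the file with `extern int a[];` never sees the array bound.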
I used repl.it (clang) and it has an interesting result.
Scenario 1:
#include<stdio.h>
int a[5] = {1,2,3};
#include<stdio.h>
extern int a[];
int main()
{
printf("%d", a[1]); // '2'
printf("%zu", sizeof(a)); // error 'invalid application of sizeof to incomplete type int []'
return 0;
}
Scenario 2:
#include<stdio.h>
int a[5] = {1,2,3};
#include<stdio.h>
int a[];
int main()
{
printf("%d", a[1]); // '2'
printf("%zu", sizeof(a)); // error 'invalid application of sizeof to incomplete type int []'
return 0;
}
Scenario 3:
#include<stdio.h>
extern int a[];
int main()
{
printf("%d", a[1]); // '0'
printf("%zu", sizeof(a)); // error 'invalid application of sizeof to incomplete type int []'
return 0;
}
Scenario 4:
#include<stdio.h>
int a[5] = {1,2,3};
#include<stdio.h>
extern int a[2];
int main()
{
printf("%d", a[1]); // '2'
printf("%zu", sizeof(a)); // '8'
return 0;
}
Scenario 5:
#include<stdio.h>
int a = 2;
#include<stdio.h>
int a;
int main()
{
printf("%d", a); // '2'
return 0;
}
Only static int a; will produce '0'. A plain int a; appears to be taken implicitly as extern int a;. However, (extern) int a = 1; would be taken as a multiple definition error if a is also initialised in the other file. If it is not initialised in the other file, i.e. the other file has int a;, the other file will use the initialisation from the main file. (extern) int a; in one file and int a; in the other file causes a single zero-initialised int a to be used in both files. extern int a; and nothing in the other file causes an error.
Refer to this as to why.
In the following statements first three are definitions and the last one is the declaration:
auto int i;
static int j;
register int k;
extern int l;
What's the reason for the same?
The first three (auto int i, static int j, register int k) are definitions. Each one reserves space for the integer in this translation unit and advises the linker to link all references to that name against this entity. If you have more or fewer than exactly one of these definitions, the linker will complain.
But the last one, extern int l, is a declaration, since it just introduces l; no new memory is allocated. You can have as many extern int l declarations in each compilation unit as you want.
A declaration introduces names into a translation unit or redeclares names introduced by previous declarations.
I assume the question is about the terms declaration and definition in C.
A declaration tells the compiler the name and type of "something".
A definition is a declaration, but additionally "creates" the "something" that is declared. E.g. for a variable, this would introduce some storage space for this variable.
In your first three examples, the variables are actually created. The storage classes auto, static and register all just specify a storage duration. In contrast, the storage class extern tells the compiler that this variable is known, but it might exist in a different translation unit.
Maybe an example comparing the declaration and definition of functions will make the concept easier to understand:
// function declaration:
int foo(int x);
// (now we know a function foo should be "somewhere", but it doesn't exist yet)
// function definition:
int foo(int x) {
return x+1;
}
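Returning to the storage classes from the question, the difference in storage duration between static and auto/register can be sketched like this. The function names are hypothetical, and the specifiers are spelled out only to mirror the question.

```c
/* static: one object for the whole program run, zero-initialized
   once, so its value survives between calls. */
int next_count(void)
{
    static int j;
    return ++j;
}

/* auto and register: a fresh object is created on every call. */
int fresh_each_call(void)
{
    auto int i = 0;       /* `auto` is already the default at block scope */
    register int k = 1;   /* like auto, but its address may not be taken */
    return ++i + k;
}
```

Successive calls to `next_count()` return 1, 2, 3 and so on, while `fresh_each_call()` returns 2 every time.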
When we declare any global variable, for instance
int x;
it is equivalent to
extern int x;
Now by default global variables are initialized to 0 by the compiler, which means they are allocated memory. But if I simply write
extern int x;
then this will only declare the variable, while no memory would be allocated to it. So, my query is that if I write extern before int x or I do not write it, in case of global variables, how is the compiler treating them differently? In the case where I simply write int x, it allocates memory and simultaneously it puts extern before int x, while in the case where I write extern int x, it only declares the variable while no memory is allocated to it. Please clarify how the compiler is behaving in both ways.
The very premise of your question is incorrect. This
int x;
is a tentative definition (which will turn into a normal definition of x by the end of the translation unit).
This
extern int x;
is a non-defining declaration, which is not a definition at all.
They are not even remotely equivalent.
A loose equivalent of your original definition would be
extern int x = 0;
This is a definition. But this is not an exact equivalent, since this definition is not tentative.
Keyword extern turns an object definition into a non-defining declaration if (and only if), there is no explicit initializer. If an explicit initializer is present, a definition remains a definition, even if you add extern to it.
This can be answered by understanding external object definition and Tentative definition.
Quoting C11, chapter §6.9.2, (emphasis mine)
A declaration of an identifier for an object that has file scope without an initializer, and
without a storage-class specifier or with the storage-class specifier static, constitutes a
tentative definition. If a translation unit contains one or more tentative definitions for an
identifier, and the translation unit contains no external definition for that identifier, then
the behavior is exactly as if the translation unit contains a file scope declaration of that
identifier, with the composite type as of the end of the translation unit, with an initializer
equal to 0.
From C99 standard 6.2.3:
If the declaration of an identifier of an object has file scope and no storage-class specifier, its linkage is external.
and 6.7
A declaration specifies the interpretation and attributes of a set of identifiers. A definition of an identifier is a declaration for that identifier that:
— for an object, causes storage to be reserved for that object;
— for a function, includes the function body;
— for an enumeration constant or typedef name, is the (only) declaration of the identifier.
Unfortunately, I haven't found any further description on when the compiler shall regard the external declaration as a definition (which means the type must be complete and storage size is calculated).
So I did some experiments. First I noticed that:
struct A a;
int main() {
}
is invalid: gcc says the type A is incomplete and it doesn't know how to allocate the storage for a.
However, interestingly, we have the following valid code:
struct A a;
int main() {
}
struct A {int x;};
That is also reasonable, since type A is completed by the end of the file. From the two examples above, we can deduce that an external declaration is checked at the end of the file scope. (I still don't know where the standard says this.)
However, array declarations are an exception. The modified code is no longer valid:
struct A a[1];
int main() {
}
struct A {int x;};
And the C99 standard does address this: it says the elements of an array must be of a complete type. So the question arises: is struct A a[1] a definition or a declaration? Don't be hasty to answer it. Check the following examples.
Here we have two files: a.c and b.c. In a.c:
#include <stdio.h>
int arr[10];
void a_arr_info() {
printf("%zu at %p\n", sizeof arr, (void *)arr);
}
while in b.c:
#include <stdio.h>
int arr[20];
void b_arr_info() {
printf("%zu at %p\n", sizeof arr, (void *)arr);
}
int main() {
a_arr_info();
b_arr_info();
}
The result is astonishing. The output shows that arr in both files refers to the same address. That is understandable, because both declarations of arr are at file scope and therefore have external linkage. The problem is that they have different sizes. In which file did the compiler take the declaration as the definition and allocate the memory?
Why do I ask about this? Because, um, I'm working on a simplified C compiler project (course homework). So it might be important for me to figure it out. Although the homework does not go as far as this, I'm quite curious and would like to know more. Thanks!
It is called a tentative definition
A declaration of an identifier for an object that has file scope
without an initializer, and without a storage-class specifier or with
the storage-class specifier static, constitutes a tentative
definition. If a translation unit contains one or more tentative
definitions for an identifier, and the translation unit contains no
external definition for that identifier, then the behavior is exactly
as if the translation unit contains a file scope declaration of that
identifier, with the composite type as of the end of the translation
unit, with an initializer equal to 0.
So any compilation unit (.o file) that has such a tentative definition realizes the object. Linking two such units together has undefined behavior; you would usually expect a "multiply defined symbol" error. Some compilers and linkers accept it anyway, in which case you have to ensure that such symbols have the same size and type.
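A common way to avoid the problem is to have one defining file and a non-defining declaration that every other file shares, typically through a header. The sketch below folds that into a single file; the header name in the comment and the function `arr_length` are hypothetical.

```c
#include <stddef.h>

/* What a shared header (say, a hypothetical "arr.h") would contain: a
   non-defining declaration, so every file agrees on size and type. */
extern int arr[10];

/* What exactly one .c file would contain: the single definition. The
   initializer makes it a regular definition, not a tentative one. */
int arr[10] = { 0 };

/* Any file that sees the declaration gets the same answer from sizeof. */
size_t arr_length(void)
{
    return sizeof arr / sizeof arr[0];
}
```

With this layout, a second file that declared `int arr[20];` against the header would be rejected by the compiler as a conflicting type instead of silently sharing storage.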