I have read a lot of definitions of namespaces and scopes, but I cannot understand exactly what the difference between the two terms is.
For example:
If an identifier designates two different entities in the same name
space, the scopes might overlap.
This really confuses me. Can someone explain it as simply as possible, underlining the difference?
You left off the second part of that statement:
If so, the scope of one entity (the inner scope) will end
strictly before the scope of the other entity (the outer scope). Within the inner scope, the
identifier designates the entity declared in the inner scope; the entity declared in the outer
scope is hidden (and not visible) within the inner scope.
Online draft of the C2011 standard, §6.2.1, para 4.
Example:
#include <stdio.h>

void foo( void )
{
  int x = 42;
  printf( "x = %d\n", x ); // will print 42
  do
  {
    double x = 3.14;
    printf( "x = %f\n", x ); // will print 3.140000
  } while ( 0 );
  printf( "x = %d\n", x ); // will print 42 again
}
You have used the same identifier x to refer to two different objects in the same namespace [1]. The scope of the outer x overlaps the scope of the inner x. The scope of the inner x ends strictly before the scope of the outer x.
The inner x "shadows" or "hides" the outer x.
[1] C defines four namespaces: label names (disambiguated by the `goto` keyword and `:`), struct/union/enum tag names (disambiguated by the `struct`, `union`, and `enum` keywords), struct and union member names (disambiguated by the `.` and `->` operators), and everything else (variable names, function names, typedef names, enumeration constants, etc.).
"Namespace" in C has to do with the kinds of things that are named. Section 6.2.3 of the standard distinguishes four kinds of namespaces
one for labels
one for the tags of structures, unions, and enumerations (all three share a single name space)
one for the members of each structure or union
one for all other identifiers
Thus, for example, a structure tag never collides with a variable name or the name of any structure member:
struct foo {
int foo;
};
struct foo foo;
The usage of the same identifier, foo, for each of the entities it designates therein is permitted and usable because structure tags, structure members, and ordinary variable names all belong to different name spaces. The language supports this by disambiguating the uses of identifiers via language syntax.
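As a rough sketch of that syntactic disambiguation, the function below uses foo as a tag, a member, a variable, and a label all at once. The function name foo_demo and the values are made up for illustration; they are not from the standard.

```c
#include <assert.h>

struct foo {                  /* tag name space */
    int foo;                  /* member name space of struct foo */
};

/* Hypothetical demo: every "foo" below is resolved purely by syntax. */
int foo_demo(void)
{
    struct foo foo = { 7 };   /* "struct foo" is the tag, the second foo is a variable */

    assert(sizeof foo == sizeof(struct foo));
    foo.foo += 1;             /* variable on the left of '.', member on the right */
    goto foo;                 /* after goto, foo can only be a label */
foo:
    return foo.foo;           /* 8 */
}
```

Calling foo_demo() from a main of your own should give 8. None of the four uses collide, because each lives in its own name space.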
"Scope", on the other hand, has to do with where -- in terms of the program source -- a given identifier is usable, and what entity it designates at that point. As section 6.2.1 of the standard puts it:
For each different entity that an identifier designates, the identifier is visible (i.e., can be used) only within a region of program text called its scope. Different entities designated by the same identifier either have different scopes, or are in different name spaces.
Identifiers belonging to the same namespace may have overlapping scope (subject to other requirements). The quotation in your question is an excerpt from paragraph 4 of section 6.2.1; it might help you to read it in context. As an example, however, consider
int bar;
void baz() {
int bar;
// ...
}
The bar outside the function has file scope, starting immediately after its declaration and continuing to the end of the enclosing translation unit. The one inside the function designates a separate, local variable; it has block scope, starting at the end of its declaration and ending at the closing brace of the innermost enclosing block. Within that block, the identifier bar designates the local variable, not the file-scoped one; the file-scoped bar is not directly accessible there.
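Here is a small sketch of that hiding. The initial values and the helper after_baz are my own additions for illustration. It also shows one way to reach the file-scope object from inside the function: a block-scope extern declaration, which refers to the object with external linkage.

```c
#include <assert.h>

int bar = 10;                 /* file scope */

int baz(void)
{
    int bar = 20;             /* block scope; hides the file-scope bar */
    {
        extern int bar;       /* inner block: names the file-scope object again */
        assert(bar == 10);
    }
    return bar;               /* back in the outer block: the local one, 20 */
}

/* Hypothetical helper: no local bar is in scope here. */
int after_baz(void)
{
    return bar;               /* the file-scope bar, 10 */
}
```

So "not directly accessible" really means that the plain name is taken by the local variable, while the object itself still exists and is unchanged.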
What name space is a typedef name in? Consider this code:
#include <stdio.h>
typedef struct x { // 'x' in tag name space
int x; // 'x' in member name space
int y;
} x; // ??
int main() {
x foo = { 1, 2 };
int x = 3; // 'x' in ordinary identifier name space
printf("%d %d %d\n", foo.x, foo.y, x);
}
This prints out '1 2 3' with gcc 4.4.7 (and g++ 4.4.7), so type names are separate from tag, member, and ordinary identifier names.
This code also compiles and runs on gcc/g++ 4.4.7, producing '1 2':
#include <stdio.h>
typedef struct x { // 'x' in tag namespace
int x; // 'x' in member namespace
int y;
} x;
int main() {
x x = { 1, 2 };
printf("%d %d\n", x.x, x.y);
}
How are the x identifiers disambiguated in this case?
EDIT
A clarification, I hope. Consider these two lines from above:
x foo = { 1, 2 };
int x = 3; // 'x' in ordinary identifier name space
When the second line is executed, identifier x is in scope, and should logically be in the 'ordinary identifier' namespace. There doesn't appear to be a new scope at this point, because there is no opening brace between lines 1 and 2. So the second x can't hide the first x, and the second x is in error. What's the flaw in this argument, and how does this apply to the x x case? My assumption was that the flaw was that type names somehow had a different non-obvious name space, hence the title of this question.
It works not due to namespaces (the new type name and the variable identifier are in the same ordinary namespace), but due to scoping.
6.2.1 Scopes of identifiers
2 For each different entity that an identifier designates, the
identifier is visible (i.e., can be used) only within a region of
program text called its scope. Different entities designated by the
same identifier either have different scopes, or are in different name
spaces. There are four kinds of scopes: function, file, block, and
function prototype. (A function prototype is a declaration of a
function that declares the types of its parameters.)
4 Every other identifier has scope determined by the placement of
its declaration (in a declarator or type specifier). If the declarator
or type specifier that declares the identifier appears outside of any
block or list of parameters, the identifier has file scope, which
terminates at the end of the translation unit. If the declarator or
type specifier that declares the identifier appears inside a block or
within the list of parameter declarations in a function definition,
the identifier has block scope, which terminates at the end of the
associated block. If the declarator or type specifier that declares
the identifier appears within the list of parameter declarations in a
function prototype (not part of a function definition), the identifier
has function prototype scope, which terminates at the end of the
function declarator. If an identifier designates two different
entities in the same name space, the scopes might overlap. If so, the
scope of one entity (the inner scope) will end strictly before the
scope of the other entity (the outer scope). Within the inner scope,
the identifier designates the entity declared in the inner scope; the
entity declared in the outer scope is hidden (and not visible) within
the inner scope.
The variable named x is in an inner scope, so it hides the entity named x in the outer scope. Inside the scope of main, following the declaration of x x, it's a variable name.
The interesting bit is that in x x = { 1, 2 }; the meaning of x is changed mid declaration. In the beginning it denotes the type name, but once the declarator introduces an identifier, x starts denoting the variable.
Regarding your edit "What's the flaw in this argument?" Note that scopes may overlap (as the preceding paragraph mentions). The definition of the type alias is actually at file scope. The block scope of main is a new inner scope that overlaps the outer scope. That's why it can be used to hide the previous meaning of x. Had you tried to do this at file scope:
typedef struct x { /* ... */ } x;
int x = 1; // immediately at file scope
That would be ill-formed, because now the two declarations really do appear in the same scope.
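To make the block-scope case concrete, here is a minimal sketch. The function names size_seen_before and sum_members are invented for the example. Once the variable x is declared, the plain name x means the variable for the rest of the block, but the type is still reachable through its tag, because tags live in a different name space.

```c
#include <assert.h>
#include <stddef.h>

typedef struct x { int x; int y; } x;   /* file scope: x is a typedef name */

size_t size_seen_before(void)
{
    return sizeof(x);         /* no variable x in scope: x is the type */
}

int sum_members(void)
{
    x x = { 1, 2 };           /* first x is the type, second x declares a variable */

    /* From here to the closing brace, plain x is the variable. */
    struct x copy = x;        /* the tag still names the type */
    assert(sizeof x == sizeof(struct x));
    return copy.x + copy.y;   /* 3 */
}
```

Writing another declaration such as x other; after the x x line, in the same block, would no longer compile, because x no longer names a type there.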
I'm reading the N1570 Standard and have trouble understanding the wording of the name space definition. Here it is:
1 If more than one declaration of a particular identifier is visible at any point in a translation unit, the syntactic context disambiguates uses that refer to different entities. Thus, there are separate name spaces for various categories of identifiers, as follows:
— label names (disambiguated by the syntax of the label declaration and use);
— the tags of structures, unions, and enumerations (disambiguated by following any [32] of the keywords struct, union, or enum);
— the members of structures or unions; each structure or union has a separate name space for its members (disambiguated by the type of the expression used to access the member via the . or -> operator);
— all other identifiers, called ordinary identifiers (declared in ordinary declarators or as enumeration constants).
[32] There is only one name space for tags even though three are possible.
Here they are talking about the case where more than one declaration of a particular identifier is visible. There are no words like "To access an identifier one shall specify its namespace" or "To access an identifier in a specific namespace...".
Let me show an example first (this is strictly for understanding purposes; don't write code like this, ever):
#include <stdio.h>
int main(void)
{
int here = 0; //.......................ordinary identifier
struct here { //.......................structure tag
int here; //.......................member of a structure
} there;
here: //......... a label name
here++;
printf("Inside here\n");
there.here = here; //...........no conflict, both are in separate namespace
if (here > 2) {
return 0;
}
else
goto here; //......... a label name
printf("Hello, world!\n"); // control does not reach here..intentionally :)
return 0;
}
You can see the different uses of the identifier here. They belong to separate namespaces according to the rule, hence this program is fine.
However, if you change the structure variable's name from there to here, you'll get a conflict, because there would then be two separate declarations of the same identifier (an ordinary identifier) in the same namespace and the same scope.
The C standard states (emphasis mine):
If an identifier designates two different entities in the same name space, the scopes might overlap. [...]
(section 6.2.1.4 from http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf)
When can an identifier refer to two different entities but their scopes do not overlap?
Or, put differently, why is there the word "might" in the quote?
These scopes for name overlap:
int f(void) {
int name = 4;
{
int name = 6;
}
}
These ones do not overlap:
int f(void) {
{
int name = 4;
}
{
int name = 6;
}
}
Read it as “The scopes might overlap, if an identifier designates two different entities in the same name space.” That is, the sentence is saying the scopes might overlap, and it is explaining the condition for which that occurs. English is unfortunately imprecise. This sentence is not meant to express the logic statement that if an identifier designates two entities in the same name space, there exist programs in which they overlap and there exist programs in which they do not. It expresses the fact that scopes might overlap and the fact that this occurs when an identifier designates two different entities in the same name space.
I think the word "might" refers to possibility in the general case rather than to the probability of it happening (that is, it is allowed to happen, and when it does there is an overlap).
The sentences that follow indicate this by describing what happens in that case: the inner scope is a strict subscope of the outer one, and within it the identifier refers to the entity declared in the inner scope.
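The two situations can be put side by side in a small sketch. The function names overlapping and disjoint are mine, chosen just for this illustration.

```c
#include <assert.h>

/* Scopes overlap: the inner name hides the outer one until its block ends. */
int overlapping(void)
{
    int name = 4;
    {
        int name = 6;         /* inner scope ends at the next closing brace */
        assert(name == 6);
    }
    return name;              /* outer entity is visible again: 4 */
}

/* Scopes do not overlap: each name lives only inside its own block. */
int disjoint(void)
{
    int total = 0;
    { int name = 4; total += name; }
    { int name = 6; total += name; }
    return total;             /* 10 */
}
```

In both functions name designates two different entities in the same name space. Only the first one needs the hiding rule, because only there are both declarations in scope at the same point.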
As discussed in this question, GCC defines a nonstandard unary operator && to take the address of a label.
Why does it define a new operator, instead of using the existing semantics of the & operator, and/or the semantics of functions (where foo and &foo both yield the address of the function foo())?
Label names do not interfere with other identifiers, because they are only used in gotos. A variable and a label can have the same name, and in standard C and C++ it's always clear from the context what is meant. So this is perfectly valid:
int name;
name:
name = 4; // refers to the variable
goto name; // refers to the label
The distinction between & and && is thus needed so the compiler knows what kind of name to expect:
&name; // refers to the variable
&&name; // refers to the label
GCC added this extension to be used in initializing a static array that will serve as a jump table:
static void *array[] = { &&foo, &&bar, &&hack };
Where foo, bar and hack are labels. Then a label can be selected with indexing, like this:
goto *array[i];
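Putting the pieces together, a jump table might look roughly like this. It is only a sketch and it assumes GCC or Clang, since && on a label and goto * are extensions, not standard C. The function name dispatch, the return values, and the extra variable foo are invented for the example.

```c
#include <assert.h>

/* Hypothetical dispatcher; needs the GNU "labels as values" extension. */
int dispatch(int i)
{
    static void *table[] = { &&foo, &&bar, &&hack };  /* addresses of labels */
    int foo = -1;             /* a variable with the same name as a label */
    int *p = &foo;            /* single &: address of the variable */

    assert(i >= 0 && i < 3);  /* an out-of-range index would be undefined behaviour */
    goto *table[i];           /* jump to the selected label */

foo:  return *p + 11;         /* 10 */
bar:  return 20;
hack: return 30;
}
```

Here &foo and &&foo appear in the same function without ambiguity, which is exactly what the separate operator buys.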
The standard says that
C11: 6.2.1 Scopes of identifiers (p1):
An identifier can denote an object; a function; a tag or a member of a structure, union, or enumeration; a typedef name; a label name; a macro name; or a macro parameter.
Further it says in section 6.2.3:
If more than one declaration of a particular identifier is visible at any point in a translation unit, the syntactic context disambiguates uses that refer to different entities. Thus, there are separate name spaces for various categories of identifiers, as follows:
— label names (disambiguated by the syntax of the label declaration and use);
— the tags of structures, unions, and enumerations (disambiguated by following any of the keywords struct, union, or enum);
— the members of structures or unions; each structure or union has a separate name space for its members (disambiguated by the type of the expression used to access the member via the . or -> operator);
— all other identifiers, called ordinary identifiers (declared in ordinary declarators or as enumeration constants).
This means that an object and a label can be denoted by the same identifier. To let the compiler know that the address of foo is the address of a label, not the address of an object foo (if one exists), GCC defined the && operator for taking the address of a label.
If I have enums like:
enum EnumA
{
stuffA = 0
};
enum enumAA
{
stuffA = 1
};
What happens here when you refer to stuffA? I thought you would refer to them like EnumA.stuffA and enumAA.stuffA as in Java, but that doesn't seem to be the case in C.
enums don't introduce new scope.
In your example, the second enum wouldn't compile due to the stuffA name clash.
To avoid name clashes, it is a common practice to give the elements of an enum a common prefix. Different prefixes would be used for different enums:
enum EnumA
{
EA_stuffA = 0
};
enum EnumAA
{
EAA_stuffA = 1
};
The enumeration constants are in the global name space (more precisely, the ordinary identifiers name space, contrasted with the labels, tags, and structure/union member namespaces), so you get a compilation error on the second stuffA.
You cannot use two different values for the same enumeration name (nor the same value specified twice) in a single translation unit.
As the others already said, enumeration constants must be unique within the scope where they are defined. But, as with other identifiers, it is allowed to redefine them in another scope. E.g.:
enum EnumA
{
stuffA = 0
};
void func(void) {
enum enumAA
{
stuffA = 1
};
// do something
}
would be fine. But such redefinitions in different scopes are often frowned upon and should be well documented, otherwise you will quickly confuse yourself and others.
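A compilable variant of that idea, as a sketch: the helper names inner_value and outer_value are made up so that the effect of the two scopes can be observed.

```c
#include <assert.h>

enum EnumA { stuffA = 0 };            /* file scope */

int inner_value(void)
{
    enum enumAA { stuffA = 1 };       /* block scope: hides the file-scope constant */
    return stuffA;                    /* 1 */
}

int outer_value(void)
{
    int n = stuffA;                   /* file-scope constant; usable as a plain int */
    assert(n == 0);
    return n;
}
```

The two constants never meet: inside inner_value the name refers to the block-scope one, everywhere else to the file-scope one.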
As mentioned, this won't compile because stuffA is defined twice. Enumeration constants are referred to by their bare name (that is, "stuffA" rather than EnumA.stuffA). You can even use them with types that aren't enums (such as integers). Enums are sometimes used this way with ints, much as one would #define constants.
This answer shows how the rules of C 2018 preclude the same identifier from being used as a member of two different enumerations. It is a language-lawyer view, intended to show how this requirement arises out of the language of the standard.
6.2.3, “Name spaces of identifiers,” tells us:
If more than one declaration of a particular identifier is visible at any point in a translation unit, the syntactic context disambiguates uses that refer to different entities. Thus, there are separate name spaces for various categories of identifiers, as follows:
…
— all other identifiers, called ordinary identifiers (declared in ordinary declarators or as enumeration constants).
Thus, all enumerator constants and ordinary declarators exist in one name space. (The name spaces omitted above are for labels [for goto statements]; tags of structures, unions, and enumerations [the name after a struct, as in struct foo]; and members of structures or unions [each has its own name space]).
6.7, "Declarations," tells us in paragraph 5 that:
A definition of an identifier is a declaration for that identifier that:
…
for an enumeration constant, is the (only) declaration of the identifier;
…
So the standard indicates that there is only one definition of an enumeration constant. Additionally, 6.2.1, “Scopes of identifiers,” tells us in paragraph 1:
An identifier can denote an object; a function; a tag or a member of a structure, union, or enumeration; a typedef name; a label name; a macro name; or a macro parameter. The same identifier can denote different entities at different points in the program. A member of an enumeration is called an enumeration constant.
Observe that this tells that if foo identifies an enumeration constant, it identifies a member of an enumeration—it is a particular member of a particular enumeration. It cannot identify both a member of enum A and a member of enum B. Therefore, if we had the code:
enum A { foo = 1 };
enum B { foo = 1 };
at the point where foo appears the second time, it is an identifier for foo in enum A, and therefore it cannot be a member of enum B.
(The sentence about an identifier denoting different entities at different points is introducing the concept of scope. Further paragraphs in that clause explain the concept of scope and the four kinds of scope: function, file, block, and function prototype. These do not affect the above analysis because the code above is within one scope.)
If you are compiling this as C++ rather than C, you could also declare new scopes using the namespace keyword (C itself has no such keyword).
NOTE: I wouldn't recommend doing this, I'm just noting that it's possible.
Instead, it would be better to use a prefix as noted in the other examples.
namespace EnumA
{
enum EnumA_e
{
stuffA = 0
};
};
namespace EnumAA
{
enum enumAA_e
{
stuffA = 1
};
};