As discussed in this question, GCC defines nonstandard unary operator && to take the address of a label.
Why does it define a new operator, instead of using the existing semantics of the & operator, and/or the semantics of functions (where foo and &foo both yield the address of the function foo())?
Label names do not interfere with other identifiers, because they are only used in gotos. A variable and a label can have the same name, and in standard C and C++ it's always clear from the context what is meant. So this is perfectly valid:
name:
int name;
name = 4; // refers to the variable
goto name; // refers to the label
The distinction between & and && is thus needed so the compiler knows what kind of name to expect:
&name; // refers to the variable
&&name; // refers to the label
GCC added this extension to be used in initializing a static array that will serve as a jump table:
static void *array[] = { &&foo, &&bar, &&hack };
Where foo, bar and hack are labels. Then a label can be selected with indexing, like this:
goto *array[i];
Standard says that
C11: 6.2.1 Scopes of identifiers (p1):
An identifier can denote an object; a function; a tag or a member of a structure, union, or enumeration; a typedef name; a label name; a macro name; or a macro parameter.
Further it says in section 6.2.3:
If more than one declaration of a particular identifier is visible at any point in a translation unit, the syntactic context disambiguates uses that refer to different entities. Thus, there are separate name spaces for various categories of identifiers, as follows:
— label names (disambiguated by the syntax of the label declaration and use);
— the tags of structures, unions, and enumerations (disambiguated by following any32) of the keywords struct, union, or enum);
— the members of structures or unions; each structure or union has a separate name space for its members (disambiguated by the type of the expression used to access the member via the . or -> operator);
— all other identifiers, called ordinary identifiers (declared in ordinary declarators or as enumeration constants).
This means that an object and a label can be denoted by same identifier. At this point, to let the compiler know that the address of foo is the address of a label, not the address of an object foo (if exists), GCC defined && operator for address of label.
Related
I'm reading the N1570 Standard and have a problem to understand the wording of the name space definition. Here is it:
1 If more than one declaration of a particular identifier is visible
at any point in a translation unit, the syntactic context
disambiguates uses that refer to different entities. Thus, there are
separate name spaces for various categories of identifiers, as
follows:
— label names (disambiguated by the syntax of the label
declaration and use);
— the tags of structures, unions, and
enumerations (disambiguated by following any32) of the keywords
struct, union, or enum);
— the members of structures or unions; each
structure or union has a separate name space for its members
(disambiguated by the type of the expression used to access the member
via the . or -> operator);
— all other identifiers, called ordinary
identifiers (declared in ordinary declarators or as enumeration
constants).
32) There is only one name space for tags even though three are possible.
Here they are talking about in case of more than 1 declaration of particular identifiers is visible. Now words something like "To access an identifier one shall specify its namespace" or "To access an identifier in a specific namespace...".
Let me show an example first (this is strictly for understanding purpose, dont write code like this, ever)
#include <stdio.h>
int main(void)
{
int here = 0; //.......................ordinary identifier
struct here { //.......................structure tag
int here; //.......................member of a structure
} there;
here: //......... a label name
here++;
printf("Inside here\n");
there.here = here; //...........no conflict, both are in separate namespace
if (here > 2) {
return 0;
}
else
goto here; //......... a label name
printf("Hello, world!\n"); // control does not reach here..intentionally :)
return 0;
}
You see usage of identifier here. They belong to separate namespace(s) according to the rule, hence this program is fine.
However, say, for example, you change the structure variable name, from there to here, and you'll see a conflict, as then, there would be two separate declaration of same identifier (ordinary identifier) in the same namespace.
In following code, I have declared a structure member variable as a same name of structure name.
struct st
{
int st;
};
int main()
{
struct st t;
t.st = 7;
return 0;
}
I wonder, it's working fine on GCC compiler and doesn't give a conflict error.
So,
How does the compiler know structure name and variable name?
What mechanism compiler internally use?
Yes, it's valid. The struct tag and the struct members are in different namespace.
C11, 6.2.3 Name spaces of identifiers:
If more than one declaration of a particular identifier is visible at any point in a translation unit, the syntactic context disambiguates uses that refer to different entities. Thus, there are separate name spaces for various categories of identifiers, as follows:
label names (disambiguated by the syntax of the label declaration and use);
the tags of structures, unions, and enumerations (disambiguated by following any32) of the keywords struct, union, or enum);
the members of structures or unions; each structure or union has a separate name space for its members (disambiguated by the type of the expression used to access the member via the . or -> operator);
all other identifiers, called ordinary identifiers (declared in ordinary declarators or as enumeration constants).
The name of the structure type is struct st. Not just st, so there's no conflict at all.
Reading a lot of definition of namespace and scopes cannot understand exactly the difference between the two terms.
For example:
If an identifier designates two different entities in the same name
space, the scopes might overlap.
It is really confusing me. Can someone clarify it as simple as possible underlining the difference.
You left off the second part of that statement:
If so, the scope of one entity (the inner scope) will end
strictly before the scope of the other entity (the outer scope). Within the inner scope, the
identifier designates the entity declared in the inner scope; the entity declared in the outer
scope is hidden (and not visible) within the inner scope.
Online draft of the C2011 standard, §6.2.1, para 4.
Example:
void foo( void )
{
int x = 42;
printf( "x = %d\n", x ); // will print 42
do
{
double x = 3.14;
printf( "x = %f\n", x ); // will print 3.14
} while ( 0 );
printf( "x = %d\n", x ); // will print 42 again
}
You have used the same identifier x to refer to two different objects in the same namespace1. The scope of the outer x overlaps the scope of the inner x. The scope of the inner x ends strictly before the scope of the outer x.
The inner x "shadows" or "hides" the outer x.
C defines four namespaces - label names (disambiguated by the `goto` keyword and `:`), struct/union/enum tag names (disambiguated by the `struct`, `union`, and `enum` keywords), struct and union member names (disambiguated by the `.` and `->` operators), and everything else (variable names, function names, typedef names enumeration constants, etc.).
"Namespace" in C has to do with the kinds of things that are named. Section 6.2.3 of the standard distinguishes four kinds of namespaces
one for labels
one each for the tags of structures, unions, and enumerations
one for the members of each structure or union
one for all other identifiers
Thus, for example, a structure tag never collides with a variable name or the name of any structure member:
struct foo {
int foo;
};
struct foo foo;
The usage of the same identifier, foo, for each of the entities it designates therein is permitted and usable because structure tags, structure members, and ordinary variable names all belong to different name spaces. The language supports this by disambiguating the uses of identifiers via language syntax.
"Scope", on the other hand, has to do with where -- in terms of the program source -- a given identifier is usable, and what entity it designates at that point. As section 6.2.1 of the standard puts it:
For each different entity that an identifier designates, the identifier is visible (i.e., can be used) only within a region of program text called its scope. Different entities designated by the same identifier either have different scopes, or are in different name spaces.
Identifiers belonging to the same namespace may have overlapping scope (subject to other requirements). The quotation in your question is an excerpt from paragraph 4 of section 6.2.1; it might help you to read it in context. As an example, however, consider
int bar;
void baz() {
int bar;
// ...
}
The bar outside the function has file scope, starting immediately after its declaration and continuing to the end of the enclosing translation unit. The one inside the function designates a separate, local variable; it has block scope, starting at the end of its declaration and ending at the closing brace of the innermost enclosing block. Within that block, the identifier bar designates the local variable, not the file-scoped one; the file-scoped bar is not directly accessible there.
I've read in SO about different namespaces in C where the type are defined, e.g. there is a namespace for Structs and Unions and a namespace for typedefs.
Is namespace the exact name for this? How many namespaces exist in C?
see 6.2.3
from http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf
6.2.3 Name spaces of identifiers
If more than one declaration of a particular identifier is visible at
any point in a translation unit, the syntactic context disambiguates uses
that refer to different entities.
Thus, there are separate name spaces for various categories of identifiers,
as follows:
— label names (disambiguated by the syntax of the label declaration and use);
— the tags of structures, unions, and enumerations (disambiguated by
following any32) of the keywords struct, union, or enum);
— the members of structures or unions; each structure or union has a
separate name space for its members (disambiguated by the type of the
expression used to access themember via the . or -> operator);
— all other identifiers, called ordinary identifiers (declared in ordinary
declarators or as enumeration constants).
I am not sure if "namespace" is the right word here, but I think I know what you mean.
You can do
union name1 { int i; char c; };
struct name2 { int i; char c; };
enum name3 { A, B, C };
typedef int name4;
int name5;
Here name1, name2 and name3 are in distinct "namespaces" (I'll keep that word for now), as they don't collide with each other.
This implies that using them requires to prefix their use with the respective keyword:
struct name1 var; // valid
name1 var; // invalid
On the other hand, name4 and name5 live in the global "namespace" and collide. So after having typedef int name4;, you cannot define a variable with that name name4.
BTW: The labels as well define their own namespace.
If I have enums like:
enum EnumA
{
stuffA = 0
};
enum enumAA
{
stuffA = 1
};
What happens here when you refer to stuffA? I thought you would refer to them like EnumA.stuffA and EnumB.stuffA as in Java, but that doesn't seem to be the case in C.
enums don't introduce new scope.
In your example, the second enum wouldn't compile due to the stuffA name clash.
To avoid name clashes, it is a common practice to give the elements of an enum a common prefix. Different prefixes would be used for different enums:
enum EnumA
{
EA_stuffA = 0
};
enum EnumAA
{
EAA_stuffA = 1
};
The enumeration constants are in the global name space (more precisely, the ordinary identifiers name space, contrasted with the labels, tags, and structure/union member namespaces), so you get a compilation error on the second stuffA.
You cannot use two different values for the same enumeration name (nor the same value specified twice) in a single translation unit.
As the others already said enumeration constants must be unique in the actual scope where they are defined. But with them as with other identifiers it is allowed to redefine them in another scope. Eg.
enum EnumA
{
stuffA = 0
};
void func(void) {
enum enumAA
{
stuffA = 1
};
// do something
}
would be fine. But such redefinitions in different scopes are often frowned upon and should be well documented, otherwise you will quickly loose yourself and others.
As mentioned, this won't compile because stuffA is defined twice. Enum values are simply referred to by the enumeration (that is "stuffA" rather than EnumA.stuffA). You can even use them on types that aren't enums (such as integers). Enums are sometimes used this way with ints, similar to the way one would #define constants.
This answer shows how the rules of C 2018 preclude the same identifier from being used as a member of two different enumerations. It is a language-lawyer view, intended to show how this requirement arises out of the language of the standard.
6.2.3, “Name spaces of identifiers,” tells us:
If more than one declaration of a particular identifier is visible at any point in a translation unit, the syntactic context disambiguates uses that refer to different entities. Thus, there are separate name spaces for various categories of identifiers, as follows:
…
— all other identifiers, called ordinary identifiers (declared in ordinary declarators or as enumeration constants).
Thus, all enumerator constants and ordinary declarators exist in one name space. (The name spaces omitted above are for labels [for goto statements]; tags of structures, unions, and enumerations [the name after a struct, as in struct foo]; and members of structures or unions [each has its own name space]).
6.7, "Declarations," tells us in paragraph 5 that:
A definition of an identifier is a declaration for that identifier that:
…
for an enumeration constant, is the (only) declaration of the identifier;
…
So the standard indicates that there is only one definition of an enumeration constant. Additionally, 6.2.1, “Scopes of identifiers,” tells us in paragraph 1:
An identifier can denote an object; a function; a tag or a member of a structure, union, or enumeration; a typedef name; a label name; a macro name; or a macro parameter. The same identifier can denote different entities at different points in the program. A member of an enumeration is called an enumeration constant.
Observe that this tells that if foo identifies an enumeration constant, it identifies a member of an enumeration—it is a particular member of a particular enumeration. It cannot identify both a member of enum A and a member of enum B. Therefore, if we had the code:
enum A { foo = 1 };
enum B { foo = 1 };
at the point where foo appears the second time, it is an identifier for foo in enum A, and therefore it cannot be a member of enum B.
(The sentence about an identifier denoting different entities at different points is introducing the concept of scope. Further paragraphs in that clause explain the concept of scope and the four kinds of scope: function, file, block, and function prototype. These do not affect the above analysis because the code above is within one scope.)
Depending on where you declare these enums, you could also declare new scopes using the namespace keyword.
NOTE: I wouldn't recommend doing this, I'm just noting that it's possible.
Instead, it would be better to use a prefix as noted in the other examples.
namespace EnumA
{
enum EnumA_e
{
stuffA = 0
};
};
namespace EnumAA
{
enum enumAA_e
{
stuffA = 1
};
};