From cppreference:
1) Label name space: all identifiers declared as labels.
2) Tag names: all identifiers declared as names of structs, unions and enumerated types.
3) Member names: all identifiers declared as members of any one struct or union. Every struct and union introduces its own name space of this kind.
4) All other identifiers, called ordinary identifiers to distinguish from (1-3) (function names, object names, typedef names, enumeration constants).
This allows for code like this (among other things):
struct Point { int x, y; };
struct Point Point;
This code seems somewhat unclear to me as Point can refer to both a type and an instance of a struct. What was the motivation behind having separate name spaces for tags and other identifiers?
The actual question posed is
What was the motivation behind having separate name spaces for tags and other identifiers?
This can be answered only by reference to the standard committee's rationale document, which in fact does address the matter, however briefly:
Pre-C89 implementations varied considerably in the number of separate name spaces maintained. The position adopted in the Standard is to permit as many separate name spaces as can be distinguished by context, except that all tags (struct, union, and enum) comprise a single name space.
(C99 rationale document,* section 6.2.3)
Thus, it is explicitly intentional that code such as
struct point { int point; } point = { .point = 0 };
goto point;
point:
return point.point;
is permitted. My interpretation of the rationale is that the intention was to be unrestrictive, though it remains unclear why the different kinds of tags were not given separate namespaces. This could not have been accidental, so one or more parties represented on the committee must have opposed separate tag namespaces, and they managed to prevail. Such opposition could very well have been for business instead of technical reasons.
*As far as I am aware, there is no rationale document for the C2011 standard. At least, not yet.
Related
I know how namespaces work in C++, but I'm a little confused about how they work in C. So, I did a bit of research about name spaces in C.
First, the respective section in ISO/IEC 9899:2018 (C18), section 6.2.3:
6.2.3 Name spaces of identifiers
1 If more than one declaration of a particular identifier is visible at any point in a translation unit, the syntactic context disambiguates uses that refer to different entities. Thus, there are separate name spaces for various categories of identifiers, as follows:
— label names (disambiguated by the syntax of the label declaration and use);
— the tags of structures, unions, and enumerations (disambiguated by following any(32) of the keywords struct, union, or enum);
— the members of structures or unions; each structure or union has a separate name space for its members (disambiguated by the type of the expression used to access the member via the . or -> operator);
— all other identifiers, called ordinary identifiers (declared in ordinary declarators or as enumeration constants).
32) There is only one name space for tags even though three are possible.
So this gives me a bit more understanding of the term in C and seems to generally have the same kind of purpose as in C++. But unfortunately, there is nothing further said in the standard about how name spaces work in C.
Apparently, it has something to do with distinguishing between entities that share the same identifier and, as opposed to C++, where we declare namespaces like:
namespace ctrl1
{
int max = 245;
}
and using namespaces, like:
using namespace ctrl1;
or
int a = ctrl1::max;
in C, the compiler is able to disambiguate a certain use of one object automatically if the respective identifier is used. Correct me if I'm wrong.
How does that work? How does the compiler know whether it should use one entity instead of the other in C?
I have read Name spaces in c++ and c, but that question is more focused on C++ and on the handling of a specific example.
I also read Name spaces in C, where again the question is more focused on a specific example, here the enum type.
My Question is:
How do name spaces work in C?
in C, the compiler is able to disambiguate a certain use of one object automatically if the respective identifier is used. Correct me if I'm wrong.
How does that work? How does the compiler know whether it should use one entity instead of the other in C?
The excerpt from the standard already addresses this (emphasis added):
— label names (disambiguated by the syntax of the label declaration and use);
A label declaration has the form of an identifier followed by a colon, which must be followed by a statement:
a_label:
do_something;
The only use of labels is in goto statements, and the identifier in a goto statement can be only a label:
goto a_label;
— the tags of structures, unions, and enumerations (disambiguated by following any(32) of the keywords struct, union, or enum);
"Following any of the keywords struct, union, or enum" means exactly what it says:
struct a_tag
union another_tag
enum a_third_tag
Those forms can appear in type definitions, type declarations, and type uses. If one of the keywords struct, union, or enum immediately precedes an identifier, then that identifier is a tag; otherwise it isn't.
— the members of structures or unions; each structure or union has a separate name space for its members (disambiguated by the type of the expression used to access the member via the . or -> operator);
The appearance of an identifier as the right-hand operand of a . or -> operator distinguishes it as the identifier of a structure or union member. The type of the left-hand operand determines which structure or union type the member belongs to. C structure and union types cannot have static members, so there is never any need to access a structure or union member relative to the type itself, absent an object of that type.
— all other identifiers, called ordinary identifiers
Anything not covered by one of the other three cases is covered by this one. That includes variable names, function names, function parameter names, typedef names, and enumeration constants. (Built-in type names such as int are keywords, not identifiers, so they do not belong to any name space. I think that's a complete list, but I may have overlooked something.)
How do name spaces work in C?
The only other thing I can think of to clarify is that unlike in C++, C has only implicit declaration and use of namespaces. There is no namespace keyword in C, and no syntax for explicitly referring to an identifier relative to a chosen namespace. User-defined namespaces being limited to those associated with structure and union types, the simple, implicit approach satisfactorily covers all possible cases.
Given that the category of ordinary identifiers is very broad, however, it has become conventional for authors of reusable C libraries to minimize the likelihood of name collisions by prefixing the external identifiers exposed by their libraries with characteristic short prefixes. This ad hoc namespacing is quite outside the scope of the standard, but very common.
The C11 standard defines struct compatibility as follows (6.2.7):
Moreover, two structure, union, or enumerated types declared in separate translation units are compatible if their tags and members satisfy the following requirements: If one is declared with a tag, the other shall be declared with the same tag. If both are completed anywhere within their respective translation units, then the following additional requirements apply: there shall be a one-to-one correspondence between their members such that each pair of corresponding members are declared with compatible types…
That means I can have 2 files like this:
foo.c:
struct struc {
int x;
};
int foo(struct struc *s)
{
return s->x;
}
main.c:
struct struc {
float x;
};
int foo(struct struc *s);
int main(void)
{
return foo(&(struct struc){1.2f});
}
Smells like undefined behavior (as it is for types like int and float). But if I am understanding the standard correctly (maybe I am misinterpreting the second sentence), this is allowed. If so, what is the rationale behind this? Why not also specify that structs in separate translation units must also be structurally equivalent?
Smells like undefined behavior
Because it is.
But if I am understanding the standard correctly
This doesn't seem to be the case in this particular instance.
this is allowed.
Nope. I do not see (and you do not explain) how the standard language could be interpreted this way.
The standard says
If both are completed anywhere within their respective translation units
This condition holds in your example.
then the following additional requirements apply: there shall be a one-to-one correspondence between their members such that each pair of corresponding members are declared with compatible types
This requirement is not satisfied, so the types are not compatible.
Why not also specify that structs in separate translation units must also be structurally equivalent?
The standard specifies exactly that. "[o]ne-to-one correspondence between their members such that each pair of corresponding members are declared with compatible types" is precisely the definition of structural equivalence.
I've seen structs declared two different ways.
typedef struct _myStruct {
...
} myStruct;
and
typedef struct myStruct {
...
} myStruct;
Is there a reason for the leading underscore or is this just a stylistic thing? If there is not a difference, is one of these preferred over the other?
The former was used long ago, when some compiler(s) didn't allow the tag and the typedef to use the same identifier. The latter is currently preferred, and in fact, identifiers that start with an underscore are discouraged.
There are reasons not to use the leading underscore, notably that names starting with an underscore are basically reserved for use by the implementation. The details are a little more nuanced than that, but it is easier to remember.
ISO/IEC 9899:2011
7.1.3 Reserved identifiers
¶1 Each header declares or defines all identifiers listed in its associated subclause, and optionally declares or defines identifiers listed in its associated future library directions subclause and identifiers which are always reserved either for any use or for use as file scope identifiers.
All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use.
All identifiers that begin with an underscore are always reserved for use as identifiers with file scope in both the ordinary and tag name spaces.
…
Consequently, using the leading underscore is treading on thin ice. Usually, you'll get away with it. However, sometimes you won't, and when you don't, you have no recourse because you've been treading outside the limits of the namespace that the standard allows you to use.
If the structure tag and the type name are the same, you don't have to guess which structure tag goes with which type name (alias).
Note that the Linux kernel coding standards reject typedefs for structures. You'll have to decide whether you want to follow that rule. Many systems do not follow it.
One other minor issue is that C++ performs the equivalent of typedef struct MyStruct MyStruct; automatically — after defining a class or struct (or union) with a tag name, you can use the tag name as a type name. It isn't identical — you can do the typedef yourself and it compiles cleanly.
Completely stylistic. Just visually differentiates the "synthetic type" from the declared variable of that type.
I tend to do:
typedef struct {
...
} myStruct;
I see occasional questions such as "what's the difference between a declaration and a definition":
What is the difference between a definition and a declaration?
The distinction is important and intellectually it achieves two important things:
It brings to the fore the difference between reference and referent
It's how C enables separation in time of the attachment between reference and referent.
So why is a C typedef declaration not called a typedef definition?
Firstly, it's obviously a definition. It defines an alias. The new name is to be taken as referring to the existing thing. But it certainly binds the reference to a specific referent and is without doubt a defining statement.
Secondly, wouldn't it be called a typedec if it were a declaration?
Thirdly, wouldn't it avoid all those confusing questions people ask when they try and make a forward declaration using a typedef?
A typedef declaration is a definition.
N1570 6.7p5:
A declaration specifies the interpretation and attributes of a set of identifiers. A definition of an identifier is a declaration for that identifier that:
for an object, causes storage to be reserved for that object;
for a function, includes the function body;
for an enumeration constant, is the (only) declaration of the identifier;
for a typedef name, is the first (or only) declaration of the identifier.
In C99, the last two bullet points were combined; C11 introduced the ability to declare the same typedef twice.
Note that only objects, functions, enumeration constants, and typedef names can have definitions. One might argue that given:
enum foo { zero, one};
it doesn't make much sense to consider this to be a definition of zero and one, but not of foo or enum foo. On the other hand, an enum, struct, or union declaration, though it creates a type that didn't previously exist, doesn't define an identifier that is that type's name -- and for structs and unions, the tag name can be used (as an incomplete type) even before the type has been defined. Definitions define identifiers, not (necessarily) the entities to which they refer.
As for why it's not called a "definition" in the subsection that defines it, it's part of section 6.7 "Declarations", which covers all kinds of declarations (some of which are also definitions). The term definition is defined in the introductory part of 6.7.
As for the name typedef, it's caused a fair amount of confusion over the years since it doesn't really define a type. Perhaps typename would have been a better choice, or even typealias. But since it does define the identifier, typedef isn't entirely misleading.
Can we say that identifiers are aliases of variables?
Are identifiers and variables the same?
To say it another way, identifiers are the names given to things (such as variables and functions). They identify the thing which they are naming.
No.
int f() { }
f is an identifier. It is not a variable.
Identifier is the fancy term used to mean ‘name’. In C, identifiers are used to refer to a number of things: we've already seen them used to name variables and functions. They are also used to give names to some things we haven't seen yet, amongst which are labels and the ‘tags’ of structures, unions, and enums.
An identifier is used for any variable, function, data definition, etc. In the C programming language, an identifier is a combination of alphanumeric characters, the first being a letter of the alphabet or an underscore, and the rest being any letter of the alphabet, any digit, or the underscore. A variable is only one of the things an identifier can name.
Please see C Tutorial - Chapter 1.
No, from C99 (6.2.1):
An identifier can denote an object; a function; a tag or a member of a structure, union, or enumeration; a typedef name; a label name; a macro name; or a macro parameter.