While answering "warning: assignment from incompatible pointer type for linklist array", I noticed that any undeclared identifier preceded by the struct keyword is treated as a forward-declared identifier.
For instance, the program below compiles cleanly:
/* Compile with "gcc -std=c99 -W -Wall -O2 -pedantic %" */
#include <stdio.h>
struct foo
{
struct bar *next; /* Linked list */
};
int main(void) {
struct bar *a = 0;
struct baz *b = 0;
struct foo c = {0};
printf("bar -> %p\n", (void *)a);
printf("baz -> %p\n", (void *)b);
printf("foo -> %p, %zu\n", (void *)&c, sizeof c); /* Remove %zu if compiling with -ansi flag */
return 0;
}
My question: Which rule guides a C compiler to treat undeclared struct identifiers as forward declared incomplete struct types?
The Standard says (6.2.5.28)
All pointers to structure types shall have the same representation and alignment requirements as each other.
This means the compiler knows how to represent the pointers to any structure, even those that are (yet) undefined.
Your program deals only with pointers to such structures, so it's ok.
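A minimal sketch of that point, assuming nothing beyond the tags from the question: struct bar and struct baz are never defined, yet the size of a pointer to either one is known to the compiler and must be the same. The function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct bar;   /* declared, never defined in this file: incomplete type */
struct baz;   /* likewise incomplete */

/* sizeof works on the pointer types; sizeof(struct bar) itself would not compile. */
size_t bar_pointer_size(void) { return sizeof(struct bar *); }
size_t baz_pointer_size(void) { return sizeof(struct baz *); }

/* Same representation for all struct pointers implies the same size. */
void check_struct_pointer_sizes(void)
{
    assert(bar_pointer_size() == baz_pointer_size());
}
```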
It is described in 6.2.5 Types and 6.7.2.3 Tags.
struct identifier is an object type.
6.2.5 Types
The meaning of a value stored in an object or returned by a function is determined by the
type of the expression used to access it. (An identifier declared to be an object is the
simplest such expression; the type is specified in the declaration of the identifier.) Types
are partitioned into object types (types that describe objects) and function types (types
that describe functions). At various points within a translation unit an object type may be
incomplete (lacking sufficient information to determine the size of objects of that type) or
complete (having sufficient information). 37)
37) A type may be incomplete or complete throughout an entire translation unit, or it may change states at
different points within a translation unit.
An array type of unknown size is an incomplete type. It is completed, for an identifier of
that type, by specifying the size in a later declaration (with internal or external linkage).
A structure or union type of unknown content (as described in 6.7.2.3) is an incomplete type. It is completed, for all declarations of that type, by declaring the same structure or
union tag with its defining content later in the same scope.
6.7.2.3 Tags
All declarations of structure, union, or enumerated types that have the same scope and
use the same tag declare the same type. Irrespective of whether there is a tag or what
other declarations of the type are in the same translation unit, the type is incomplete 129)
until immediately after the closing brace of the list defining the content, and complete
thereafter.
129) An incomplete type may only by used when the size of an object of that type is not needed. It is not
needed, for example, when a typedef name is declared to be a specifier for a structure or union, or
when a pointer to or a function returning a structure or union is being declared. (See incomplete types
in 6.2.5.) The specification has to be complete before such a function is called or defined.
In addition to the answer provided by 2501, and regarding your comment to it that "In my case, there is not even the forward declaration", consider the following.
Any use of a struct tag counts as a (forward) declaration of the structure type, if it had not been declared before. Although a more formal way would be to say that this simply counts as a type, since the C standard does not mention "forward declarations of structure types", just complete and incomplete structure types (6.2.5p22).
6.7.2 Type specifiers tells us that a struct-or-union-specifier is a type-specifier, and 6.7.2.1 Structure and union specifiers paragraph 1 tells us that, in turn, struct identifier is a struct-or-union-specifier.
Suppose you have a linked list declaration, something like
struct node {
struct node *next;
int element;
};
then the "implicit forward declaration" of this incomplete type is essential for this structure to work. After all, the type struct node is only complete immediately after the closing brace of its member list (6.7.2.3), but you need to refer to it before that point in order to declare the next pointer.
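As a rough sketch of how such a node is used, here is the same struct node with two helper functions. list_push and list_sum are hypothetical names, not part of the question, and the nodes are supplied by the caller so no allocation is involved.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    struct node *next;   /* struct node is still incomplete here; a pointer to it is fine */
    int element;
};

/* Links n in front of head and returns the new head. */
struct node *list_push(struct node *head, struct node *n)
{
    assert(n != NULL);
    n->next = head;
    return n;
}

/* Adds up the elements by following next until the end of the list. */
int list_sum(const struct node *head)
{
    int sum = 0;
    for (; head != NULL; head = head->next)
        sum += head->element;
    return sum;
}
```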
Also, a struct node declaration (of incomplete type) can go out of scope, just like any other declaration. This happens for instance if you have some prototype
int function(struct unknown *parameter);
where the struct unknown goes out of scope immediately at the end of the declaration. Any further declared struct unknowns are then not the same as this one. That is implied in the text of 6.2.5p22:
A structure or union type of unknown content (as described in 6.7.2.3)
is an incomplete type. It is completed, for all declarations of that
type, by declaring the same structure or union tag with its defining
content later in the same scope.
That is why gcc warns about this:
foo.c:1:21: warning: 'struct unknown' declared inside parameter list
foo.c:1:21: warning: its scope is only this definition or declaration, which is probably not what you want
You can fix this by putting an extra forward declaration before it, which makes the scope start earlier (and therefore end later):
struct unknown;
int function(struct unknown *parameter);
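Putting the pieces together, a small sketch of the whole sequence might look like this. The value member is an assumption added only so the function has something to read.

```c
#include <assert.h>
#include <stddef.h>

struct unknown;                            /* file-scope declaration; the type is incomplete */
int function(struct unknown *parameter);   /* refers to the file-scope tag, so no warning */

struct unknown {                           /* completes the same type, later in the same scope */
    int value;                             /* hypothetical member */
};

int function(struct unknown *parameter)
{
    assert(parameter != NULL);
    return parameter->value;
}
```

Because the prototype, the definition, and any caller all see the file-scope tag, a pointer to a struct unknown object can be passed without a type mismatch.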
I think the most elegant use case for incomplete struct types is something like this:
struct foo
{
struct bar *left;
struct bar *right;
};
struct bar
{
int something;
struct foo *next;
};
That is, mutual recursion, where struct foo points to struct bar and struct bar points back to struct foo.
Such cases might be a reason why this feature was included in original C language specification.
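To make that concrete, here is a sketch using the two structures above with one helper. foo_sum is a hypothetical function added for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct foo {
    struct bar *left;    /* struct bar is introduced here as an incomplete type */
    struct bar *right;
};

struct bar {             /* this definition completes struct bar */
    int something;
    struct foo *next;
};

/* Adds the values of both children; a missing child counts as 0. */
int foo_sum(const struct foo *f)
{
    assert(f != NULL);
    int sum = 0;
    if (f->left != NULL)
        sum += f->left->something;
    if (f->right != NULL)
        sum += f->right->something;
    return sum;
}
```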
The original question is whether all struct identifiers are automatically forward declared. I think that it would be better to say that all incomplete struct definitions are automatically considered forward declarations.
Edit: Following the comment about documentation, let's look at the C language bible: Kernighan & Ritchie, The C Programming Language, section "6.5 Self-referential Structures", which says:
Occasionally, one needs a variation of self-referential structures:
two structures that refer to each other. The way to handle this is:
struct t {
...
struct s *p; /* p points to an s */
};
struct s {
...
struct t *q; /* q points to a t */
};
I agree that it could be implemented another way, but I take this as good motivation from the authors of the C language, and I agree with them that it is an elegant way to implement this.
I'm trying to pass a struct by reference but no matter what I do I'm running into errors. I think I have the prototyping and declaration and the pointers all screwed up.
This is for an Arduino project of mine. The code works fine on the Arduino compiler but doesn't compile on Pelles C compiler.
#include <stdio.h>
#include <string.h>
#include <stdint.h>
void Fault_Bits_To_Flags(uint8_t Master_Fault_Byte, struct Fault_Flag_Struct *Fault_Flag);
struct Fault_Flag_Struct {
char Fault_Name[30];
uint8_t Fault_State;
};
struct Fault_Flag_Struct Fault_Flag [7];
int main(void) {
uint8_t Master_Fault_Byte = 181;
strcpy(Fault_Flag[0].Fault_Name, "fault 0");
Fault_Flag[0].Fault_State = 1;
strcpy(Fault_Flag[1]....
strcpy(Fault_Flag[2]....
strcpy(Fault_Flag[3]....
strcpy(Fault_Flag[4]....
Fault_Bits_To_Flags( Master_Fault_Byte, *Fault_Flag);
return 0;
}
//Puts 8 bits from single byte into 8 separate bytes (flags)//
void Fault_Bits_To_Flags(uint8_t Master_Fault_Byte, struct Fault_Flag_Struct *Fault_Flag) {
for ( int i = 0; i < 8; i++ )
{
Fault_Flag[i].Fault_State = (Master_Fault_Byte >> i) & 1;
}
}
error #2140: Type error in argument 2 to 'Fault_Bits_To_Flags';
expected '(incomplete) struct Fault_Flag_Struct *' but found 'struct
Fault_Flag_Struct'.
error #2120: Redeclaration of 'Fault_Bits_To_Flags', previously
declared at Reference.c(4); expected 'void function(unsigned char,
(incomplete) struct Fault_Flag_Struct *)' but found 'void
function(unsigned char, struct Fault_Flag_Struct )'.
Error code: 1 *
The scope of a parameter declaration in the list of parameters of a function prototype terminates at the end of the function declarator. Unlike C++, C has no notion of an elaborated type specifier.
From the C Standard (6.2.1 Scopes of identifiers)
... If the declarator or type specifier that declares the identifier appears
within the list of parameter declarations in a
function prototype (not part of a function definition), the identifier
has function prototype scope, which terminates at the end of the
function declarator
So the type specifier struct Fault_Flag_Struct used in the function prototype
void Fault_Bits_To_Flags(uint8_t Master_Fault_Byte, struct
Fault_Flag_Struct *Fault_Flag);
denotes a different entity compared with the declaration that follows the function prototype
struct Fault_Flag_Struct {
char Fault_Name[30];
uint8_t Fault_State;
};
So you have to swap the order of the declarations: place the structure definition before the function prototype.
Also this call
Fault_Bits_To_Flags( Master_Fault_Byte, *Fault_Flag);
is invalid because the type of the expression *Fault_Flag is struct Fault_Flag_Struct while the function expects the type struct Fault_Flag_Struct *. That is instead of a pointer to an object of the type struct Fault_Flag_Struct you are passing the object itself.
Scope and Structure Types
There are two problems in the code shown.
The first is because C has some odd rules about structure definitions. One rule, in C 2018 6.7.2.3 4, is that structure declarations with the same tag (the name after struct) declare the same type (a structure type with that name) only if they have the same scope:
All declarations of structure, union, or enumerated types that have the same scope and use the same tag declare the same type.…
When you declare a structure inside a function declaration, like this:
void foo(struct X *p);
Then the scope of X is function prototype scope. Per 6.2.1 4, this scope ends at the end of the function declaration. Then, when you later define the structure, as with:
struct X { int q; };
it is in a different scope, and, per the rule above, the struct X in the function declaration is not the same type as the struct X in the later definition. One way to fix this is to move the structure definition prior to the function declaration. It also suffices merely to declare the structure tag prior to the function declaration, as with:
struct X;
void foo(struct X *p);
To fully understand what is happening here, we should consider two other issues. One issue is that we could have struct X in two different translation units (different source files compiled separately), and calling a function defined with a struct X * parameter in one unit from another unit that defines struct X is allowed. This is because, although the two struct X types in the two translation units are different, they are compatible. 6.2.7 1 says:
… Moreover, two structure, union, or enumerated types declared in separate translation units are compatible if…
Oddly, this rule only applies to structures declared in separate translation units. If we defined void foo(struct X *p) { … } prior to defining struct X in one translation unit, they are different and incompatible types, but, if we define them in separate units, they are compatible types!
The second issue is how can this code work when the structure declarations have separate scopes:
struct X;
void foo(struct X *p);
The first struct X has file scope (per 6.2.1 4), and the second struct X has function prototype scope. The rule in 6.7.2.3 4 only applies if the declarations have the same scope, so it does not say these declare the same struct X. Instead, there is another rule, in 6.7.2.3 9:
If a type specifier of the form struct-or-union identifier or enum identifier occurs other than as part of one of the above forms, and a declaration of the identifier as a tag is visible, then it specifies the same type as that other declaration, and does not redeclare the tag.
(The “above forms” are definitions or stand-alone declarations.) This causes the struct X in the function declaration after a prior file-scope struct X to specify the same type.
Error in Argument
The second error is in the second argument passed to the function in this statement:
Fault_Bits_To_Flags( Master_Fault_Byte, *Fault_Flag);
Fault_Flag is an array, so *Fault_Flag is the first element of the array. This is a structure, not a pointer. To pass a pointer to the first element of the array, use:
Fault_Bits_To_Flags( Master_Fault_Byte, Fault_Flag);
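With both fixes applied, the relevant parts of the program could look like the sketch below. Note one assumption that goes beyond the answers: the array is sized to 8 (FAULT_COUNT is a made-up name) because the loop writes eight elements, whereas the question declared only 7.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAULT_COUNT 8   /* assumption: one flag per bit of the byte */

/* Defined before the prototype, so both refer to the same file-scope type. */
struct Fault_Flag_Struct {
    char Fault_Name[30];
    uint8_t Fault_State;
};

void Fault_Bits_To_Flags(uint8_t Master_Fault_Byte, struct Fault_Flag_Struct *Fault_Flag);

struct Fault_Flag_Struct Fault_Flag[FAULT_COUNT];

/* Copies each bit of the byte into the Fault_State of the matching array element. */
void Fault_Bits_To_Flags(uint8_t Master_Fault_Byte, struct Fault_Flag_Struct *Fault_Flag)
{
    assert(Fault_Flag != NULL);
    for (int i = 0; i < FAULT_COUNT; i++) {
        Fault_Flag[i].Fault_State = (Master_Fault_Byte >> i) & 1;
    }
}
```

The call then passes the array name, which decays to a pointer to its first element: Fault_Bits_To_Flags(Master_Fault_Byte, Fault_Flag);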
The C standard states (§6.2.5 p22):
An array type of unknown size is an incomplete type. It is completed,
for an identifier of that type, by specifying the size in a later
declaration (with internal or external linkage).
And it works fine as far as variable declarations are concerned:
int a[];
int a[2]; //OK
But when we add typedef before those declarations the compiler complains (I also changed the name):
typedef int t[];
typedef int t[2]; //redefinition with different type
It doesn't complain however when we are completing a typedef to incomplete structure instead:
typedef struct t t1;
typedef struct t { int m; } t1; //OK
Possible use case of an incomplete typedef of array could be something like this:
int main(int n, char **pp)
{
typedef int t1[][200];
typedef struct t { t1 *m; int m1; } t0;
typedef int t1[sizeof (t0)][200];
}
In the above example I would like to declare a pointer to array inside a structure with number of elements equal to the structure size. Yes I could use a structure instead of array but why should I when the above option is potentially available?
typedef int t[2]; is not allowed because of the constraint 6.7/3:
If an identifier has no linkage, there shall be no more than one declaration of the identifier (in a declarator or type specifier) with the same scope and in the same name space, except that:
a typedef name may be redefined to denote the same type as it currently does, provided that type is not a variably modified type;
However int[] and int[2] are not the same type, so this "except" doesn't apply, and so the code violates the constraint.
Regarding your first quote: Although 6.2.5/22 says that an incomplete type can be completed, it doesn't follow that any attempted completion is automatically legal. The attempted completion must also comply with all the other rules of the language, and in this case it does not comply with 6.7/3.
The int a[]; int a[2]; example is OK (under 6.7/3) because a has linkage; and in the typedef struct t t1; , struct t is still the same type before and after its completion.
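A short sketch of the cases that are allowed, assuming a C11 (or later) compiler, since earlier standards did not permit repeating a typedef at all. The names pair, t1_member, and pair_length are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct t t1;               /* t1 names struct t, which is incomplete so far */
typedef struct t { int m; } t1;    /* still struct t, so the redefinition is allowed */

typedef int pair[2];
typedef int pair[2];               /* identical type: allowed */
/* typedef int pair[3];               would violate 6.7/3: a different type */

/* Reads the member through the typedef name. */
int t1_member(const t1 *p)
{
    assert(p != NULL);
    return p->m;
}

/* Number of elements in the array type named by pair. */
size_t pair_length(void) { return sizeof(pair) / sizeof(int); }
```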
From 6.2.5p1, we can see a definition of the terms complete and incomplete:
At various points within a translation unit an object type may be incomplete (lacking sufficient information to determine the size of objects of that type) or complete (having sufficient information).
Thus, when we talk about a type being incomplete, we're really talking about the size of objects of that type being indeterminate. We can't talk about "incomplete types" without declaring an object of that type.
In your first example, the size of a is determinate as you've completed the definition for the object using the second declaration.
In your second example, no declaration for an object is made. Once the declaration is made, e.g. t x = { 1, 2 };, it becomes clear that the type isn't incomplete.
In your third example, you're not actually completing the type alias; you're completing the struct definition. You might as well have written:
typedef struct t t1;
struct t { int m; };
We can see further support for the redefinition of struct tags, and the exclusion of the VLA redefinition in 6.7p3:
If an identifier has no linkage, there shall be no more than one declaration of the identifier (in a declarator or type specifier) with the same scope and in the same name space, except that:
a typedef name may be redefined to denote the same type as it currently does, provided that type is not a variably modified type;
tags may be redeclared as specified in 6.7.2.3.
I was working with UEFI driver-related code, and I came across this:
/* EFI headers define EFI_HANDLE as a void pointer, which renders type
* checking somewhat useless. Work around this bizarre sabotage
* attempt by redefining EFI_HANDLE as a pointer to an anonymous
* structure.
*/
#define EFI_HANDLE STUPID_EFI_HANDLE
#include <ipxe/efi/Uefi/UefiBaseType.h>
#undef EFI_HANDLE
typedef struct {} *EFI_HANDLE;
The full source code is in this path
http://dox.ipxe.org/include_2ipxe_2efi_2efi_8h_source.html
This is my first encounter with an anonymous structure, and I couldn't make out the logic of redefining a void * as a pointer to an anonymous structure. What kind of hack does the "bizarre sabotage attempt" hint at?
The library is using information hiding on the internal data object behind the address held in an EFI_HANDLE. But in doing so, they're making the code more susceptible to accidental bugs.
In C, a void* is implicitly converted to any other object pointer type without a cast and without a warning (this is by language design).
Using a non-void pointer type ensures an EFI_HANDLE is only used where EFI_HANDLE belongs. The compiler's type-checking kicks you in the groin when you pass it somewhere else that isn't EFI_HANDLE , but rather a pointer to something else.
Ex: As void*, this will compile without warning or error
#include <string.h>
#define EFI_HANDLE void*
int main()
{
EFI_HANDLE handle = NULL;
strcpy(handle, "Something");
}
Changing the alias to:
typedef struct {} *EFI_HANDLE;
will reap the ensuing "incompatible pointer type" compile-time error.
Finally, as an anonymous struct, there is no pointless structure tag name adding to the already-polluted name space that you can use (accidentally or nefariously).
That isn't an anonymous structure, but a struct without a tag.
An anonymous structure can only exist as a member of another struct,
and it must also not have a tag [1].
Defining a struct without any members is not allowed. The code you're looking at is using a compiler extension that permits this.
The library is doing this to hide the definition of the structure from the user, while maintaining type safety.
However there is a much better way to do this. If you have a hidden structure definition, you can still define an opaque pointer to it, that has a type, so it is type safe:
struct hidden //defined in a file and not exposed
{
int a;
};
void Hidden( struct hidden* );
void Other( struct other* );
struct hidden* a = NULL; //doesn't see the definition of struct hidden
Hidden( a ); //it may be used
Other( a ); //compiler error
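For readers who want something compilable, here is a single-file sketch of the same idea. HIDDEN_HANDLE, hidden_open, hidden_value, and handle_is_valid are hypothetical names, and in a real project the lower half would sit in a private .c file.

```c
#include <assert.h>
#include <stddef.h>

/* Public part: users see only the tag and a pointer type. */
struct hidden;                           /* incomplete: members are not visible */
typedef struct hidden *HIDDEN_HANDLE;    /* hypothetical handle name */

/* Accepts only HIDDEN_HANDLE; passing, say, a char * draws an
   incompatible-pointer-type diagnostic, unlike a void * handle. */
int handle_is_valid(HIDDEN_HANDLE h) { return h != NULL; }

/* Private part: normally kept in one .c file and not exposed. */
struct hidden { int a; };

static struct hidden the_object = { 42 };

HIDDEN_HANDLE hidden_open(void) { return &the_object; }

int hidden_value(HIDDEN_HANDLE h)
{
    assert(h != NULL);
    return h->a;
}
```

This keeps the type checking that the empty-struct trick was after, without relying on a compiler extension.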
[1] (Quoted from: ISO/IEC 9899:201x 6.7.2.1 Structure and union specifiers 13)
An unnamed member whose type specifier is a structure specifier with no tag is called an
anonymous structure; an unnamed member whose type specifier is a union specifier with
no tag is called an anonymous union. The members of an anonymous structure or union
are considered to be members of the containing structure or union. This applies
recursively if the containing structure or union is also anonymous
As the title says, I have this code:
typedef struct Book{
int id;
char title[256];
char summary[2048];
int numberOfAuthors;
struct Author *authors;
};
typedef struct Author{
char firstName[56];
char lastName[56];
};
typedef struct Books{
struct Book *arr;
int numberOfBooks;
};
I get these errors from gcc :
bookstore.c:8:2: error: unknown type name ‘Author’
bookstore.c:9:1: warning: useless storage class specifier in empty declaration [enabled by default]
bookstore.c:15:1: warning: useless storage class specifier in empty declaration [enabled by default]
bookstore.c:21:2: error: unknown type name ‘Book’
bookstore.c:23:1: warning: useless storage class specifier in empty declaration [enabled by default]
No warnings and no errors occur if I change the typedefs like this:
typedef struct{
char firstName[56];
char lastName[56];
} Author;
Having searched through C Programming Language, 2nd Edition and googled for a couple of hours, I can't figure out why the first implementation won't work.
There are several things going on here. First, as others have said, the compiler's complaint about unknown type may be because you need to declare the types before using them. More important though is to understand the syntax of 3 things:
definition of struct type,
definition and declaration of struct variable, and
typedef
(Note that in the C-programming language, definition and declaration usually happen at the same time, and thus are essentially the same. This is not the case in many other languages. See footnote below for further details.)
When defining a struct, the struct can be tagged (named), or untagged (if untagged, then the struct must be used immediately (will explain what this means further below)).
struct Name {
...
};
This defines a type called "struct Name" which then can be used to define a struct variable/instance:
struct Name myNameStruct;
This defines a variable called myNameStruct which is a struct of type struct Name.
You can also define a struct, and declare/define a struct variable at the same time:
struct Name {
...
} myNameStruct;
As before, this defines a variable called myNameStruct which is an instance of type struct Name ... But it does it at the same time it defines the type struct Name.
The type can then be used again to declare and define another variable:
struct Name myOtherNameStruct;
Now typedef is just a way to alias a type with a specific name:
typedef OldTypeName NewTypeName;
Given the above typedef, any time you use NewTypeName it is the same as using OldTypeName. In the C programming language this is particularly useful with structs, because it gives you the ability to leave off the word "struct" when declaring and defining variables of that type and to treat the struct's name simply as a type on its own (as we do in C++). Here is an example that first defines the struct, and then typedefs the struct:
struct Name {
...
};
typedef struct Name Name_t;
In the above OldTypeName is struct Name and NewTypeName is Name_t. So now, to define a variable of type struct Name, instead of writing:
struct Name myNameStruct;
I can simply write:
Name_t myNameStruct;
NOTE ALSO, the typedef CAN BE COMBINED with the struct definition, and this is what you are doing in your code:
typedef struct {
...
} Name_t;
This can also be done while tagging (naming) the struct. This is useful for self-referential structs (for example linked-list nodes), but is otherwise superfluous. None-the-less, many follow the practice of always tagging structs, as in this example:
typedef struct Name {
...
} Name_t;
NOTE WELL: In the syntax above, since you have started with "typedef" then the whole statement is a typedef statement, in which the OldTypeName happens to be a struct definition. Therefore the compiler interprets the name coming after the right curly brace } as the NewTypeName ... it is NOT the variable name (as it would be in the syntax without typedef, in which case you would be defining the struct and declaring/defining a struct variable at the same time).
Furthermore, if you state typedef, but leave off the Name_t at the end, then you have effectively created an INCOMPLETE typedef statement, because the compiler considers everything within "struct Name { ... }" as OldTypeName, and you are not providing a NewTypeName for the typedef. This is why the compiler is not happy with the code as you have written it (although the compiler's messages are rather cryptic because it's not quite sure what you did wrong).
Now, as I noted above, if you do not tag (name) the struct type at the time you define it, then you must use it immediately, either to define a variable:
struct {
...
} myNameStruct; // defines myNameStruct as a variable with this struct
// definition, but the struct definition cannot be re-used.
Or you can use an untagged struct type inside a typedef:
typedef struct {
...
} Name_t;
This final syntax is what you actually did when you wrote:
typedef struct{
char firstName[56];
char lastName[56];
} Author;
And the compiler was happy. HTH.
Regarding the comment/question about the _t suffix:
_t suffix is a convention, to indicate to people reading the code that the symbolic name with the _t is a Type name (as opposed to a variable name). The compiler does not parse, nor is it aware of, the _t.
The C89, and particularly the C99, standard libraries defined many types AND CHOSE TO USE the _t for the names of those types. For example, the C89 standard defines wchar_t, size_t, and ptrdiff_t (off_t, often listed alongside them, comes from POSIX rather than ISO C). The C99 standard defines a lot of extra types, such as uintptr_t, intmax_t, int8_t, uint_least16_t, uint_fast32_t, etc. But _t is not reserved by the C language, nor specially parsed, nor noticed by the compiler; it is merely a convention that is good to follow when you are defining new types (via typedef) in C. (POSIX, however, does reserve names ending in _t for its own headers.) In C++ many people use the convention of starting type names with an uppercase letter, for example, MyNewType (as opposed to the C convention my_new_type_t). HTH
Footnote about the differences between declaring and defining: First a special thanks to #CJM for suggesting clarifying edits, particularly in relation to the use of these terms.
The following items are typically declared and defined: types, variables, and functions.
Declaring gives the compiler only a symbolic name and a "type" for that symbolic name.
For example, declaring a variable tells the compiler the name of that variable, and its type.
Defining gives the compiler the full details of an item:
In the case of a type, defining gives the compiler both a name, and the detailed structure for that type.
In the case of a variable, defining tells the compiler to allocate memory (where and how much) to create an instance of that variable.
Generally speaking, in a program made up of multiple files, the variables, types and functions may be declared in many files, but each may have only one definition.
In many programming languages (for example C++) declaration and definition are easily separated. This permits "forward declaration" of types, variables, and functions, which can allow files to compile without the need for these items to be defined until later. In the C programming language however declaration and definition of variables are one and the same. (The only exception, that I know of, in the C programming language, is the use of keyword extern to allow a variable to be declared without being defined.)
It is for this reason that in a previous edit of this answer I referred to "definition of structs" and "declaration of struct [variables]," where the meaning of "declaration of a struct [variable]" was understood to be creating an instance (variable) of that struct.
The syntax of typedef is as follows:
typedef old_type new_type
In your first try, you defined the struct Book type and not Book. In other words, your data type is called struct Book and not Book.
In the second form, you used the right syntax of typedef, so the compiler recognizes the type called Book.
I want to add a clarification about when you actually declare a variable.
struct foo {
int a;
} my_foo;
defines foo and immediately declares a variable my_foo of the struct foo type, meaning you can use it like this my_foo.a = 5;
However, because typedef syntax follows typedef <oldname> <newname>
typedef struct bar {
int b;
} my_bar;
is not declaring a variable my_bar of type struct bar, so my_bar.b = 5; is illegal. It is instead giving the struct bar type a new name, my_bar. You can now declare variables of type struct bar using my_bar like this:
my_bar some_bar;
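The two cases side by side, as a small sketch. demo_sum is a made-up function that only exists to exercise both declarations.

```c
#include <assert.h>

struct foo {
    int a;
} my_foo;                 /* my_foo is a variable of type struct foo */

typedef struct bar {
    int b;
} my_bar;                 /* my_bar is a type name, an alias for struct bar */

/* Stores x in the global variable and y in a local object declared via the alias. */
int demo_sum(int x, int y)
{
    my_bar some_bar;      /* same as: struct bar some_bar; */
    assert(sizeof some_bar == sizeof(struct bar));   /* alias and tag name one type */
    my_foo.a = x;         /* legal: my_foo is an object */
    some_bar.b = y;       /* my_bar.b = y; would not compile: my_bar is a type */
    return my_foo.a + some_bar.b;
}
```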
The other answers are all correct and useful, but maybe longer than necessary. Do this:
typedef struct Book Book;
typedef struct Books Books;
typedef struct Author Author;
struct Book {
... as you wish ...
};
struct Author {
... as you wish ...
};
struct Books {
... as you wish ...
};
You can define your structs in any order provided they only contain pointers to other structs.
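Filled in with the members from the question, a sketch of that layout could read as follows. total_authors is a hypothetical helper added only to show the types in use.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Book Book;       /* aliases first: the structs are still incomplete */
typedef struct Books Books;
typedef struct Author Author;

struct Book {
    int id;
    char title[256];
    char summary[2048];
    int numberOfAuthors;
    Author *authors;            /* pointer to an incomplete type is fine */
};

struct Author {
    char firstName[56];
    char lastName[56];
};

struct Books {
    Book *arr;
    int numberOfBooks;
};

/* Totals numberOfAuthors over every book in the collection. */
int total_authors(const Books *books)
{
    assert(books != NULL);
    int total = 0;
    for (int i = 0; i < books->numberOfBooks; i++)
        total += books->arr[i].numberOfAuthors;
    return total;
}
```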
You just need to define Author before defining Book.
You use Author in Book so it needs to be defined before.
I think this is going to help you understand.
http://www.tutorialspoint.com/cprogramming/c_typedef.htm
bookstore.c:8:2: error: unknown type name ‘Author’
bookstore.c:21:2: error: unknown type name ‘Book’
These are produced because you have to define them before you use them. Move the struct "Author" & "Books" above the struct "Book". This will solve it.
Also, the warning you are getting hints at the problem: the compiler flags "typedef struct Author" as useless because the typedef does not supply a new type name, so there is nothing for it to define.
Since you already know the answer should be in this form
typedef struct {
...
...
...
} struct-name;
stick with that.
As part of answering another question, I came across a piece of code like this, which gcc compiles without complaint.
typedef struct {
struct xyz *z;
} xyz;
int main (void) {
return 0;
}
This is the means I've always used to construct types that point to themselves (e.g., linked lists) but I've always thought you had to name the struct so you could use self-reference. In other words, you couldn't use xyz *z within the structure because the typedef is not yet complete at that point.
But this particular sample does not name the structure and it still compiles. I thought originally there was some black magic going on in the compiler that automatically translated the above code because the structure and typedef names were the same.
But this little beauty works as well:
typedef struct {
struct NOTHING_LIKE_xyz *z;
} xyz;
What am I missing here? This seems a clear violation since there is no struct NOTHING_LIKE_xyz type defined anywhere.
When I change it from a pointer to an actual type, I get the expected error:
typedef struct {
struct NOTHING_LIKE_xyz z;
} xyz;
qqq.c:2: error: field `z' has incomplete type
Also, when I remove the struct, I get an error (parse error before "NOTHING ...).
Is this allowed in ISO C?
Update: A struct NOSUCHTYPE *variable; also compiles so it's not just inside structures where it seems to be valid. I can't find anything in the c99 standard that allows this leniency for structure pointers.
As the warning says in the second case, struct NOTHING_LIKE_xyz is an incomplete type, like void or arrays of unknown size. An incomplete type can only appear in a struct as a type pointed to (C17 6.7.2.1:3), with an exception for arrays of unknown size that are allowed as the last member of a struct, making the struct itself an incomplete type in this case. The code that follows cannot dereference any pointer to an incomplete type (for good reason).
Incomplete types can offer some datatype encapsulation of sorts in C...
The corresponding paragraph in http://www.ibm.com/developerworks/library/pa-ctypes1/ seems like a good explanation.
The parts of the C99 standard you are after are 6.7.2.3, paragraph 7:
If a type specifier of the form
struct-or-union identifier occurs
other than as part of one of the above
forms, and no other declaration of the
identifier as a tag is visible, then
it declares an incomplete structure or
union type, and declares the
identifier as the tag of that type.
...and 6.2.5 paragraph 22:
A structure or union type of unknown
content (as described in 6.7.2.3) is
an incomplete type. It is completed,
for all declarations of that type, by
declaring the same structure or union
tag with its defining content later in
the same scope.
The 1st and 2nd cases are well-defined, because the size and alignment of a pointer is known. The C compiler only needs the size and alignment info to define a struct.
The 3rd case is invalid because the size of that actual struct is unknown.
But beware that for the 1st case to be logical, you need to give a name to the struct:
// vvv
typedef struct xyz {
struct xyz *z;
} xyz;
otherwise the outer struct and the *z will be considered two different structs.
The 2nd case has a popular use case known as "opaque pointer" (pimpl). For example, you could define a wrapper struct as
typedef struct {
struct X_impl* impl;
} X;
// usually just: typedef struct X_impl* X;
int baz(X x);
in the header, and then in one of the .c,
#include "header.h"
struct X_impl {
int foo;
int bar[123];
...
};
int baz(X x) {
return x.impl->foo;
}
The advantage is that outside that .c file, code cannot touch the internals of the object. It is a kind of encapsulation.
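For completeness, a single-file sketch of the same idea using the simpler typedef struct X_impl *X form. X_create and X_destroy are hypothetical helpers that are not part of the snippet above, and the members elided there are left out here as well. In a real project the first part would live in the header and the second part in the one .c file that owns the type:

```c
#include <stdlib.h>

/* ---- what would go in header.h ---- */
typedef struct X_impl *X;      /* X_impl stays incomplete for clients */
X    X_create(int foo);        /* hypothetical constructor */
int  baz(X x);
void X_destroy(X x);           /* hypothetical destructor */

/* ---- what would go in the .c file that defines the type ---- */
struct X_impl {
    int foo;
    int bar[123];
};

X X_create(int foo) {
    X x = calloc(1, sizeof *x); /* sizeof works here: the type is complete */
    if (x != NULL) {
        x->foo = foo;
    }
    return x;
}

int baz(X x) {
    return x->foo;
}

void X_destroy(X x) {
    free(x);
}
```

A client that sees only the header part can hold and pass an X, but cannot write x->foo or sizeof *x, because struct X_impl is incomplete there.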
You do have to name it. In this:
typedef struct {
struct xyz *z;
} xyz;
z will not be able to point to the enclosing struct, because struct xyz refers to a completely different type, not to the unnamed struct you just defined. Try this:
int main()
{
xyz me1;
xyz me2;
me1.z = &me2; // this will not compile
}
You'll get a diagnostic about incompatible pointer types.
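For comparison, a sketch of the version with the tag supplied, where the assignment is accepted. link_and_check is a hypothetical helper used only to exercise the link:

```c
#include <stddef.h>

/* The tag `xyz` is what makes the self-reference work. The typedef name
   `xyz` lives in a separate name space, so it may reuse the spelling. */
typedef struct xyz {
    struct xyz *z;
} xyz;

/* Points a at b and reports whether the link holds. */
int link_and_check(xyz *a, xyz *b) {
    a->z = b;            /* same type on both sides, no diagnostic */
    return a->z == b;
}
```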
Well... All I can say is that your previous assumption was incorrect. Every time you use a struct X construct (by itself, or as part of a larger declaration), it is interpreted as a declaration of a struct type with the struct tag X. It can be a re-declaration of a previously declared struct type, or it can be the very first declaration of a new struct type. A new tag is declared in the scope in which it appears. In your specific example that happens to be file scope (since the C language has no "class scope", as C++ does).
The more interesting example of this behavior is when the declaration appears in function prototype:
void foo(struct X *p); // assuming `struct X` has not been declared before
In this case the new struct X declaration has function-prototype scope, which ends at the end of the prototype. If you declare a file-scope struct X later
struct X;
and try to pass a pointer of struct X type to the above function, the compiler will give you a diagnostic about mismatched pointer types
struct X *p = 0;
foo(p); // different pointer types for argument and parameter
This also immediately means that in the following declarations
void foo(struct X *p);
void bar(struct X *p);
void baz(struct X *p);
each struct X declaration is a declaration of a different type, each local to its own function prototype scope.
But if you pre-declare struct X as in
struct X;
void foo(struct X *p);
void bar(struct X *p);
void baz(struct X *p);
all struct X references in all the function prototypes will refer to the same previously declared struct X type.
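A compact sketch of the pre-declared variant, with hypothetical functions is_null, same and demo standing in for foo, bar and baz. Because struct X is declared at file scope first, every prototype refers to the same type, and the type never has to be completed:

```c
#include <stddef.h>

struct X;                        /* file-scope declaration comes first */

int is_null(struct X *p);        /* both prototypes name the same struct X */
int same(struct X *a, struct X *b);

int is_null(struct X *p) {
    return p == NULL;
}

int same(struct X *a, struct X *b) {
    return a == b;
}

/* The argument and parameter types match, so no diagnostic is issued. */
int demo(void) {
    struct X *p = NULL;
    return is_null(p) && same(p, p);
}
```

Removing the struct X; line at the top would give each prototype its own prototype-scope struct X and bring the diagnostics back.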
I was wondering about this too. Turns out that the struct NOTHING_LIKE_xyz * z is forward declaring struct NOTHING_LIKE_xyz. As a convoluted example,
typedef struct {
struct foo * bar;
int j;
} foo;
struct foo {
int i;
};
void foobar(foo * f)
{
f->bar->i;
f->bar->j;
}
Here f->bar refers to the type struct foo, not to the typedef struct { ... } foo. The f->bar->i line compiles fine, but the f->bar->j line gives an error, because struct foo has no member j. Not much use for a linked list implementation then.
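To see the two name spaces side by side, this sketch (with hypothetical accessors read_i and read_j) declares struct foo first and then reaches each member through the type that actually owns it:

```c
/* Tag name space: `struct foo`. */
struct foo {
    int i;
};

/* Ordinary identifier name space: the typedef name `foo` is an unrelated
   unnamed struct that happens to point at a `struct foo`. */
typedef struct {
    struct foo *bar;
    int j;
} foo;

int read_i(const foo *f) {
    return f->bar->i;   /* fine: struct foo has a member i */
}

int read_j(const foo *f) {
    return f->j;        /* j belongs to the typedef'd struct, not to struct foo */
}
```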
When a variable or field of a structure type is declared, the compiler has to allocate enough bytes to hold that structure. Since the structure may require one byte, or it may require thousands, there's no way for the compiler to know how much space it needs to allocate. Some languages use multi-pass compilers which would be able to find out the size of the structure on one pass and allocate the space for it on a later pass; since C was designed to allow for single-pass compilation, however, that isn't possible. Thus, C forbids the declaration of variables or fields of incomplete structure types.
On the other hand, when a variable or field of a pointer-to-structure type is declared, the compiler has to allocate enough bytes to hold a pointer to the structure. Regardless of whether the structure takes one byte or a million, the pointer will always require the same amount of space. Effectively, the compiler can treat the pointer to the incomplete type as a void* until it gets more information about its type, and then treat it as a pointer to the appropriate type once it finds out more about it. The incomplete-type pointer isn't quite analogous to void*, in that one can do things with void* that one can't do with incomplete types (e.g. if p1 is a pointer to struct s1, and p2 is a pointer to struct s2, one cannot assign p1 to p2), but one can't do anything with a pointer to an incomplete type that one could not do to void*. Basically, from the compiler's perspective, a pointer to an incomplete type is a pointer-sized blob of bytes. It can be copied to or from other similar pointer-sized blobs of bytes, but that's it. The compiler can generate code to do that without having to know what anything else is going to do with the pointer-sized blobs of bytes.
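A small sketch of that "pointer-sized blob" view, assuming a struct s1 that is never completed in the translation unit (pointer_size, pick and is_same are hypothetical names). Everything here needs only the size of the pointer, never the size or layout of the struct:

```c
#include <stddef.h>

struct s1;   /* incomplete here, and never completed in this file */

/* The size of the pointer is known even though the size of struct s1 is not. */
size_t pointer_size(void) {
    return sizeof(struct s1 *);
}

/* Passing, copying and returning the pointer needs no definition. */
struct s1 *pick(struct s1 *a, struct s1 *b, int use_first) {
    return use_first ? a : b;
}

/* Comparison works too; dereferencing p or writing sizeof *p would not. */
int is_same(struct s1 *p, struct s1 *q) {
    return p == q;
}
```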