Incomplete types in C

6.2.5
At various points within a translation unit an object type may be
incomplete (lacking sufficient information to determine the size of
objects of that type).
Also
6.2.5 19) The void type comprises an empty set of values; it is an incomplete object type that cannot be completed.
And
6.5.3.4 The sizeof operator shall not be applied to an expression that has function type or an incomplete type,
But Visual Studio 2010 prints 0 for
printf("Size of void is %d\n",sizeof(void));
My question is 'What are incomplete types'?
struct temp
{
int i;
char ch;
int j;
};
Is temp incomplete here? If so, why is it incomplete (we know the size of temp)? I don't have a clear idea of what incomplete types are. Any code snippet that explains this would be helpful.

Your struct temp is incomplete right up until the point where the closing brace occurs:
struct temp
{
int i;
char ch;
int j;
};// <-- here
The structure is declared (comes into existence) following the temp symbol but it's incomplete until the actual definition is finished. That's why you can have things like:
struct temp
{
int i;
char ch;
struct temp *next; // can use pointers to incomplete types.
};
without getting syntax errors.
C makes a distinction between declaration (declaring that something exists) and definition (actually defining what it is).
Another incomplete type (declared but not yet defined) is:
struct temp;
This case is often used to provide opaque types in C where the type exists (so you can declare a pointer to it) but is not defined (so you can't figure out what's in there). The definition is usually limited to the code implementing it while the header used by clients has only the incomplete declaration.
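A minimal sketch of that opaque-type pattern, squeezed into one file for brevity. The counter type and its three functions are hypothetical names invented for this illustration; in real code the first part would live in the header and the second part in the implementing .c file.

```c
#include <stdlib.h>

/* What the client header would expose: only the tag, so the type is
 * incomplete for every client that includes it. */
struct counter;
struct counter *counter_new(int start);
int counter_next(struct counter *c);
void counter_free(struct counter *c);

/* What the implementation file would contain: the definition
 * completes the type from this point on. */
struct counter {
    int value;
};

struct counter *counter_new(int start)
{
    struct counter *c = malloc(sizeof *c);  /* fine: type is complete here */
    if (c != NULL)
        c->value = start;
    return c;
}

int counter_next(struct counter *c)
{
    return c->value++;  /* returns the current value, then advances */
}

void counter_free(struct counter *c)
{
    free(c);
}
```

Clients can hold and pass around a struct counter * but cannot look at value or take sizeof(struct counter), because they never see the definition.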

No, your struct temp example is certainly complete. Assuming int is 4 bytes and char is 1, I can easily count 9 bytes in that struct (ignoring padding).
Another example of an incomplete type would be:
struct this_is_incomplete;
This tells the compiler, "hey, this struct exists, but you don't know what's in it yet". This is useful for information hiding, but when you need to pass a pointer to the type:
int some_function(struct this_is_incomplete* ptr);
The compiler can correctly generate calls to this function, because it knows a pointer is 4 (or 8) bytes, even though it doesn't know how big the thing is that the pointer points to.

A type is incomplete when its name has been declared but its definition has not been seen. This occurs when you forward-declare a type in a header file.
Say, record.h contains:
struct record_t;
void process_record(struct record_t *r);
And record.c contains:
struct record_t {
int data;
};
If, in another module, say "usage.c" you do this:
#include "record.h"
const int rec_size = sizeof(struct record_t); // FAIL
The type record_t is incomplete inside the "usage.c" compilation unit, because it only knows the name record_t, and not what the type is made up of.
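If usage.c really does need the size, one common workaround is to let the implementing module report it. The sketch below folds the three files into one so it can be compiled on its own; record_size and usage_record_size are hypothetical helpers that are not part of the example above.

```c
#include <stddef.h>

/* record.h (sketch): the tag alone, plus a hypothetical size accessor. */
struct record_t;
size_t record_size(void);

/* usage.c (sketch): only the declarations above are visible here, so
 *     sizeof(struct record_t)
 * would be rejected. Asking the implementing module works instead. */
size_t usage_record_size(void)
{
    return record_size();
}

/* record.c (sketch): the definition completes the type. */
struct record_t {
    int data;
};

size_t record_size(void)
{
    return sizeof(struct record_t);
}
```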

Related

What does this pointer of type structure definition mean (in C)?

In K&R Chapter 6, a declaration is mentioned as follows:
struct{
int len;
char *str;
} *p;
I couldn't understand which structure is this pointer p pointing to and if such a pointer definition is even valid because in all other examples given in the book and the ones I have seen otherwise, when defining a pointer to a structure, the name of the structure, that is, the type being defined needs to be mentioned. For example,
struct example{
int a;
...
}s1;
and then,
struct example *ptr = &s1;
so, it is mentioned that ptr is pointing to a type struct example and not just struct.
Also, of particular interest was this:
*p->str fetches whatever str points to; *p->str++
increments str after accessing whatever it points to (just like *s++);
I couldn't follow what p is in the first place, hence, not the increment and dereference as well.
What is going on here?
Thanks in advance!
P.S. I am new here, so any feedback on the format of the question would also be appreciated.
The struct keyword works sort of like an extended version of typedef, except that you create a complex custom type (called a structure) instead of aliasing an existing type. If you only have one thing that needs to use the declared type, you do not need to provide an explicit name for the type.
The first statement you are looking at declares a structure with two fields, but does not name it. This is commonly called an anonymous (untagged) structure. The declaration does, however, provide a pointer of that type.
One possible use-case for such a declaration is when you make a header for an external library, possibly one that is not even written in C. In that case, the type of the structure may be opaque or incomplete, and you just need to have a convenient reference to some parts of it. Making the structure anonymous prevents you from being able to allocate it yourself easily, but allows you to interact with it through the pointer.
More commonly, you will see this notation used in conjunction with named or at least aliased structures. The second statement could be rewritten as
struct example { ... } s1, *ptr;
In that case, struct example *ptr = &s1; would be just ptr = &s1;.
An even more common occurrence is the use of anonymous structures with typedef, to create custom type names that do not include the struct keyword. Your second example could be rewritten as
typedef struct { ... } example, *pexample;
example s1;
pexample ptr; // alternatively example *ptr;
ptr = &s1;
Note that the type of s1 is example and not struct example in this case.
For starters consider the following simple program
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main(void)
{
struct {
int len;
char *str;
} *p;
p = malloc( sizeof( *p ) );
p->str = "Hello Nihal Jain";
p->len = strlen( p->str );
while ( *p->str ) putchar( *p->str++ );
putchar( '\n' );
free( p );
return 0;
}
Its output is
Hello Nihal Jain
So in this declaration
struct {
int len;
char *str;
} *p;
there is declared a pointer of the type of the unnamed structure. The pointer itself is not initialized. You could write for example
struct {
int len;
char *str;
} *p = malloc( sizeof( *p ) );
For this simple program the name of the structure is not required, because no other declaration of an object of the structure type is present or needed in the program.
So you cannot declare another object of the structure type later, but that is not required in this case.
According to the C Standard, a structure or union is declared like this:
struct-or-union-specifier:
struct-or-union identifier(opt) { struct-declaration-list }
struct-or-union identifier
It is seen that the identifier is optional if there is a struct-declaration-list. So an unnamed structure can be used as a type specifier.
One more example, using enumerations. You can declare enumerators without naming the enumeration type. For example
enum { EXIT_SUCCESS = 0, EXIT_FAILURE = -1 };
You can use the enumerators, which have type int, without declaring an object of the enumeration type if the program does not require one.
Another use of anonymous structures is inside unions or other structs, which limits the use of that struct to the enclosing union or structure. That is useful from the programmer's point of view: code like this tells the reader that the struct is used only in the context of the enclosing struct or union, and nothing more. (It also saves us from naming it.)
Also, if you use an anonymous structure like this, you won't be able to declare another instance of it besides the one that already exists.
Example (in reply to stark's comment asking how to use it):
struct {
int a;
int b;
} p;
scanf("%d",&p.a);
If you needed an ad-hoc, one-time-use definition of a struct (or union) that will not be useful outside of scope in which it was declared, you would use what is called anonymous or unnamed structs and unions. This saves you from having to declare the struct before you use it and, in the case of complex types, declaring the inner types, such as here.
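One place where such a one-time-use struct tends to show up is a small lookup table that is private to a single function. This is only a sketch: color_code and the table contents are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Maps a color name to a small integer code, or -1 if unknown. */
int color_code(const char *name)
{
    /* The struct has no tag: the table is the only object of this
     * type, and nothing outside the function can name the type. */
    static const struct {
        const char *name;
        int code;
    } table[] = {
        { "red",   1 },
        { "green", 2 },
        { "blue",  3 },
    };

    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(table[i].name, name) == 0)
            return table[i].code;
    }
    return -1;
}
```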

Why can't I complete a typedef name of array type?

The C standard states (§6.2.5 p22):
An array type of unknown size is an incomplete type. It is completed,
for an identifier of that type, by specifying the size in a later
declaration (with internal or external linkage).
And it works fine as far as variable declarations are concerned:
int a[];
int a[2]; //OK
But when we add typedef before those declarations the compiler complains (I also changed the name):
typedef int t[];
typedef int t[2]; //redefinition with different type
It doesn't complain however when we are completing a typedef to incomplete structure instead:
typedef struct t t1;
typedef struct t { int m; } t1; //OK
Possible use case of an incomplete typedef of array could be something like this:
int main(int n, char **pp)
{
typedef int t1[][200];
typedef struct t { t1 *m; int m1; } t0;
typedef int t1[sizeof (t0)][200];
}
In the above example I would like to declare a pointer to array inside a structure with number of elements equal to the structure size. Yes I could use a structure instead of array but why should I when the above option is potentially available?
typedef int t[2]; is not allowed because of the constraint 6.7/3:
If an identifier has no linkage, there shall be no more than one declaration of the identifier (in a declarator or type specifier) with the same scope and in the same name space, except that:
a typedef name may be redefined to denote the same type as it currently does, provided that type is not a variably modified type;
However int[] and int[2] are not the same type, so this "except" doesn't apply, and so the code violates the constraint.
Regarding your first quote: Although 6.2.5/22 says that an incomplete type can be completed, it doesn't follow that any attempted completion is automatically legal. The attempted completion must also comply with all the other rules of the language, and in this case it does not comply with 6.7/3.
The int a[]; int a[2]; example is OK (under 6.7/3) because a has linkage; and in the typedef struct t t1; , struct t is still the same type before and after its completion.
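To see the completion of an array type with linkage in action, here is a small file-scope sketch. The helper a_length is a hypothetical name added only so that there is something to call.

```c
#include <stddef.h>

int a[];    /* file scope, external linkage: int[] is incomplete here,
               so sizeof a would be rejected at this point */

int a[2];   /* same identifier, same linkage: the type is now int[2] */

/* Number of elements in a, computed after the type was completed. */
size_t a_length(void)
{
    return sizeof a / sizeof a[0];
}
```

The same pair of declarations with typedef in front is rejected, for the reason given above: a typedef name has no linkage, and int[] and int[2] are not the same type.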
From 6.2.5p1, we can see a definition of the terms complete and incomplete:
At various points within a translation unit an object type may be incomplete (lacking sufficient information to determine the size of objects of that type) or complete (having sufficient information).
Thus, when we talk about a type being incomplete, we're really talking about the size of objects of that type being indeterminate. We can't talk about "incomplete types" without declaring an object of that type.
In your first example, the size of a is determinate as you've completed the definition for the object using the second declaration.
In your second example, no declaration for an object is made. Once the declaration is made, e.g. t x = { 1, 2 };, it becomes clear that the type isn't incomplete.
In your third example, you're not actually completing the type alias; you're completing the struct definition. You might as well have written:
typedef struct t t1;
struct t { int m; };
We can see further support for the redefinition of struct tags, and the exclusion of the VLA redefinition in 6.7p3:
If an identifier has no linkage, there shall be no more than one declaration of the identifier (in a declarator or type specifier) with the same scope and in the same name space, except that:
a typedef name may be redefined to denote the same type as it currently does, provided that type is not a variably modified type;
tags may be redeclared as specified in 6.7.2.3.

All struct identifiers are automatically forward declared

While answering "warning: assignment from incompatible pointer type for linklist array", I noticed that any undeclared identifier preceded by the struct keyword is treated as a forward-declared identifier.
For instance the program below compiles well:
/* Compile with "gcc -std=c99 -W -Wall -O2 -pedantic %" */
#include <stdio.h>
struct foo
{
struct bar *next; /* Linked list */
};
int main(void) {
struct bar *a = 0;
struct baz *b = 0;
struct foo c = {0};
printf("bar -> %p\n", (void *)a);
printf("baz -> %p\n", (void *)b);
printf("foo -> %p, %zu\n", (void *)&c, sizeof c); /* Remove %zu if compiling with -ansi flag */
return 0;
}
My question: Which rule guides a C compiler to treat undeclared struct identifiers as forward declared incomplete struct types?
The Standard says (6.2.5.28)
All pointers to structure types shall have the same representation and alignment requirements as each other.
This means the compiler knows how to represent the pointers to any structure, even those that are (yet) undefined.
Your program deals only with pointers to such structures, so it's ok.
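A quick way to convince yourself is to take the size of pointers to struct types that are never defined anywhere. This is just a sketch; the two helper functions are hypothetical names.

```c
#include <stddef.h>

struct bar;   /* never completed in this file */
struct baz;   /* never completed in this file */

/* sizeof on the pointer type is fine even though sizeof(struct bar)
 * itself would be rejected. */
size_t size_of_bar_pointer(void)
{
    return sizeof(struct bar *);
}

size_t size_of_baz_pointer(void)
{
    return sizeof(struct baz *);
}
```

Because all pointers to structure types share one representation, both functions report the same size.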
It is described in 6.2.5 Types and 6.7.2.3 Tags.
struct identifier is an object type.
6.2.5 Types
The meaning of a value stored in an object or returned by a function is determined by the
type of the expression used to access it. (An identifier declared to be an object is the
simplest such expression; the type is specified in the declaration of the identifier.) Types
are partitioned into object types (types that describe objects) and function types (types
that describe functions). At various points within a translation unit an object type may be
incomplete (lacking sufficient information to determine the size of objects of that type) or
complete (having sufficient information). 37)
37) A type may be incomplete or complete throughout an entire translation unit, or it may change states at
different points within a translation unit.
An array type of unknown size is an incomplete type. It is completed, for an identifier of
that type, by specifying the size in a later declaration (with internal or external linkage).
A structure or union type of unknown content (as described in 6.7.2.3) is an incomplete type. It is completed, for all declarations of that type, by declaring the same structure or
union tag with its defining content later in the same scope.
6.7.2.3 Tags
All declarations of structure, union, or enumerated types that have the same scope and
use the same tag declare the same type. Irrespective of whether there is a tag or what
other declarations of the type are in the same translation unit, the type is incomplete 129)
until immediately after the closing brace of the list defining the content, and complete
thereafter.
129) An incomplete type may only by used when the size of an object of that type is not needed. It is not
needed, for example, when a typedef name is declared to be a specifier for a structure or union, or
when a pointer to or a function returning a structure or union is being declared. (See incomplete types
in 6.2.5.) The specification has to be complete before such a function is called or defined.
In addition to the answer provided by 2501, and your comment to it that "In my case, there is not even the forward declaration", the following.
Any use of a struct tag counts as a (forward) declaration of the structure type, if it had not been declared before. Although a more formal way would be to say that this simply counts as a type, since the C standard does not mention "forward declarations of structure types", just complete and incomplete structure types (6.2.5p22).
6.7.2 Type specifiers tells us that a struct-or-union-specifier is a type-specifier, and 6.7.2.1 Structure and union specifiers paragraph 1 tells us that, in turn, struct identifier is a struct-or-union-specifier.
Suppose you have a linked list declaration, something like
struct node {
struct node *next;
int element;
};
then the "implicit forward declaration" of this incomplete type is essential for this structure to work. After all, the type struct node is only complete immediately after the closing brace (see 6.7.2.3). But you need to refer to it before that point in order to declare the next pointer.
Also, a struct node declaration (of incomplete type) can go out of scope, just like any other declaration. This happens for instance if you have some prototype
int function(struct unknown *parameter);
where the struct unknown goes out of scope immediately at the end of the declaration. Any further declared struct unknowns are then not the same as this one. That is implied in the text of 6.2.5p22:
A structure or union type of unknown content (as described in 6.7.2.3)
is an incomplete type. It is completed, for all declarations of that
type, by declaring the same structure or union tag with its defining
content later in the same scope.
That is why gcc warns about this:
foo.c:1:21: warning: 'struct unknown' declared inside parameter list
foo.c:1:21: warning: its scope is only this definition or declaration, which is probably not what you want
You can fix this by putting an extra forward declaration before it, which makes the scope start earlier (and therefore end later):
struct unknown;
int function(struct unknown *parameter);
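Putting that together: with the tag declared at file scope first, the prototype, the later definition of the struct and the function definition all refer to one and the same type. In this sketch the member value and the helper call_function are assumptions added for illustration.

```c
struct unknown;                           /* file-scope declaration of the tag */
int function(struct unknown *parameter);  /* refers to the file-scope type */

struct unknown {                          /* completes that same type */
    int value;
};

int function(struct unknown *parameter)
{
    return parameter->value * 2;
}

/* Passing &u compiles without a pointer-type diagnostic because the
 * prototype and the definition agree on what struct unknown is. */
int call_function(void)
{
    struct unknown u = { 21 };
    return function(&u);
}
```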
I think that the most elegant use-case where incomplete struct types are used is something like this :
struct foo
{
struct bar *left;
struct bar *right;
};
struct bar
{
int something;
struct foo *next;
};
I.e. double recursion, where foo points to bar and bar points to foo.
Such cases might be a reason why this feature was included in original C language specification.
The original question is whether all struct identifiers are automatically forward declared. I think it would be better to say that all incomplete struct definitions are automatically considered forward declarations.
Edit: Following the comment about documentation, let's look at the C language bible, Kernighan & Ritchie's The C Programming Language. Section 6.5, "Self-referential Structures", says:
Occasionally, one needs a variation of self-referential structures:
two structures that refer to each other. The way to handle this is:
struct t {
...
struct s *p; /* p points to an s */
};
struct s {
...
struct t *q; /* q points to a t */
};
I agree that it could be implemented another way, but I take this as good motivation from the authors of the C language, and I agree with them that it is an elegant way to implement this.
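For completeness, here is how the K&R pair might be used once the elided members are filled in. The id members and round_trip_id are hypothetical additions for this sketch.

```c
#include <stddef.h>

struct t {
    int id;
    struct s *p;   /* struct s is still incomplete here; a pointer is fine */
};

struct s {
    int id;
    struct t *q;   /* struct t is already complete at this point */
};

/* Links one t and one s to each other and follows both pointers back
 * to the starting object. */
int round_trip_id(void)
{
    struct t a = { 1, NULL };
    struct s b = { 2, NULL };

    a.p = &b;
    b.q = &a;
    return a.p->q->id;   /* a -> b -> a, so this is a.id */
}
```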

nested struct in C

struct s{
int a;
struct s b;
};
The above code segment throws the error error: field 'b' has incomplete type while
struct s{
int a;
struct s *b;
};
doesn't give any error. I don't understand why this is allowed for pointers but not for the non-pointer variable !!
Class members must have a complete type when they are declared, so that their size can be used to determine the class layout.
Within the class definition, the class itself is incomplete, so you can't declare a member of the same type. That would be impossible anyway (at least if there are any other members), since the class would have to be larger than itself.
A pointer is a complete type, even if the type it points to isn't, so you can declare a class member to be a pointer to the class type.
(Note: I use the word "class" since I'm a C++ programmer. I just noticed that the question is also tagged C, and C++ has since been removed. I believe the answer is still correct in that language, if you replace "class" with "structure", but I'm not completely sure since they are different languages. It would be better if you only asked about one language, since there are differences (sometimes major, sometimes subtle) between languages.)
Q: What are incomplete types?
A: An incomplete type is a type that has been named but lacks the information needed to determine the size of objects of that type.
The ‘void’ type is an incomplete type.
A union/structure type whose members are not yet specified is an incomplete type.
The ‘void’ type cannot be completed.
To complete an incomplete type, we need to specify the missing information.
Example:
struct Employee *ptr; // Here 'Employee' is incomplete
C/C++ allows pointers to incomplete types.
To make 'Employee' complete, we need to specify missing information like shown below
typedef struct Employee
{
char name[25];
int age;
int employeeID;
char department[25];
}EMP;
In your case,
struct s
{
int a;
struct s b; // Structure is incomplete.
}; // Until this point the structure is incomplete.
At struct s b; the structure s is still incomplete. We can declare a pointer to an incomplete type, but not a variable of that type.
To adequately define s, the compiler needs to know the size of s. In the first example, the size of s depends on the size of s, but not in the second.
In the first example, by definition, sizeof(s) = sizeof(int) + sizeof(s) + padding. If we try to solve this equation for sizeof(s), we get 0 = sizeof(int) + padding, which clearly is impossible.
In the second, sizeof(s) = sizeof(int) + sizeof(s*) + padding. If we assume that sizeof(s*) ~= sizeof(int), then sizeof(s) = 2*sizeof(int) + padding.
I'm going to assume that the extra asterisks in struct s **b** are given for emphasis and not as some kind of demented pointer declaration. (Please don't do that! It's much easier to analyze someone's code if it's presented exactly as it runs.)
When you do this without declaring b as a pointer:
struct s{
int a;
struct s b;
};
the compiler doesn't know how much space it needs to allocate for the b field, since at that point you haven't finished defining struct s. In fact, it would be impossible for the compiler to define this particular structure: no matter how many bytes it allocated for struct s, it would have to add 4 more to make room for the int a field.
Declaring b to be a pointer to struct s makes things easier:
struct s{
int a;
struct s *b;
};
No matter how many fields you add to struct s, the compiler knows that the b field only needs to contain the address of the struct, and that doesn't change based on how large the struct itself is.
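As a rough illustration of the pointer version in use, the sketch below chains three such structs into a list and walks it. sum_list and demo_sum are hypothetical helpers, not something from the question.

```c
#include <stddef.h>

struct s {
    int a;
    struct s *b;   /* pointer to the enclosing, still incomplete, type */
};

/* Adds up the a fields of every node reachable through b. */
int sum_list(const struct s *head)
{
    int total = 0;
    for (const struct s *n = head; n != NULL; n = n->b)
        total += n->a;
    return total;
}

/* Builds a three-node list on the stack and sums it. */
int demo_sum(void)
{
    struct s third  = { 3, NULL };
    struct s second = { 2, &third };
    struct s first  = { 1, &second };
    return sum_list(&first);
}
```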

How is it legal to reference an undefined type inside a structure?

As part of answering another question, I came across a piece of code like this, which gcc compiles without complaint.
typedef struct {
struct xyz *z;
} xyz;
int main (void) {
return 0;
}
This is the means I've always used to construct types that point to themselves (e.g., linked lists) but I've always thought you had to name the struct so you could use self-reference. In other words, you couldn't use xyz *z within the structure because the typedef is not yet complete at that point.
But this particular sample does not name the structure and it still compiles. I thought originally there was some black magic going on in the compiler that automatically translated the above code because the structure and typedef names were the same.
But this little beauty works as well:
typedef struct {
struct NOTHING_LIKE_xyz *z;
} xyz;
What am I missing here? This seems a clear violation since there is no struct NOTHING_LIKE_xyz type defined anywhere.
When I change it from a pointer to an actual type, I get the expected error:
typedef struct {
struct NOTHING_LIKE_xyz z;
} xyz;
qqq.c:2: error: field `z' has incomplete type
Also, when I remove the struct, I get an error (parse error before "NOTHING ...).
Is this allowed in ISO C?
Update: A struct NOSUCHTYPE *variable; also compiles so it's not just inside structures where it seems to be valid. I can't find anything in the c99 standard that allows this leniency for structure pointers.
As the compiler diagnostic says, struct NOTHING_LIKE_xyz is an incomplete type, like void or arrays of unknown size. An incomplete type can only appear in a struct as a type pointed to (C17 6.7.2.1:3), with an exception for an array of unknown size, which is allowed as the last member of a struct (a flexible array member); the struct itself is still a complete type in that case. The code that follows cannot dereference a pointer to an incomplete type until the type has been completed (for good reason).
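The array-of-unknown-size exception is the C99 flexible array member. A minimal sketch of how it is typically allocated follows; struct packet and packet_new are hypothetical names. Note that sizeof applied to the struct is accepted and simply does not count the trailing array.

```c
#include <stdlib.h>

struct packet {
    size_t len;
    int data[];   /* flexible array member: must be the last member */
};

/* Allocates a packet with room for len ints and fills them with 0..len-1.
 * Returns NULL if the allocation fails. */
struct packet *packet_new(size_t len)
{
    /* sizeof *p covers the fixed part only; the array storage is added. */
    struct packet *p = malloc(sizeof *p + len * sizeof p->data[0]);
    if (p != NULL) {
        p->len = len;
        for (size_t i = 0; i < len; i++)
            p->data[i] = (int)i;
    }
    return p;
}
```

The caller releases the whole thing with a single free(p).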
Incomplete types can offer some datatype encapsulation of sorts in C...
The corresponding paragraph in http://www.ibm.com/developerworks/library/pa-ctypes1/ seems like a good explanation.
The parts of the C99 standard you are after are 6.7.2.3, paragraph 7:
If a type specifier of the form
struct-or-union identifier occurs
other than as part of one of the above
forms, and no other declaration of the
identifier as a tag is visible, then
it declares an incomplete structure or
union type, and declares the
identifier as the tag of that type.
...and 6.2.5 paragraph 22:
A structure or union type of unknown
content (as described in 6.7.2.3) is
an incomplete type. It is completed,
for all declarations of that type, by
declaring the same structure or union
tag with its defining content later in
the same scope.
The 1st and 2nd cases are well-defined, because the size and alignment of a pointer is known. The C compiler only needs the size and alignment info to define a struct.
The 3rd case is invalid because the size of that actual struct is unknown.
But beware that for the 1st case to be logical, you need to give a name to the struct:
// vvv
typedef struct xyz {
struct xyz *z;
} xyz;
otherwise the outer struct and the *z will be considered two different structs.
The 2nd case has a popular use case known as "opaque pointer" (pimpl). For example, you could define a wrapper struct as
typedef struct {
struct X_impl* impl;
} X;
// usually just: typedef struct X_impl* X;
int baz(X x);
in the header, and then in one of the .c,
#include "header.h"
struct X_impl {
int foo;
int bar[123];
...
};
int baz(X x) {
return x.impl->foo;
}
The advantage is that outside of that .c file, you cannot mess with the internals of the object. It is a kind of encapsulation.
You do have to name it. In this:
typedef struct {
struct xyz *z;
} xyz;
z will not be able to point to the struct itself, because it refers to a completely different type, not to the unnamed struct you just defined. Try this:
int main()
{
xyz me1;
xyz me2;
me1.z = &me2; // this will not compile
}
You'll get an error about incompatible types.
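For contrast, here is a sketch of the tagged version, where the same assignment is accepted. links_to_self_type is a hypothetical wrapper so the snippet has something callable.

```c
#include <stddef.h>

typedef struct xyz {
    struct xyz *z;   /* the tag makes this the enclosing struct's own type */
} xyz;

/* Returns 1 if me1.z ends up pointing at me2. */
int links_to_self_type(void)
{
    xyz me1 = { NULL };
    xyz me2 = { NULL };

    me1.z = &me2;    /* compiles: &me2 has type struct xyz * */
    return me1.z == &me2;
}
```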
Well... All I can say is that your previous assumption was incorrect. Every time you use a struct X construct (by itself, or as a part of larger declaration), it is interpreted as a declaration of a struct type with a struct tag X. It could be a re-declaration of a previously declared struct type. Or, it can be a very first declaration of a new struct type. The new tag is declared in scope in which it appears. In your specific example it happens to be a file scope (since C language has no "class scope", as it would be in C++).
The more interesting example of this behavior is when the declaration appears in function prototype:
void foo(struct X *p); // assuming `struct X` has not been declared before
In this case the new struct X declaration has function-prototype scope, which ends at the end of the prototype. If you declare a file-scope struct X later
struct X;
and try to pass a pointer of struct X type to the above function, the compiler will give you a diagnostic about non-matching pointer types
struct X *p = 0;
foo(p); // different pointer types for argument and parameter
This also immediately means that in the following declarations
void foo(struct X *p);
void bar(struct X *p);
void baz(struct X *p);
each struct X declaration is a declaration of a different type, each local to its own function prototype scope.
But if you pre-declare struct X as in
struct X;
void foo(struct X *p);
void bar(struct X *p);
void baz(struct X *p);
all struct X references in all the function prototypes will refer to the same previously declared struct X type.
I was wondering about this too. Turns out that the struct NOTHING_LIKE_xyz * z is forward declaring struct NOTHING_LIKE_xyz. As a convoluted example,
typedef struct {
struct foo * bar;
int j;
} foo;
struct foo {
int i;
};
void foobar(foo * f)
{
f->bar->i;
f->bar->j;
}
Here f->bar refers to the type struct foo, not typedef struct { ... } foo. The first line will compile fine, but the second will give an error. Not much use for a linked list implementation then.
When a variable or field of a structure type is declared, the compiler has to allocate enough bytes to hold that structure. Since the structure may require one byte, or it may require thousands, there's no way for the compiler to know how much space it needs to allocate. Some languages use multi-pass compilers which would be able to find out the size of the structure on one pass and allocate the space for it on a later pass; since C was designed to allow for single-pass compilation, however, that isn't possible. Thus, C forbids the declaration of variables or fields of incomplete structure types.
On the other hand, when a variable or field of a pointer-to-structure type is declared, the compiler has to allocate enough bytes to hold a pointer to the structure. Regardless of whether the structure takes one byte or a million, the pointer will always require the same amount of space. Effectively, the compiler can treat the pointer to the incomplete type as a void* until it gets more information about its type, and then treat it as a pointer to the appropriate type once it finds out more about it. The incomplete-type pointer isn't quite analogous to void*, in that one can do things with void* that one can't do with incomplete types (e.g. if p1 is a pointer to struct s1, and p2 is a pointer to struct s2, one cannot assign p1 to p2) but one can't do anything with a pointer to an incomplete type that one could not do to void*. Basically, from the compiler's perspective, a pointer to an incomplete type is a pointer-sized blob of bytes. It can be copied to or from other similar pointer-sized blobs of bytes, but that's it. The compiler can generate code to do that without having to know what anything else is going to do with the pointer-sized blobs of bytes.
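The "pointer-sized blob" view can be sketched with code that stores and returns pointers to a struct it never sees defined. Everything here (struct widget, remember, recall, the four-slot table) is a hypothetical illustration.

```c
#include <stddef.h>

struct widget;                    /* never completed in this file */

static struct widget *slots[4];   /* an array of pointers needs no struct size */
static size_t used;

/* Stores the pointer; returns 1 on success, 0 if the table is full. */
int remember(struct widget *w)
{
    if (used == sizeof slots / sizeof slots[0])
        return 0;
    slots[used++] = w;            /* copying the pointer is all we can do */
    return 1;
}

/* Returns the i-th stored pointer, or NULL if nothing is stored there. */
struct widget *recall(size_t i)
{
    return i < used ? slots[i] : NULL;
}
```

Anything like w->field or sizeof *w would be rejected in this file, since the compiler knows nothing about the struct beyond its tag.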

Resources