I have a question regarding structure definitions and pointers.
In the definition of linked list node structure we define the structure as follows:
typedef struct node
{
int data;
struct node *next;
}Node;
Why would we declare it this way instead of:
typedef struct node
{
int data;
struct node next; //changed this line
}Node;
Thanks in advance!
A structure type is complete only after its closing brace. Until then it is an incomplete type. But a structure definition requires that all its members, except a trailing flexible array member, have complete types.
So in this declaration
typedef struct node
{
int data;
struct node next; //changed this line
}Node;
the data member next has an incomplete type.
From the C Standard (6.7.2.1 Structure and union specifiers)
...The type is incomplete until immediately after the } that terminates
the list, and complete thereafter.
and
3 A structure or union shall not contain a member with incomplete or
function type (hence, a structure shall not contain an instance of
itself, but may contain a pointer to an instance of itself), except
that the last member of a structure with more than one named member
may have incomplete array type; such a structure (and any union
containing, possibly recursively, a member that is such a structure)
shall not be a member of a structure or an element of an array.
Pointers, on the other hand, are always complete types: the size of a pointer is known even when the type it points to is still incomplete.
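The distinction between incomplete and complete types can be sketched as follows. This is only an illustration, not part of the standard's text; the helper name `pointer_size_unchanged` and the variable `early_ptr_size` are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

struct node;  /* forward declaration: struct node is incomplete from here on */

/* Allowed: the size of a pointer to an incomplete type is already known. */
static const size_t early_ptr_size = sizeof(struct node *);

struct node {
    int data;
    struct node *next;        /* OK: a pointer is a complete type */
    /* struct node inner; */  /* would be rejected: incomplete type */
};                            /* struct node is complete after this brace */

/* Hypothetical helper: returns 1 when the pointer size measured before the
   definition equals the one measured after it. */
int pointer_size_unchanged(void)
{
    /* The complete struct holds at least its two members. */
    assert(sizeof(struct node) >= sizeof(int) + sizeof(struct node *));
    return early_ptr_size == sizeof(struct node *);
}
```

Uncommenting the `inner` member should make the compiler reject the definition, since `struct node` is not yet complete at that point.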
We do that to avoid infinite recursion. Consider your second case: a node contains a node. So what is sizeof(node)? It is sizeof(int) + sizeof(node). But the inner node again has size sizeof(int) + sizeof(node), and so on, so the computation never terminates. The first case avoids this because the member is only a pointer to an object of the same structure type.
The compiler needs to determine the size of whatever it is that comes its way.
This requires a complete definition, so that the size can actually be calculated.
In the first case, the self-referential structure, we have a pointer. A pointer of type struct node * has a definite size determined by the implementation.
In the second case - struct node next, what would be the size of next? Would it be the size of struct node? Okay, let's say it is, but then again - what is the size of struct node? Well, the answer is sizeof(int) + sizeof(struct node). Okay, but then again... wait...what??... go back and read this entire para again, and realize the catch-22 situation here...
The compilers won't and don't appreciate this!
I have long been used to writing code like the one below.
But how does C compiler resolve the circular definition issue? Or does that issue really exist?
struct node {
int data;
struct node *next; // circular definition for "struct node" type?
};
ADD 1
Or, on a 32-bit machine, can I somewhat treat struct node * next member just as a member of 32-bit unsigned integer type? That makes me feel better.
ADD 2
The reason I think of circular definition is, when compiler encounters something like next -> data or next -> next, it has to know the exact offset of each member to add to the next pointer to get the correct location of each member. And that kind of calculation requires knowledge of each member's type. So for the member next, the circular definition issue may arise:
The type of next is struct node *, the struct node type
contains a next, the type of next is struct node *...
ADD 3
And how does the compiler calculate the sizeof(struct node)?
ADD 4
Well, I think the critical concept to understand this seemingly circular issue is, a pointer's size is not relevant to what type it points to. And the size is fixed on a specific machine. A pointer's type is only meaningful at compile-time for the compiler to generate instructions for pointer calculation.
next is a struct node *, which is just a pointer, not a struct node, so there's no actual circular definition. The details of a struct aren't required to figure out how to make a pointer to it.
To address your addenda:
Although that's more or less accurate and will probably work, it's not guaranteed by the standard, and you shouldn't do it.
Again, struct node and struct node * are entirely different types. A struct node * doesn't contain any other objects.
Let me answer your questions in order:
But how does C compiler resolve the circular definition issue?
struct node *next; is not a circular definition. At that point struct node is an incomplete type, and next is defined as a member of type struct node *, which is just a pointer.
Or does that issue really exist?
No, not really an issue. As a matter of fact, you could have members defined as pointers to any undefined structure.
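As a rough sketch of that last remark, here is a structure holding a pointer to a type that is declared but never defined in this translation unit. The names `never_defined`, `holder` and `holder_works` are invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>

struct never_defined;   /* hypothetical type: declared, never completed */

struct holder {
    int id;
    struct never_defined *opaque;   /* legal: only a pointer is stored */
};

/* Hypothetical helper: builds a holder and reports whether its pointer
   member can be stored and compared without the pointee being defined. */
int holder_works(void)
{
    struct holder h = { 7, NULL };
    assert(h.id == 7);
    return h.opaque == NULL;
}
```

What you cannot do with such a pointer is dereference it or apply sizeof to the pointee, because the compiler knows nothing about its layout.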
Or, on a 32-bit machine, can I somewhat treat struct node *next member just as a member of 32-bit unsigned integer type?
You should not make any assumptions about the actual size, offset or alignment of structure members. A 32-bit machine could have pointer sizes other than 32 bits; that does not matter as long as you use them as pointers, not integers.
The reason I think of circular definition is, when compiler encounters something like next->data or next->next, it has to know the exact offset of each member to add to the next pointer to get the correct location of each member.
That's correct, but by the time the compiler parses such code, the structure definition is complete and it can determine the offset of the data and next members.
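A small sketch of this, assuming C99 or later: once the definition is complete, `offsetof` reports member offsets and expressions such as `next->data` compile normally. The helper name `follow_links` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */

struct node {
    int data;
    struct node *next;
};

/* Hypothetical helper: follows next once and twice over a two-node cycle. */
int follow_links(void)
{
    struct node b = { 20, NULL };
    struct node a = { 10, &b };
    b.next = &a;
    /* Offsets are fixed once the definition is complete. */
    assert(offsetof(struct node, data) == 0);
    assert(offsetof(struct node, next) >= sizeof(int));
    return a.next->data + a.next->next->data;   /* 20 + 10 */
}
```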
And how does the compiler calculate the sizeof(struct node)?
After parsing the structure definition, the compiler has all the information needed to compute the size of structure instances. For such a computation, it can rely on all pointers to structure types having the same representation and alignment requirements (and likewise all pointers to union types).
Well, I think the critical concept to understand this seemingly circular issue is, a pointer's size is not relevant to what type it points to. And the size is fixed on a specific machine. A pointer's type is only meaningful at compile-time for the compiler to generate instructions for pointer calculation.
A pointer's size can depend on the type it points to: on some architectures, different pointer types have different sizes. Yet the C Standard specifies some constraints:
Any pointer to an object type can be converted to void * and back.
All pointers to structure types have the same representation and alignment requirements as each other, and so do all pointers to union types.
On most modern architectures, and specifically on POSIX-compliant systems, all standard pointer types have the same size.
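The void * round trip mentioned above can be sketched like this; the helper name `round_trip_preserves_pointer` is made up for the example.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int data;
    struct node *next;
};

/* Hypothetical helper: converts a node pointer to void * and back. */
int round_trip_preserves_pointer(void)
{
    struct node n = { 5, NULL };
    void *generic = &n;              /* implicit conversion to void * */
    struct node *back = generic;     /* and back again, no cast needed in C */
    assert(back->data == 5);
    return back == &n;
}
```

Going through void * keeps the code portable, whereas storing the pointer in a 32-bit unsigned integer relies on an assumption about pointer size.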
http://i64.tinypic.com/34ffxx2.jpg
Please have a look at the image linked above.
In the book, it is stated that the 'next' member of structure variable 'n1' will point to the 'value' member of structure variable 'n2'.
1:) Won't it point to the complete 'n2' structure since 'n2' is a structure variable and the 'next' pointer is pointing to 'n2' and not particularly to its 'value' member.
2:) Also, it is stated that it is completely fine for a structure to contain another structure with same name and data type. How's that possible? I get it we can have as many structures in a parent structure, but how come a member has a data type of the parent structure ?
Oops. The book is right, but your understanding is wrong...
n1.next actually points to n2. It just happens that value is the first member of the struct, so it lies at the same address as the whole struct.
What is stated is that it is fine for one of a struct's members to point to another struct of the same type. But the struct cannot contain one. This is a compilation error:
struct entry {
int value;
struct entry next; // Ouch, tries to contain self: ERROR!
};
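For the first point, a short sketch may help: `n1.next` holds the address of the whole of `n2`, and that address is the same as the address of its first member. The helper name `next_points_to_whole_struct` and the values 100 and 200 are invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>

struct entry {
    int value;
    struct entry *next;
};

/* Hypothetical helper: returns 1 if n1.next designates the whole of n2 and
   that address coincides with the address of n2.value. */
int next_points_to_whole_struct(void)
{
    struct entry n2 = { 200, NULL };
    struct entry n1 = { 100, &n2 };
    assert(n1.next->value == 200);
    return n1.next == &n2 && (void *)n1.next == (void *)&n2.value;
}
```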
Consider this struct definition:
struct node {
int age;
struct node* next;
};
How is it possible to have struct node *next in struct node {...} when it is in the definition of struct node itself?
What would the value of sizeof(struct node*) be?
With:
struct node* to_add = (struct node*)malloc(sizeof(struct node));
Does to_add only have the address of the allocated memory?
Although I have used and implemented data structures in C/C++, I still have some basic doubts. Could you please help me understand them? I tried searching online, but the doubts remain.
Because you only need a declaration of a type to define a pointer to that type. The start of the struct definition declares the type so you're ok to have pointers to that type inside it.
sizeof(struct node*) yields the size of a pointer to struct node, not the size of the structure itself.
malloc returns a pointer to heap memory of the type's size but the memory is uninitialized.
The size is right but it's uninitialized.
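Since the memory from malloc is uninitialized, a typical pattern is to set every member right after allocating. A minimal sketch, with the hypothetical helper names `make_node` and `free_single_node`:

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int age;
    struct node *next;
};

/* Hypothetical helper: allocates one node and initializes every member.
   Returns NULL if the allocation fails; the caller frees the result. */
struct node *make_node(int age)
{
    struct node *to_add = malloc(sizeof *to_add);  /* contents indeterminate */
    if (to_add == NULL)
        return NULL;
    to_add->age = age;
    to_add->next = NULL;
    return to_add;
}

/* Hypothetical helper: releases a node that is not linked to a successor. */
void free_single_node(struct node *n)
{
    assert(n == NULL || n->next == NULL);
    free(n);
}
```

In C the cast on the result of malloc is not needed, and `sizeof *to_add` keeps the size in step with the pointer's type.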
Here's an MS Painted pictorial description of what such a node would look like:
Each node has a value (42, 69, or 613 in this example) and each holds a pointer to the next node, the last node holding a pointer to nothing.
This isn't an infinite recursive structure - a node doesn't have a node member, it has a node* member. And a pointer is just a pointer, another memory address. sizeof(node*) is the same as the size of any other pointer (whether 4 or 8 bytes). sizeof(node) is more interesting, but dependent on padding and other alignment issues. On a 32-bit machine, it'll probably be 8 bytes. On a 64-bit machine, it'll probably be 16 bytes (with 4 bytes of padding in the middle).
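The three-node list described above (42, 69, 613) could be built and walked like this. It is only a sketch; the names `sum_list` and `sum_example_list` are made up, and the nodes live on the stack to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

/* Hypothetical helper: walks the list until the NULL terminator and sums
   the values. */
int sum_list(const struct node *head)
{
    int total = 0;
    for (const struct node *p = head; p != NULL; p = p->next)
        total += p->value;
    return total;
}

/* Builds the three-node list from the description and sums it. */
int sum_example_list(void)
{
    struct node third  = { 613, NULL };   /* last node points to nothing */
    struct node second = { 69, &third };
    struct node first  = { 42, &second };
    assert(first.next->next == &third);
    return sum_list(&first);              /* 42 + 69 + 613 */
}
```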
A declaration is sufficient to create a pointer. E.g.
struct node;
struct node* nodePtr = NULL;
The size of a pointer does not depend on the full definition of the object. Hence you can use:
struct node;
struct node* nodePtr = NULL;
size_t s = sizeof(nodePtr);
Since creation of a pointer does not depend on the full definition of the struct, it is possible to use:
struct node { // At this point struct node is a declared type.
int age;
// Since struct node is a declared type, you can create a pointer
struct node* next;
};
It's a pointer. Pointer declarations are naturally opaque: to declare a pointer, the compiler only needs to know that the pointed-to type exists, not its size or layout. Opaque pointers are, in fact, a common way of building an encapsulated API in C/C++; libraries such as SDL use them.
It should normally be the same size as any other pointer. Pointers are usually implemented as simple memory addresses, but memory itself can be complicated depending on the architecture and the software running on it. On x86-64, I believe the size is one machine word, which is 8 bytes.
Depends on implementation or architecture. Normally (meaning x86 or x86-64), yes.
Yes.
One more thing: note that sizeof(struct node) and sizeof(age) + sizeof(next) can differ because of alignment padding inserted by the compiler. Compilers often pad structures to a multiple of the word size or some other size (e.g. the cache line size) so that memory access is faster on architectures that favor aligned data.
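To see the padding on your own platform, something along these lines should work. The helper name `padding_bytes` is hypothetical, and the result is implementation-dependent.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int age;
    struct node *next;
};

/* Hypothetical helper: number of bytes in struct node that are not
   occupied by its two members. */
size_t padding_bytes(void)
{
    size_t members = sizeof(int) + sizeof(struct node *);
    assert(sizeof(struct node) >= members);
    return sizeof(struct node) - members;
}
```

On a typical 64-bit system this will probably report 4 and on a typical 32-bit system 0, but neither value is guaranteed.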
I have a struct foo. Declaring a member of type foo* works:
typedef struct foo
{
struct foo* children[26];
} foo;
But if I try to declare a member of type foo I get an error:
typedef struct foo
{
struct foo children[26];
} foo;
This declaration gives me the error
definition of 'struct foo' is not complete until the closing '}'
A structure T cannot contain itself. How would you know its size? It would be impossible to do so, because the size of T would require you to know the size of T (because T contains another T). This turns into an infinite recursion.
You can have a pointer to T inside a structure T because the size of a pointer is not the same size as the pointed-to object: in this case, you would just store an address of memory where another T is stored - all the space you need to do that is basically the space you need to store a memory address where another T lives.
The structure TRIE cannot contain another structure TRIE, since that would be a never-ending recursion, but it may contain a pointer to another structure TRIE.
So the first one is correct:
typedef struct TRIE
{
bool is_endpoint;
bool is_initialized;
struct TRIE* children[26];
} TRIE;
An object of type T can't contain another non-static object of the same type. If it could, how would you find the size of that object? The size of a pointer to an object is always constant on a given system.
Check that currentptr is non-NULL before you access its fields (like is_endpoint).
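Putting both remarks together, here is a rough sketch of a trie built on an array of child pointers, with a NULL check before any field is read. It is simplified (no is_initialized member), the function names are invented, and it assumes lowercase words in a character set where 'a'..'z' are contiguous, as in ASCII.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct TRIE {
    bool is_endpoint;
    struct TRIE *children[26];   /* pointers, so the size is known */
} TRIE;

/* Hypothetical helper: allocates a node with every child set to NULL. */
TRIE *trie_new(void)
{
    TRIE *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->is_endpoint = false;
    for (int i = 0; i < 26; i++)
        t->children[i] = NULL;
    return t;
}

/* Inserts a word made of 'a'..'z'; returns false on a character outside
   that range or on allocation failure. */
bool trie_insert(TRIE *root, const char *word)
{
    assert(root != NULL);
    TRIE *cur = root;
    for (; *word != '\0'; word++) {
        if (*word < 'a' || *word > 'z')
            return false;
        int i = *word - 'a';
        if (cur->children[i] == NULL) {
            cur->children[i] = trie_new();
            if (cur->children[i] == NULL)
                return false;
        }
        cur = cur->children[i];
    }
    cur->is_endpoint = true;
    return true;
}

/* Returns true only if the exact word was inserted earlier. */
bool trie_contains(const TRIE *root, const char *word)
{
    const TRIE *cur = root;
    for (; cur != NULL && *word != '\0'; word++) {
        if (*word < 'a' || *word > 'z')
            return false;
        cur = cur->children[*word - 'a'];
    }
    return cur != NULL && cur->is_endpoint;   /* NULL check before field access */
}

/* Frees a node and, recursively, everything below it. */
void trie_free(TRIE *t)
{
    if (t == NULL)
        return;
    for (int i = 0; i < 26; i++)
        trie_free(t->children[i]);
    free(t);
}
```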
How does a C compiler (I'm using GCC) know what to do with the following?
struct node
{
int x;
struct node* next;
};
More precisely, if node has not been completely defined yet (we have not reached the closing curly brace), then how does the compiler know how big a struct ought to be?
While I realize that "pointing to" only requires an address, incrementing pointers does require the size of the data it points to.
The size of the struct is not important, as a pointer to the struct is being stored, not the struct itself.
As for incrementing pointers to the struct: that is done outside of the struct definition, by which point the type is complete, so again it is not a problem.
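A quick way to convince yourself, sketched with the made-up helper name `stride_in_bytes`: the pointer arithmetic happens after the closing brace, where sizeof(struct node) is known.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int x;
    struct node *next;
};   /* complete from here on, so sizeof and pointer arithmetic work */

/* Hypothetical helper: byte distance covered by incrementing a
   struct node pointer once. */
size_t stride_in_bytes(void)
{
    struct node nodes[2] = { { 1, NULL }, { 2, NULL } };
    struct node *p = nodes;
    struct node *q = p + 1;               /* advances by one whole struct */
    assert(q->x == 2);
    return (size_t)((char *)q - (char *)p);
}
```

The returned stride equals sizeof(struct node), whatever that happens to be on the platform.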