I am trying to learn how structs work in C. I am familiar with constructors in Java. Now, I have an example of creating a tree in C with structs.
struct a_tree_node{
int value;
struct a_tree_node *leftPTR, *rightPTR;
};
I am currently trying to visualize how this works, I am a little confused because this struct contains itself.
I am a little confused because this struct contains itself.
The struct doesn't contain itself, but rather two pointers to the same kind of structure. That's the key point to understand.
The struct containing itself would be nonsense and wouldn't compile because it's an infinitely recursive dependency.
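To make the "two pointers to the same kind of structure" idea concrete, here is a minimal sketch of how such nodes might be allocated and linked. The helper names new_node, tree_sum and free_tree are made up for this illustration and are not part of the question's code.

```c
#include <stdlib.h>

struct a_tree_node {
    int value;
    struct a_tree_node *leftPTR, *rightPTR;
};

/* Hypothetical helper: allocate one node with no children.
   Returns NULL if the allocation fails. */
struct a_tree_node *new_node(int value)
{
    struct a_tree_node *n = malloc(sizeof *n);
    if (n != NULL) {
        n->value = value;
        n->leftPTR = NULL;   /* NULL plays the role of Java's null */
        n->rightPTR = NULL;
    }
    return n;
}

/* Sum of all values reachable from root; an empty tree (NULL) sums to 0. */
int tree_sum(const struct a_tree_node *root)
{
    if (root == NULL)
        return 0;
    return root->value + tree_sum(root->leftPTR) + tree_sum(root->rightPTR);
}

/* Free a whole tree, children first, then the node itself. */
void free_tree(struct a_tree_node *root)
{
    if (root == NULL)
        return;
    free_tree(root->leftPTR);
    free_tree(root->rightPTR);
    free(root);
}
```

Each node is a separate object in memory; the parent only stores the addresses of its children, which is why the chain can end with NULL instead of nesting forever.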
I think your confusion comes from comparing a struct to a constructor in Java. The closest equivalent in Java would be a class:
class ATreeNode{
int value;
ATreeNode left;
ATreeNode right;
}
As the other answers have said, the left and right node in the struct are pointers - much like (but not quite the same as) references from Java.
The struct doesn't contain itself. It contains two pointers to its type. A very important distinction. Pointers are not of the type they point to, but can be dereferenced into what they point to at a later time.
It doesn't contain itself; it contains two pointers to the same definition. The * in front of leftPTR and rightPTR means they hold the memory locations where other a_tree_node's are stored.
The struct is defined in such a way that it forms a linked structure (here, a binary tree). Inside the struct you define two pointers to structs. So, the struct does not contain itself; rather, it contains two pointers to two other instances of the same struct type. It is even possible for a pointer to point to the struct itself.
When coming from Java, you already know the necessary concepts, but lack the rigor C enforces on the concepts of data and pointers. leftPTR is just like a variable of class type (like Object) in Java: it might be null, or it might point to another object.
It's just a linked list of int representing a binary tree.
It contains the address of a similar structure.
Take a tree node, for example:
a single tree node also stores the addresses of two other similar tree nodes.
The struct in the question contains pointers to struct a_tree_node.
The size of a pointer to a struct is fixed and known to the compiler (typically 4 or 8 bytes, depending on the platform), whatever the struct contains,
so it won't create any problem in computing the size of struct a_tree_node.
It will not be a nested struct... :) :)
struct a_tree_node{int value;struct a_tree_node *leftPTR, *rightPTR; };
This code works because the members are pointers to the structure, not objects of the structure type, and the size of a pointer does not depend on the contents of the type it points to. The actual sizes depend on the compiler and the target platform.
e.g. with gcc on a typical 32-bit target, sizeof(int) is 4 and sizeof(leftPTR) is also 4; on a typical 64-bit target, sizeof(leftPTR) is 8.
So there is no recursion when the size is computed: on such a 32-bit target sizeof(struct a_tree_node) is 12 (on a 64-bit target it is typically 24 once structure padding, which is compiler specific, is included).
struct a_tree_node{int value;struct a_tree_node left;};
This declaration leads to an error, because the compiler cannot compute the struct's size:
the computation would go into infinite recursion.
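The size argument above can be checked on any given compiler with a small sketch like the following. The function names print_sizes and min_node_size are invented for this example; the printed numbers depend on the compiler and target, so none are promised here.

```c
#include <stddef.h>
#include <stdio.h>

struct a_tree_node {
    int value;
    struct a_tree_node *leftPTR, *rightPTR;
};

/* Print the sizes that this particular compiler and target use. */
void print_sizes(void)
{
    printf("sizeof(int)                  = %zu\n", sizeof(int));
    printf("sizeof(struct a_tree_node *) = %zu\n", sizeof(struct a_tree_node *));
    printf("sizeof(struct a_tree_node)   = %zu\n", sizeof(struct a_tree_node));
}

/* Lower bound that holds on any target: one int plus two pointers.
   The real struct may be larger because of padding. */
size_t min_node_size(void)
{
    return sizeof(int) + 2 * sizeof(struct a_tree_node *);
}
```

On a typical 64-bit Linux system with gcc this tends to report 4, 8 and 24; on a typical 32-bit target, 4, 4 and 12. Either way the size is finite, because only addresses are stored.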
Hello, I am trying to learn and build data structures in C, and I want to store integers progressively in a stack.
my struct is like this:
typedef struct STACK_NODE_s *STACK_NODE;
typedef struct STACK_NODE_s{
STACK_NODE forward;
void *storage;
} STACK_NODE_t;
typedef struct L_STACK_s{
STACK_NODE top;
} L_STACK_t, *L_STACK;
In a while loop I want to read my chars and store them in integer form.
//assume that str is a proper string
//assume that we have a linked stack called LS
int i=0;
int tmp;
while(str[i]!='\0'){
tmp=str[i]-'0';
push(LS,(void *)&tmp);
i++;
}
I know this won't work properly, as we store the same variable's address over and over again.
Do I need to allocate an auxiliary array in order to store them one by one, or is there a better way to do this?
The answer must address two separate aspects of your question:
How to organize some collection of items, and where to get the memory from to do that.
First code snippet / Linked list format
The first code snippet is good the way it is.
It sets up a linked list, which has its pros and cons, but serves very well if you don't know the number of items in advance, if you want to be able to quickly remove or insert items somewhere in the middle of the list, and if you don't mind that looking up one certain entry inside the list costs you O(N) effort.
For a generic library-like implementation...
... void* is as good as it goes with ANSI C.
In C++, for example, you could make a template that leaves open the type that is stored in the list (or better yet, you would directly reuse the well-known standard library implementation, std::forward_list<int>).
Sadly, ANSI C doesn't have something comparable.
One solution is the one you picked, create int objects and hook their addresses into your list of void*.
Another solution for a generic library implementation is to use a preprocessor macro for the type, and to define this macro above a header file that holds the generic implementation. This tries to resemble the clean C++ solution, but with the preprocessor it is not typesafe, so this approach is far from beautiful and comes with several risks.
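As a rough illustration of the macro idea, here is a single-file variant: instead of defining a type macro above an included header, one macro expands to a typed node, a stack type and push/pop functions. The macro name DEFINE_STACK and the generated names are hypothetical, and this is only a sketch of the technique, not a recommended library design.

```c
#include <stdlib.h>

/* Hypothetical macro: generates a stack of element type T named NAME,
   with NAME_push and NAME_pop returning 0 on success and -1 on failure. */
#define DEFINE_STACK(T, NAME)                                        \
    struct NAME##_node { struct NAME##_node *forward; T storage; };  \
    struct NAME { struct NAME##_node *top; };                        \
    static int NAME##_push(struct NAME *s, T value)                  \
    {                                                                \
        struct NAME##_node *n = malloc(sizeof *n);                   \
        if (n == NULL)                                               \
            return -1;                                               \
        n->storage = value;                                          \
        n->forward = s->top;                                         \
        s->top = n;                                                  \
        return 0;                                                    \
    }                                                                \
    static int NAME##_pop(struct NAME *s, T *out)                    \
    {                                                                \
        struct NAME##_node *n = s->top;                              \
        if (n == NULL)                                               \
            return -1;                                               \
        *out = n->storage;                                           \
        s->top = n->forward;                                         \
        free(n);                                                     \
        return 0;                                                    \
    }

/* Instantiate a stack of int: struct int_stack, int_stack_push, int_stack_pop. */
DEFINE_STACK(int, int_stack)
```

Because the value is stored inside the node, no separate allocation per int is needed, but errors inside the macro body are hard to read and every instantiation duplicates the code.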
Second code snippet / Memory allocation
Creating the list with void* instead of int (or whatever non-pointer type) requires you to allocate further memory beside the list.
I. e., it is not only that you have to allocate every list item (= variable of type STACK_NODE_t) but also the actual entry value (e. g., *(int*)(LS->storage)).
This means you have to allocate/deallocate the data in some other way that outlives the stack.
On most systems, you can use malloc/free for that, and you only have to take into account the size of the heap available for malloc and the time de-/allocating takes.
If the list has to meet real-time requirements, or if you are on an embedded system, you may not have malloc or you may not be allowed to use it.
Then you have to allocate and implement your own heap (= memory pool of storage items) for your list.
How to implement such a memory pool with the desired properties is a separate question that would take us too far here.
In any case, you must not use the pointer to a stack variable (like a local variable inside a function) because the memory "behind" that variable will not be reserved for this purpose once the function exits, and the memory may be used for something different in the meantime.
This is, however, what the second code snippet does apparently.
As you noticed yourself, taking this path...
we store the same variable's address over and over again.
Reusing the memory position for another entry of the same list is an extreme case of the risk explained above.
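One way to avoid reusing a single address, assuming malloc is available, is to give every pushed value its own heap-allocated int. The question does not show push, so the version below is an assumed implementation; push_digits and pop_int are hypothetical helpers added for this sketch.

```c
#include <stdlib.h>

typedef struct STACK_NODE_s *STACK_NODE;
typedef struct STACK_NODE_s {
    STACK_NODE forward;
    void *storage;
} STACK_NODE_t;
typedef struct L_STACK_s {
    STACK_NODE top;
} L_STACK_t, *L_STACK;

/* Assumed behaviour of push: link a new node holding `data` on top.
   Returns 0 on success, -1 if the node cannot be allocated. */
int push(L_STACK ls, void *data)
{
    STACK_NODE n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->storage = data;
    n->forward = ls->top;
    ls->top = n;
    return 0;
}

/* Push each character of str (assumed to be a decimal digit) as its own
   heap-allocated int. Returns the number pushed, or -1 on failure. */
int push_digits(L_STACK ls, const char *str)
{
    int count = 0;
    for (size_t i = 0; str[i] != '\0'; i++) {
        int *digit = malloc(sizeof *digit);   /* one int per entry */
        if (digit == NULL)
            return -1;
        *digit = str[i] - '0';
        if (push(ls, digit) != 0) {
            free(digit);
            return -1;
        }
        count++;
    }
    return count;
}

/* Pop the top value into *out and release both the int and its node.
   Returns 0 on success, -1 if the stack is empty. */
int pop_int(L_STACK ls, int *out)
{
    STACK_NODE n = ls->top;
    if (n == NULL)
        return -1;
    *out = *(int *)n->storage;
    ls->top = n->forward;
    free(n->storage);
    free(n);
    return 0;
}
```

Each stored int now lives until it is popped, independent of any local variable in the calling function.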
I solved the problem using an auxiliary array, as I anticipated. If someone comes up with a better solution, it's more than welcome.
Ran into a design problem when using memcpy and building a generic HashTable in c. The HashTable maps unsigned int keys to void * data that I memcpy over.
// Random example
void foo() {
// Suppose `a` is a struct that contains LinkedLists, char arrays, etc
// within it.
struct *a = malloc(sizeof(a));
HashTable ht = ht_create(sizeof(a));
// Insert the (key, value) pair (0, a) into the hash table ht
ht_insert(ht, 0, a);
// Prevent memory leak
destroy_struct(a);
// Do stuff...
// ... eventually destroy ht
ht_destroy(ht);
}
Now, given struct a has LinkedLists and pointers within it, and the HashTable is using memcpy, my understanding is that it copies over shallow copies of these pointers. Thus, ht_insert mallocs space for a new entry, shallowly copies over data from a, and inserts the new entry into its table.
Consequently, unless I free struct a completely with some function destroy_struct, I am leaking memory. However, given I'm shallowly copying data in ht_insert, when I call destroy_struct(a), I will have accidentally freed the data pointed to within the hash table's entry as well!
Is the logic above correct, and if so, should I use some recursive memcpy function that makes sure to deep copy all data from struct a to the HashTable?
Firstly, if your code doesn't reproduce the problem you are explaining, you shouldn't include it. The problem your code produces is compiler errors. This doesn't help your question, does it?
Now, given struct a has LinkedLists and pointers within it, and the HashTable is using memcpy, my understanding is that it copies over shallow copies of these pointers.
If you are simply copying the internal representation of a struct whatever * into the internal representation of a void *, then you are asking for trouble. There is no guarantee that the two representations are identical. It's possible that one pointer type might be larger than the other, that they use different endianness (if they're implemented as typical quasi-integers) or other internal differences might exist. You should convert one pointer to the other type, and then you could simply assign it... In fact, because one of the types is void * that conversion will happen implicitly when you assign.
Consequently, unless I free struct a completely with some function destroy_struct, I am leaking memory.
From what you have described, you should only call free on that pointer value once (and only once) you are done with it, and your program no longer has any use for it (e.g. after you have removed it from the hashtable). This goes for all non-null pointers that are returned by malloc, realloc or calloc. To clarify: if x and y store the same pointer returned by one of those functions, free should only be called ONCE on ONE OF THEM because they contain the same value.
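The "free once, after removal" rule can be pictured with a deliberately tiny sketch. The one-slot container below is hypothetical and stands in for a table that stores the caller's pointer without copying; it is not the HashTable from the question.

```c
#include <stdlib.h>

struct item { int id; };

/* Hypothetical one-slot "table" that stores the caller's pointer as-is. */
struct slot { void *data; };

/* Store the pointer; the slot and the caller now hold the same value. */
void slot_put(struct slot *s, void *data)
{
    s->data = data;
}

/* Hand the stored pointer back to the caller and forget it,
   so that exactly one owner is left to call free(). */
void *slot_take(struct slot *s)
{
    void *data = s->data;
    s->data = NULL;
    return data;
}
```

A caller would malloc an item, slot_put it, later slot_take it, and only then call free on the returned pointer, once. Freeing the original pointer while the slot still holds the same value would leave the slot with a dangling pointer.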
Is the logic above correct, and if so, should I use some recursive memcpy function that makes sure to deep copy all data from struct a to the HashTable?
I highly recommend breaking this question up into two or more separate questions, because it's double-barreled. I could simply answer "yes" (or "no"). Would that give you any meaningful information?
This brings me back to what I first wrote. I can only guide you based on what you've written here, which might not be reflective of the code that you use (especially given the influences of the erroneous code you've given). In order to guide you better, I would need to see all of the gaps filled in. I would need to see a testcase that creates a hashtable, inserts into the hashtable, uses the hashtable, removes from the hashtable and cleans up the hashtable to determine whether or not your operations are leaking anywhere... but most importantly, this testcase would need to be COMPILABLE! Otherwise it can't do any of those things, because it can't compile.
Suppose I have a
struct A{
char *name;
unsigned long *trunks;
bool value;
const struct smap *smap;
...
...
}
This struct has all types of data structures, and I do not have direct exposure to the struct apart from a struct A *, which is a pointer to it.
You have to copy every element in the struct and all referenced objects to newly allocated structures the same way (recursively).
If the struct has only few pointers, you might use memcpy to copy all elements as-is first and then copy all referenced (through pointers) objects in a second pass. If there are many pointers, it might be more efficient to copy each field by hand.
Referenced objects must be treated identically (by recursion; iteration would be pretty nasty). However, for this, you need to know the structure of these types. Alternatively, there might be copy functions for all these objects in their implementation file, thus keeping them opaque. If neither the structure nor a copy function is available, you are somewhat lost, as there is no way to detect the pointers without that.
A problem will arise if there are circular references. Then things get even more complicated.
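The two-pass approach described above (memcpy first, then replace each pointer with a fresh copy of what it refers to) might look roughly like this. struct record is a made-up, fully visible type for illustration, not the struct A from the question, and the sketch assumes name is never NULL and that there are no circular references.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct with two owned pointers. */
struct record {
    char *name;              /* owned, NUL-terminated, assumed non-NULL */
    unsigned long *trunks;   /* owned array of n_trunks elements, may be NULL */
    size_t n_trunks;
    int value;
};

/* Release a record and everything it owns; free(NULL) is a no-op. */
void record_free(struct record *r)
{
    if (r == NULL)
        return;
    free(r->name);
    free(r->trunks);
    free(r);
}

/* Deep copy: returns a new independent record, or NULL on failure. */
struct record *record_clone(const struct record *src)
{
    struct record *dst = malloc(sizeof *dst);
    if (dst == NULL)
        return NULL;

    /* First pass: copy every field as-is (pointers are still shared). */
    memcpy(dst, src, sizeof *dst);
    dst->name = NULL;
    dst->trunks = NULL;

    /* Second pass: replace each pointer with a copy of its target. */
    size_t name_len = strlen(src->name) + 1;
    dst->name = malloc(name_len);
    if (dst->name == NULL)
        goto fail;
    memcpy(dst->name, src->name, name_len);

    if (src->n_trunks > 0) {
        size_t bytes = src->n_trunks * sizeof *src->trunks;
        dst->trunks = malloc(bytes);
        if (dst->trunks == NULL)
            goto fail;
        memcpy(dst->trunks, src->trunks, bytes);
    }
    return dst;

fail:
    record_free(dst);
    return NULL;
}
```

If a member pointed to another struct with its own pointers, that member would need its own clone function, called from here in the same way.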
Recently, I am writing some Abstract Data Types (ADTs) for Queue in C Language.
But I found a problem for ADT in C:
How can I pass the type of data in C?
For example, in C++ I can use the template to pass type:
std::queue< struct mySt > myQ;
That template will pass "struct mySt" type to create myQ.
But how to do this in C?
All I know is to create a generic pointer pushing the data of "struct mySt" like below:
void enq(void *dataPtr);
and pop it using casting like below:
struct mySt *a = (struct mySt *) deq();
That seems to work in C, but how can I perform a "deep copy"? I mean creating new memory space for the content that dataPtr points to, rather than just pointing to it.
Except using Macro or Function pointer to solve this, is there other better way to solve it?
There isn't a simple way to do it in C. The C++ code relies on constructors (the copy constructor) to implement the copy, which is necessary because you can't tell a priori whether there are pointers in the class or allocated memory that need to be altered when an independent copy of the structure is made.
If you are going to copy the structures in a C ADT, then at a bare minimum you will need to specify the size of the structure to be copied as part of the interface. However, you really need a copy function that knows how to deal with the pointers in a copy of the structure.
Passing pointers around is simpler; it is clear that the object that is pointed at continues to exist unmodified by the fact that its pointer is now stored in a list.
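A sketch of the interface described above, where the caller supplies the element size and, optionally, a copy function, could look like this. All names here (copy_fn, struct queue, queue_init, and the extra queue parameter on enq and deq) are assumptions made for the example, and elem_size is assumed to be greater than zero.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical copy callback: fill *dst from *src, return 0 on success. */
typedef int (*copy_fn)(void *dst, const void *src);

struct qnode {
    struct qnode *next;
    void *data;             /* queue-owned copy of one element */
};

struct queue {
    struct qnode *head, *tail;
    size_t elem_size;       /* bytes per element, supplied by the caller */
    copy_fn copy;           /* NULL means plain memcpy (shallow copy) */
};

void queue_init(struct queue *q, size_t elem_size, copy_fn copy)
{
    q->head = q->tail = NULL;
    q->elem_size = elem_size;
    q->copy = copy;
}

/* Append a copy of *elem. Returns 0 on success, -1 on failure. */
int enq(struct queue *q, const void *elem)
{
    struct qnode *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->data = malloc(q->elem_size);
    if (n->data == NULL) {
        free(n);
        return -1;
    }
    if (q->copy != NULL) {
        if (q->copy(n->data, elem) != 0) {
            free(n->data);
            free(n);
            return -1;
        }
    } else {
        memcpy(n->data, elem, q->elem_size);
    }
    n->next = NULL;
    if (q->tail != NULL)
        q->tail->next = n;
    else
        q->head = n;
    q->tail = n;
    return 0;
}

/* Remove the oldest element. The caller owns the returned block and
   must free() it. Returns NULL if the queue is empty. */
void *deq(struct queue *q)
{
    struct qnode *n = q->head;
    if (n == NULL)
        return NULL;
    void *data = n->data;
    q->head = n->next;
    if (q->head == NULL)
        q->tail = NULL;
    free(n);
    return data;
}
```

With copy set to NULL this is only safe for structs without owned pointers; a struct that owns memory would need a copy_fn that duplicates what its pointers refer to.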
Please help me to figure out a very basic confusion as follows,
struct node {
struct node *next; // no compile error
};
is ok, but the following gives a compile error (unknown type). I know it is wrong, but I am unable to figure out a clear reason.
struct node {
struct node next; // compile error, unknown type..why?
};
C allows you to have pointers to incomplete types.
struct node *next is a forward reference to struct node, but since you're only declaring a pointer to that type, the compiler doesn't mind. This is explicitly allowed, and it enables building structures that refer to each other.
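As an illustration of "structures that refer to each other", the following sketch declares two struct types that each hold a pointer to the other. The type and function names are invented for this example.

```c
#include <stddef.h>

struct employee;   /* incomplete type: only the name is known so far */

struct department {
    const char *name;
    struct employee *manager;   /* fine: pointer to an incomplete type */
};

struct employee {               /* the type is completed here */
    const char *name;
    struct department *dept;
    struct employee *next;      /* pointer to its own type */
};

/* Count the employees in a NULL-terminated chain. */
size_t chain_length(const struct employee *e)
{
    size_t n = 0;
    while (e != NULL) {
        n++;
        e = e->next;
    }
    return n;
}
```

Replacing any of these pointer members with a full struct member of a not-yet-complete type would be rejected by the compiler, for the reason given above.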
You don't need a complete type to declare a pointer. With a member struct node next;, however, you'd get an error, as it is a never-ending recursion.
Syntactically, it's because a struct type is incomplete until the closing }. You can use an incomplete type to declare a pointer to it, but not an object of the type itself.
Furthermore, it doesn't make sense to define a struct that has itself in it: its size would be unknown.
The struct is not fully defined at that point. If it's a pointer, the compiler does know how big a pointer to a struct is.
And if you think about what you are trying to do, a recursive data type, it gets a tad odd: it would be an infinite recursion.
The main reason is that node is not a complete type until you finish defining node and close it with } and so the compiler does not have enough information but you are allowed to have a pointer to an incomplete type.
The more basic reason is if node contained a node it would require infinite space since the self reference would never end. A node contains a node contains a node ad infinitum.
The draft C99 standard, section 6.2.5 Types, says:
[...] incomplete types (types that describe objects but lack information needed to determine their sizes)
and also says (emphasis mine):
A pointer type may be derived from a function type, an object type, or an incomplete type, called the referenced type.[...]
It would take an infinite amount of memory to store this struct.
The struct has to be large enough to store all of its members. However, one of its members is a struct of the same type, so it needs enough memory for two sets of members. But wait, that inner struct contains another instance of the struct inside itself, so we have three sets of members. And that inner struct contains an inner struct, which itself contains an inner struct, and so on to infinity.
It is then logically impossible for a struct to contain itself.