Binary Trees in C

Guys, I am new to data structures. Most of the time, in books and references, I see this structure for a binary tree:
struct btree {
int data;
struct btree *left;
struct btree *right;
};
But for the tree in the image above, it would look like this:
struct btree
{
int data;
struct btree *left;
struct btree *right;
struct btree *parent;
};
So my question is: is it up to the programmer to choose the structure of a tree node (e.g. also including a pointer to the parent), or can a node have only two pointers, one to the left child and the other to the right child?

It is up to you whether you include parent pointers. They require more work to maintain, but some tree operations (like removing a node given just that node rather than its parent) become much easier.
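To make the trade-off concrete, here is a minimal sketch, assuming a plain unbalanced binary search tree. The function names btree_insert and btree_successor are made up for this illustration. The insert shows the extra bookkeeping (one more assignment per link), and the successor lookup shows an operation that works from a node alone, with no access to the root.

```c
#include <assert.h>
#include <stddef.h>

struct btree {
    int data;
    struct btree *left;
    struct btree *right;
    struct btree *parent;   /* NULL for the root */
};

/* Link an already-allocated node into the tree and return the root.
   Keeping node->parent up to date is the extra maintenance work. */
struct btree *btree_insert(struct btree *root, struct btree *node)
{
    assert(node != NULL);
    node->left = node->right = node->parent = NULL;
    if (root == NULL)
        return node;

    struct btree *cur = root;
    for (;;) {
        struct btree **next =
            (node->data < cur->data) ? &cur->left : &cur->right;
        if (*next == NULL) {
            *next = node;
            node->parent = cur;
            return root;
        }
        cur = *next;
    }
}

/* In-order successor of node, or NULL if node holds the largest key.
   Only the node itself is needed because we can climb via parent. */
struct btree *btree_successor(struct btree *node)
{
    assert(node != NULL);
    if (node->right != NULL) {
        node = node->right;
        while (node->left != NULL)
            node = node->left;
        return node;
    }
    struct btree *up = node->parent;
    while (up != NULL && node == up->right) {
        node = up;
        up = up->parent;
    }
    return up;
}
```

Without the parent field, btree_successor would need either the root or an explicit stack of ancestors.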

It's up to you. They're generally not required for most tree operations, but they can speed up some of them, at the expense of the memory required to store the extra link. On the few occasions I've had to write a binary tree myself I've never used a parent link.

Absent something specifically saying otherwise, a binary tree doesn't normally contain pointers to parents as shown in that diagram. The "pointer to parent" is normally stored implicitly on the stack as you traverse the tree recursively.
There is a rather more common variant, the "threaded binary tree". In this case, the child pointers that would be NULL in a normal binary tree (for example, both pointers of a leaf node) instead point to the next node in order and the previous node in order. This lets you walk through the tree in forward or reverse order without recursion.
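As a rough sketch of the threading idea, assuming only the right pointers are threaded (so only forward traversal is supported) and that a flag marks which right pointers are threads. The names tnode, tnode_insert, tnode_first and tnode_next are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct tnode {
    int data;
    struct tnode *left;
    struct tnode *right;     /* right child, or a thread (see flag) */
    bool right_is_thread;    /* true: right is the in-order successor
                                (or NULL for the last node) */
};

/* Insert an already-allocated node into a right-threaded BST. */
void tnode_insert(struct tnode **root, struct tnode *node)
{
    assert(root != NULL && node != NULL);
    node->left = NULL;
    node->right = NULL;
    node->right_is_thread = true;
    if (*root == NULL) {
        *root = node;
        return;
    }
    struct tnode *cur = *root;
    for (;;) {
        if (node->data < cur->data) {
            if (cur->left == NULL) {
                node->right = cur;          /* successor is the parent */
                cur->left = node;
                return;
            }
            cur = cur->left;
        } else {
            if (cur->right_is_thread) {
                node->right = cur->right;   /* inherit the old thread */
                cur->right = node;
                cur->right_is_thread = false;
                return;
            }
            cur = cur->right;
        }
    }
}

static struct tnode *leftmost(struct tnode *n)
{
    while (n != NULL && n->left != NULL)
        n = n->left;
    return n;
}

/* First node in order, or NULL for an empty tree. */
struct tnode *tnode_first(struct tnode *root)
{
    return leftmost(root);
}

/* Next node in order, using no recursion and no stack. */
struct tnode *tnode_next(struct tnode *n)
{
    assert(n != NULL);
    if (n->right_is_thread)
        return n->right;
    return leftmost(n->right);
}
```

A full walk is then a simple loop: for (p = tnode_first(root); p != NULL; p = tnode_next(p)).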

The model without the parent node is better suited for recursive divide-and-conquer-style algorithms [for example, a pre-order or post-order traversal]. For some more advanced operations, a parent pointer is useful.
What are you trying to do?
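For the recursive case mentioned above, a small sketch of a pre-order traversal on the two-pointer node. The function name btree_preorder and the output-array interface are assumptions made for this example. Note that no parent pointer is needed: each return from a recursive call lands back in the parent's call.

```c
#include <assert.h>
#include <stddef.h>

struct btree {
    int data;
    struct btree *left;
    struct btree *right;
};

/* Write the keys in pre-order into out[], starting at position n,
   and return the new number of stored keys. The caller must make
   sure out[] is large enough for the whole tree. */
size_t btree_preorder(const struct btree *node, int *out, size_t n)
{
    assert(out != NULL);
    if (node == NULL)
        return n;
    out[n++] = node->data;                      /* visit the node first */
    n = btree_preorder(node->left, out, n);     /* then the left subtree */
    return btree_preorder(node->right, out, n); /* then the right subtree */
}
```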

Related

Representation of Tree Data Structures

I've studied that Data Structures can be classified into Linear (arrays, stacks, queues and linked-lists) and Non-Linear (trees and graphs) data structures.
[Image: flowchart of data structures. Source: Medium]
Now, my question is that if linked-lists are "linear" data structures, then why are they used to implement trees, which are non-linear? Since trees have nodes that consist of the keys (or values) and pointers to more child nodes, aren't trees just more complex linked-lists?
If they are, then how is the statement "linked-lists are linear data structures" justified?
This is how I actually got this question:
I am currently learning data structures in C. So in order to implement the Tree data structure, I use structs, which consist of keys and pointers to the left and right children of each node.
typedef struct Node
{
int key;
struct Node *left;
struct Node *right;
} Node;
Then I wondered whether I'm essentially implementing a linked list (since linked lists are also built from structs in C).
You are confusing different things.
A tree is a logical data structure: a special case of a directed graph without cycles.
A tree data structure can be implemented in various ways: an array that stores the indices of each node's children, actual pointers to the children's memory (as in your example), and others. Your struct is one specific implementation of a tree data structure, but there are other implementations.
A 'linked list' typically refers to a specific implementation in which elements point to each other's memory. There is the singly linked list, where each element points to the next one, and the doubly linked list, where each element points to both the previous and the next one.
If you implement trees with pointers and each node has only one child, then this is a special case where the implementation resembles a linked list.
Note that a linked list may have a loop, while a tree never has one, because then it becomes a general graph (by definition).
Also, it is not common for a tree node to point back to its parent, only to its children, while linked-list nodes sometimes point back to the previous element.
So a linked list is an implementation of the logical data structure called a 'list'. This implementation uses pointers.
A list can be implemented in other ways (arrays, histograms, hash tables that count the number of appearances of each element, skip lists for O(log(n)) search, etc.).
A tree is a logical data structure which is commonly implemented using pointers, but has other implementations as well.
When a tree is implemented using pointers, each single branch of the tree resembles a linked list, so this is the source of your confusion.
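To illustrate the point that the same logical tree can be implemented without pointers, here is a minimal sketch of the "array which stores indices of children" variant. The names inode, NO_CHILD and itree_find are made up for this example, and the tree is assumed to be a binary search tree.

```c
#include <assert.h>
#include <stddef.h>

#define NO_CHILD (-1)

/* One tree node stored in an array; children are array indices. */
struct inode {
    int key;
    int left;    /* index of the left child, or NO_CHILD */
    int right;   /* index of the right child, or NO_CHILD */
};

/* Return the index of the node holding key, or NO_CHILD. */
int itree_find(const struct inode *nodes, int root, int key)
{
    assert(nodes != NULL || root == NO_CHILD);
    int cur = root;
    while (cur != NO_CHILD && nodes[cur].key != key)
        cur = (key < nodes[cur].key) ? nodes[cur].left : nodes[cur].right;
    return cur;
}
```

The search logic is the same as with the pointer-based Node struct; only the way a node names its children differs.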

How to serialize a Graph-like AVL tree to disk?

I know it sounds weird but this is it. I have a data structure which is basically a modified AVL tree. Each node of the structure has a left child and a right child. These core pointers (left & right) will be used to link all data nodes together and to keep the data structure balanced (AVL rotations) to improve searching. But those are not the only pointers in the structure; there are others that can point to any random node in the tree (which creates the graph-like analogy).
The tree is built at runtime through user interaction (CLI). The user is also responsible for creating all the different links between the nodes.
An example of such a data structure could be (Didn't start coding yet, it's only prototyping):
struct node {
struct node *left;
struct node *right;
struct node *links[NUM]; // Points to any random node in the tree.
/* Probably many other fields here that could be either pointers
or other data types */
};
Now, everything is in RAM. Once the user wants to exit, all the data nodes (The whole tree) should be saved to a file in binary mode (For later reloading, so one must take this in consideration).
It's basically easy to save the AVL tree using one of the recursive tree traversal algorithms (in this case the question is a duplicate, because solutions already exist on SO). But, in my case, I have to preserve all the arbitrarily created links between the nodes.
What would be the most efficient way in time and space?
You could dump your data structure as is (including the pointer values) and, in the binary blob of each node, also store its address. When reloading the data structure, you dynamically allocate your nodes and store their new addresses in a hash table whose keys are the old addresses. In a final pass, you iterate over the hash table sequentially (not using the old addresses as keys), retrieve the new address of each node, and update its pointer fields from old addresses to new addresses, again using the hash table as a translation table (with the old addresses as keys).
Choose a unique index number for each node, and use it to serialize the links.
This will likely take two traversal passes -- one to set the index number, and one to do the serialization. Add an integer field to your node to hold the index number; you shouldn't need any other memory overhead.
Alternately, if you manage your tree nodes by storing them in an array or std::vector, you will already have an index number handy, and you won't need an additional index field. Also, you can store all your links as indices instead of pointers, so you can just serialize your container as-is.
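The two-pass index approach could look roughly like the sketch below. Everything beyond left, right and links is an assumption for illustration: NUM is set to 2, value stands in for the real payload, and the names record, save_tree and load_tree are invented. The records are written in native byte order, so the file is only meant to be read back on the same kind of machine.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define NUM 2          /* assumed size of links[] for this sketch */
#define NO_NODE (-1)

struct node {
    struct node *left;
    struct node *right;
    struct node *links[NUM];
    int32_t index;     /* added field, filled in by pass 1 */
    int32_t value;     /* hypothetical payload */
};

/* On-disk form of a node: every pointer becomes an index. */
struct record {
    int32_t value, left, right;
    int32_t links[NUM];
};

/* Pass 1: number the nodes in pre-order; returns the next free index. */
static int32_t assign_indices(struct node *n, int32_t next)
{
    if (n == NULL)
        return next;
    n->index = next++;
    next = assign_indices(n->left, next);
    return assign_indices(n->right, next);
}

static int32_t index_of(const struct node *n)
{
    return n ? n->index : NO_NODE;
}

/* Pass 2: write records in the same order, so record i is node i. */
static int write_nodes(const struct node *n, FILE *fp)
{
    if (n == NULL)
        return 0;
    struct record r = { n->value, index_of(n->left), index_of(n->right), {0} };
    for (int i = 0; i < NUM; i++)
        r.links[i] = index_of(n->links[i]);
    if (fwrite(&r, sizeof r, 1, fp) != 1)
        return -1;
    if (write_nodes(n->left, fp) != 0)
        return -1;
    return write_nodes(n->right, fp);
}

/* Returns the number of nodes written, or -1 on I/O error. */
int32_t save_tree(struct node *root, FILE *fp)
{
    assert(fp != NULL);
    int32_t count = assign_indices(root, 0);
    if (fwrite(&count, sizeof count, 1, fp) != 1)
        return -1;
    return write_nodes(root, fp) == 0 ? count : -1;
}

static struct node *at(struct node *nodes, int32_t count, int32_t i)
{
    return (i >= 0 && i < count) ? &nodes[i] : NULL;
}

/* Rebuild the tree inside a caller-provided array; returns the root. */
struct node *load_tree(struct node *nodes, int32_t capacity, FILE *fp)
{
    assert(nodes != NULL && fp != NULL);
    int32_t count;
    if (fread(&count, sizeof count, 1, fp) != 1 || count < 0 || count > capacity)
        return NULL;
    for (int32_t i = 0; i < count; i++) {
        struct record r;
        if (fread(&r, sizeof r, 1, fp) != 1)
            return NULL;
        nodes[i].index = i;
        nodes[i].value = r.value;
        nodes[i].left = at(nodes, count, r.left);
        nodes[i].right = at(nodes, count, r.right);
        for (int k = 0; k < NUM; k++)
            nodes[i].links[k] = at(nodes, count, r.links[k]);
    }
    return count > 0 ? &nodes[0] : NULL;   /* pre-order puts the root first */
}
```

Because the nodes are reloaded into one array, an index can be turned into a pointer even before the target record has been read, so the arbitrary links need no extra fix-up pass.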

Tree number of levels

I wrote a tree in which each node has a list of its children. So my questions are: How can I compute the number of levels of my tree? Can anyone give me some documentation about this? Thank you :).
There are different ways to solve this. One solution is to count nodes with a counter variable, incrementing the counter until a leaf node is reached. But you have to take care of:
1. Longest chain
2. Redundancy in counting
Second, if each node has a list of children, then count the nodes by following this list, starting from the root node.
A very appropriate way would be to define a member named node_level in the struct node
typedef struct node {
/* ... other members ... */
int node_level;
} NODE;
and initialize it to 1 when the root or any other node is created. Then update its value on each insertion into the tree.
By doing so, you can read the level of any subtree whenever you need it. Also note that each newly inserted node has level 1, and its ancestors have greater levels.
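If you would rather compute the number of levels on demand than store it in each node, a recursive walk that keeps track of the longest chain is enough. This is only a sketch: it assumes the child list is a singly linked list (first_child / next_sibling), and the names mnode, mnode_add_child and tree_levels are made up.

```c
#include <assert.h>
#include <stddef.h>

/* A node whose children form a singly linked list. */
struct mnode {
    struct mnode *first_child;
    struct mnode *next_sibling;
};

/* Put child at the front of parent's child list. */
void mnode_add_child(struct mnode *parent, struct mnode *child)
{
    assert(parent != NULL && child != NULL);
    child->next_sibling = parent->first_child;
    parent->first_child = child;
}

/* Number of levels in the subtree rooted at node: 0 for an empty
   tree, 1 for a single node, otherwise 1 + the deepest child. */
int tree_levels(const struct mnode *node)
{
    if (node == NULL)
        return 0;
    int deepest = 0;
    for (const struct mnode *c = node->first_child; c != NULL; c = c->next_sibling) {
        int d = tree_levels(c);
        if (d > deepest)
            deepest = d;
    }
    return 1 + deepest;
}
```

Each node is visited exactly once, so there is no redundant counting, and taking the maximum over the children handles the longest chain.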

How do I find the structure that is pointing to the current structure in C?

I have an array of structures called box_pool. In this array, each structure points to two other structures in order to create some organisation (I am creating a binary search tree). But what if I have a structure that I want to delete, and in order to do that I need to redirect the pointer of the previous structure to the next structure? How could I do that?
You need to walk the structure starting from the root; there is no other simple way unless you store bidirectional links, so that you have
struct node
{
struct node* left;
struct node* right;
struct node* parent;
};
This will help find where the node is stored, but it will complicate things, as you will have to keep parent updated, and it wastes some space. It's a typical trade-off.
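For the first option (walking from the root), a possible sketch is below. It assumes the structures form a binary search tree ordered by a key field, with equal keys stored to the right. The struct box layout and the name find_link are guesses for illustration, not the asker's actual code.

```c
#include <assert.h>
#include <stddef.h>

struct box {
    int key;
    struct box *left;
    struct box *right;
};

/* Walk down from the root and return the address of the pointer that
   points at target (either the root pointer itself or a left/right
   field of target's parent). Returns NULL if target is not in the tree. */
struct box **find_link(struct box **root, const struct box *target)
{
    assert(root != NULL && target != NULL);
    struct box **link = root;
    while (*link != NULL && *link != target)
        link = (target->key < (*link)->key) ? &(*link)->left
                                            : &(*link)->right;
    return (*link == target) ? link : NULL;
}
```

Once you have the link, redirecting "the pointer of the previous structure" is a single assignment such as *link = replacement, and it works the same way whether the node is the root or a child.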

Is there strong reason not to include multiple node pointers in a node to use in more than one data structure?

Take for example the assignment I'm working on. We're to use a binary search tree for one piece of a set of data and then a linked list for another piece in the set. The suggested method by the professor was:
struct treeNode
{
data * item;
treeNode *left, *right;
};
struct listNode
{
data * item;
listNode *next, *prev;
};
class collection
{
public:
........
};
Where data is a class containing the particulars of each record. Obviously as it's set up, a treeNode can't exist in the linked list.
Wouldn't it be much simpler to:
struct node
{
data * item;
node *listNext, *listPrev, *treeLeft, *treeRight;
};
then we can declare:
node * listHead;
node * treeRoot;
and include both insertion algorithms into the class.
Is there something I'm missing?
Actually, the data items are to be inserted into both lists. The (mundane) purpose of the assignment is to sort the data set on two different elements in the set.
So with that said, wouldn't I be saving memory? Combining the 2 nodes I end up with 5 pointers; if I left them separate I'd be using 6. Also, I really only have one group of data this way. If I had 250 data items to keep track of, I'd have one group of 1250 pointers instead of 2 lists of 750 each. Maybe I'm misunderstanding what actually gets allocated with pointer calls.
You can do that, but you are wasting memory with the extra pointers. Also, it tends to be more confusing to mix types like that. Am I correct in assuming that the data is either put into the list or put into the tree, but not inserted into both? There's really not much reason to have them both use the same structure if they are different data types anyway. If you are inserting the same data into both types, you could potentially switch from traversing the tree to traversing the list if you had any use for such an action.
Since you're inserting the data into both lists, it would save memory to use your composite node structure. I would insert into the binary tree first, then insert the allocated node into the linked list. You wouldn't really end up with a pure linked list or a pure binary search tree, but it could be traversed like either one.
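A rough sketch of that composite-node idea, written in plain C rather than the C++ of the question. The data fields id and score, the choice to order the tree by id and the list by score, and the name collection_insert are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

struct data {            /* hypothetical record with two sort keys */
    int id;
    int score;
};

struct node {
    struct data *item;
    struct node *listNext, *listPrev, *treeLeft, *treeRight;
};

struct collection {
    struct node *listHead;   /* list kept sorted by score */
    struct node *treeRoot;   /* binary search tree ordered by id */
};

/* Insert one allocated node into both structures. */
void collection_insert(struct collection *c, struct node *n)
{
    assert(c != NULL && n != NULL && n->item != NULL);

    /* Tree first: walk down by id until an empty link is found. */
    n->treeLeft = n->treeRight = NULL;
    struct node **link = &c->treeRoot;
    while (*link != NULL)
        link = (n->item->id < (*link)->item->id) ? &(*link)->treeLeft
                                                 : &(*link)->treeRight;
    *link = n;

    /* Then the list: find the first element with a larger score. */
    struct node *prev = NULL;
    struct node *cur = c->listHead;
    while (cur != NULL && cur->item->score <= n->item->score) {
        prev = cur;
        cur = cur->listNext;
    }
    n->listPrev = prev;
    n->listNext = cur;
    if (cur != NULL)
        cur->listPrev = n;
    if (prev != NULL)
        prev->listNext = n;
    else
        c->listHead = n;
}
```

Each record costs one node with five pointers, and from any node you can continue in either ordering.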
What was the answer?
If your data is less than (hmmm) megabytes, don't worry about memory consumption. 1 or 2 Gigabytes is typical in normal computers today.
How big are the items? 32 char? 64k of compressed multimedia? Something big?
How reasonable is it to organize one item using both techniques? If the data are really the same, then a 5 pointer structure is interesting- someone could find a node in one ordering and then browse related nodes in the other ordering.
Are the items unrelated, some chalk, some cheese? Are they multidimensional? personnel records? Audio file descriptions? Recipes?
In school, a good teacher is trying to give you experience with common techniques and disciplines. Just like art class, or composition. Pencil, pastels, 5 paragraph essay. So the teacher might want you to write two different classes & constructors. Use one struct for one part of the data, different one for other data. Or the same. Just because.
Outside of school, the data comes in a format and there are operations desired on it/with it. "Use cases" are stories about how data is used, what has to be kept, what algorithms are used.
The point of this might be bimodal searching, 2 pairs of orthogonal pointers. It might be unions, where each item is associated with a list or a tree, but not both at the same time. It might be a flurry of lightweight subsets, trees and lists, that are compared and contrasted...
When in doubt, "data structures + algorithms = programs". But it pays to know what point the teacher is trying to make, and whether you want to follow their lead. (Usually, in school, you do.)
