I have a program with various recursive functions.
I now need to optimize the code so the program runs faster. I checked with a profiler and, apart from the biggest function, which has lots of checks, two functions take a lot of time on every run.
One (Unmarked_Nodes) is like this:
typedef struct node* tree;
struct node{
char* data;
tree left;
tree right;
int marker;
};
static int remaining = 0;
int main(){
...
}
int Unmarked_Nodes(tree root) {
if (root != NULL) {
Unmarked_Nodes(root->left);
if (root->marker == 0)
remaining++;
Unmarked_Nodes(root->right);
}
return remaining;
}
The other is similar, but instead of the if statement it has a printf of data.
The other, however, is faster than this one. Why? Or rather: how can I improve the code to make it run faster?
Thanks in advance
Candidate improvements: each might help a little, although the traversal remains O(n).
Recurse less often
Loop inside the function for one of the children.
Avoid global
Simply not needed.
Use const
Not so much a speed improvement, yet it allows use with constant data.
Avoid hiding pointers
int Unmarked_Nodes(const struct node *root) {
int remaining = 0;
while (root != NULL) {
remaining += Unmarked_Nodes(root->left);
if (root->marker == 0) {
remaining++;
}
root = root->right;
}
return remaining;
}
Perhaps only recurse when both children are non-NULL. Test for NULL at the end of the loop, since root is never NULL on entry to the recursive helper.
static int Unmarked_Nodes2r(const struct node *root) {
int remaining = 0;
do {
if (root->marker == 0) {
remaining++;
}
if (root->left) {
if (root->right) {
remaining += Unmarked_Nodes2r(root->right);
}
root = root->left;
// continue; // Could skip loop test.
} else {
root = root->right;
}
} while (root);
return remaining;
}
int Unmarked_Nodes2(const struct node *root) {
return root ? Unmarked_Nodes2r(root) : 0;
}
In the absence of more information, it would seem that you likely "visit" the tree three times: once for 'marking' nodes (for whatever purpose), once to 'print' marked (or unmarked) nodes, and once more to reset those marks.
Presuming that 'marked nodes' are the interesting ones, consider using a dynamic array of pointers (malloc/realloc in suitable increments) to build a list of only those nodes, print from that list (no 2nd tree traversal), then free() the list (no 3rd tree traversal).
You wouldn't need to 'mark/unmark' anything. Interesting nodes added to the suggested list mean that those nodes are 'marked', and 'unmarked' when the list is erased.
You may need to consider if 'marking' may encounter unwanted duplicates.
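The list idea above might be sketched roughly as follows. This is only an illustration: the names node_list, node_list_add and node_list_clear are hypothetical, and the doubling growth policy is one possible reading of "suitable increments".

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    char *data;
    struct node *left;
    struct node *right;
    int marker;
};

/* Hypothetical growable list holding pointers to the "interesting" nodes. */
struct node_list {
    struct node **items;
    size_t count;
    size_t capacity;
};

/* Append one node pointer; returns 0 on success, -1 if realloc fails. */
int node_list_add(struct node_list *list, struct node *n) {
    assert(list != NULL);
    if (list->count == list->capacity) {
        /* Assumed policy: start at 16 slots, then double. */
        size_t new_cap = list->capacity ? list->capacity * 2 : 16;
        struct node **tmp = realloc(list->items, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return -1;
        list->items = tmp;
        list->capacity = new_cap;
    }
    list->items[list->count++] = n;
    return 0;
}

/* "Unmark" everything at once: drop the list, leave the tree untouched. */
void node_list_clear(struct node_list *list) {
    assert(list != NULL);
    free(list->items);
    list->items = NULL;
    list->count = 0;
    list->capacity = 0;
}
```

Printing then becomes a plain loop over items[0..count-1], with no second tree traversal. Duplicate entries are not filtered here.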
Another suggestion is to consider transforming the tree into a list once it is filled. Then, use conventional binary search of that list to mark 'nodes', and a sweep through to erase marks (presuming the same list is to be reused multiple times).
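A possible sketch of the tree-to-list idea, assuming the tree is a binary search tree ordered by strcmp on data (the question does not say so). The function names flatten and find_node are hypothetical, and the caller is assumed to supply an array large enough for every node.

```c
#include <stdlib.h>
#include <string.h>

struct node {
    char *data;
    struct node *left;
    struct node *right;
    int marker;
};

/* In-order fill of out[] starting at index count; returns the new count.
   Assumption: out has room for every node in the tree. */
size_t flatten(struct node *root, struct node **out, size_t count) {
    while (root != NULL) {
        count = flatten(root->left, out, count);
        out[count++] = root;
        root = root->right;   /* loop instead of recursing on the right child */
    }
    return count;
}

/* bsearch comparator: key is a C string, elem points at a struct node *. */
static int cmp_key_node(const void *key, const void *elem) {
    const struct node *n = *(struct node *const *)elem;
    return strcmp((const char *)key, n->data);
}

/* Binary search of the flattened array; returns NULL when key is absent. */
struct node *find_node(struct node **sorted, size_t count, const char *key) {
    struct node **hit = bsearch(key, sorted, count, sizeof *sorted, cmp_key_node);
    return hit ? *hit : NULL;
}
```

Marking is then find_node(...)->marker = 1 (after a NULL check), and erasing marks is a linear sweep over the array.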
Another suggestion relates to whether you are marking to include or to exclude from the print traversal. If marked nodes are included, then simply 'unmark' them as you print them. If marked nodes are excluded, then mark all those other unmarked nodes being printed that haven't previously been 'excluded' and remember whether '0' means 'marked' or if '1' means marked for the next time it comes to searching/marking.
Related
I have a self balancing key-value binary tree (similar to Tarjan's Zip Tree) where there will be duplication of keys. To ensure O(log N) performance the only thing I can come up with is to maintain three pointers per node; a less than, a greater than, and an "equals". The equals pointer is a pointer to a linked-list of members having the same key.
This seems memory inefficient to me because I'll have an extra 8 bytes per node in the whole tree to handle the infrequent duplicate occurrences. Is there a better way that doesn't involve "cheats" like bit banging the left or right pointers for use as a flag?
When you have a collision insertion, allocate new buffer, copy new data.
Hash the new data pointer down to one or two bytes. You'll need a hash that only returns zero on zero input!
Store the hash value in your node. This field would be zero if there are no collision data, so you are O(log KeyCount) for all keys without extra data elements. Your worst case is log KeyCount plus whatever your hashing algorithm yields on lookups, which might be a constant close to 1 additional step until your table has to be resized.
Obviously, choice of hashing algorithm is critical here. Look for one that is good with pointer values on whatever architecture you are targeting. You may need different hashes for different architectures.
You can carry this even further by using only one byte hash values that get you the hash table that you then use the key hash (can be a larger integer) to find the pointer to the additional data. When a hash table fills up, insert a new one into the parent table. I'll leave the math to you.
Regarding data locality. Since the node data are large, you already don't have good node record to actual data locality anyway. This scheme doesn't change that, except in the case where you have multiple data nodes for a particular key, in which case, you'd likely have cache miss getting to the correct index of a variable array embedded in the node. This scheme avoids having to reallocate the nodes on collisions, and probably won't have a severe impact on your cache miss rate.
I usually use this setup when I build a binary search tree; it skips the duplicate values in the array:
#include <stdio.h>
#include <stdlib.h>
#define SIZE 13
typedef struct Node
{
struct Node * right;
struct Node * left;
int value;
}TNode;
typedef TNode * Nodo;
void bst(int data, Nodo * p )
{
Nodo pp = *p;
if(pp == NULL)
{
pp = (Nodo)malloc(sizeof(struct Node));
pp->right = NULL;
pp->left = NULL;
pp->value = data;
*p = pp;
}
else if(data == pp->value)
{
return;
}
else if(data > pp->value)
{
bst(data, &pp->right);
}
else
{
bst(data, &pp->left);
}
}
void displayDesc(Nodo p)
{
if(p != NULL)
{
displayDesc(p->right);
printf("%d\n", p->value);
displayDesc(p->left);
}
}
void displayAsc(Nodo p)
{
if(p != NULL)
{
displayAsc(p->left);
printf("%d\n", p->value);
displayAsc(p->right);
}
}
int main()
{
int arr[SIZE] = {4,1,0,7,5,88,8,9,55,42,0,5,6};
Nodo head = NULL;
for(int i = 0; i < SIZE; i++)
{
bst(arr[i], &head);
}
displayAsc(head);
exit(0);
}
I am trying to sort a linked list but I'm not able to do it. I don't need to swap the nodes.
I've tried to solve the problem using an array-like sorting algorithm but it isn't correct
typedef struct list {
char ch;
int n;
struct list *next;
} List;
List *SortList (List *GeneralList)
{
int swapped, TempN;
char TempCh;
List *Current=NULL;
do
{
swapped=0;
for (Current=GeneralList; Current->next==NULL; Current=Current->next)
{
if (Current->ch>Current->next->ch)
{
TempN=Current->n;
TempCh=Current->ch;
Current->n=Current->next->n;
Current->ch=Current->next->ch;
Current->next->n=TempN;
Current->next->ch=TempCh;
}
swapped = 1;
}
}
while (swapped==0);
return GeneralList;
}
The kind of sort you are doing could be very slow if you have a large number of links.
If you can, I suggest doing an insertion sort instead: each time you get a link, put it in its correct position in the linked list.
Here is a link that should be useful:
Insertion sort
But if you really want a bubble sort (useful if you don't have many links):
Bubble sort
When I write linked list sort functions, I usually use a void function that takes **list as a parameter, so I can swap nodes inside the function.
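A minimal sketch of that approach, assuming the List type from the question and ordering by ch. The names InsertSorted and SortListByInsertion are made up for illustration. Nodes are relinked rather than having their data swapped.

```c
#include <assert.h>
#include <stddef.h>

typedef struct list {
    char ch;
    int n;
    struct list *next;
} List;

/* Insert item into the list at *head, which is assumed to be already
   sorted by ch in ascending order. */
void InsertSorted(List **head, List *item) {
    assert(head != NULL && item != NULL);
    List **link = head;
    /* Walk the chain of "next" pointers until the insertion point. */
    while (*link != NULL && (*link)->ch <= item->ch) {
        link = &(*link)->next;
    }
    item->next = *link;
    *link = item;
}

/* Insertion sort: detach each node and re-insert it into a new sorted list. */
void SortListByInsertion(List **head) {
    assert(head != NULL);
    List *sorted = NULL;
    List *current = *head;
    while (current != NULL) {
        List *next = current->next;   /* save before relinking */
        InsertSorted(&sorted, current);
        current = next;
    }
    *head = sorted;
}
```

If the list is built one link at a time, calling InsertSorted for each new link keeps it sorted throughout, and no separate sort pass is needed.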
Hope this helps, good luck!
Here are a few changes to your code, and it should work now.
Please note the key changes:
In the Boolean expressions, I've moved the values to the left side, so that an assignment won't happen by mistake
The swap was moved into a separate function. I haven't changed your logic, though I do recommend making a more generic swap function. If you haven't learned the void* type yet, then perhaps split the internal logic into two functions, one swapping two int values and one swapping two char values (taking pointers, and with distinct names, since C has no overloading), and combine them in a SwapNode(List*,List*) function. The more functions the merrier!
The swapped variable was renamed to swapFlag to make its purpose a bit clearer. Setting it to true was moved inside the if statement, just after the actual swap.
On the same topic, you'll want the loop to run while there are swaps happening. If the swap DIDN'T happen, then you stop.
The for loop should run while the list doesn't end. The condition was changed accordingly.
Changes that I think you should consider:
Consider changing the capitalization of letters depending on the word role. There are a few conventions out there. Try maybe doing all the variables with first letter small and the functions with the first letter Capital. It'll make your code much more readable.
When writing functions, make the brackets look like this: foo() and not like this: foo (). Also, much more readable.
Don't be stingy with new lines or spaces. When everything is crammed together, it is harder to see the different areas of your code.
Think about using #define to introduce TRUE and FALSE instead of 1 and 0. Much more convenient.
Consider handling the case where GeneralList is NULL at the start.
Now for the code:
typedef struct List {
char ch;
int n;
struct List *next;
} List;
void SwapNodeData( List* first, List* second )
{
int TempN;
char TempCh;
TempN = first->n;
TempCh = first->ch;
first->n = second->n;
first->ch = second->ch;
second->n = TempN;
second->ch = TempCh;
}
List* SortList(List* GeneralList)
{
int swapFlag;
List* Current = NULL;
do
{
swapFlag =0;
for (Current = GeneralList; NULL != Current->next; Current = Current->next)
{
if (Current->ch > Current->next->ch)
{
SwapNodeData( Current, Current->next );
swapFlag = 1;
}
}
}
while ( swapFlag );
return GeneralList;
}
I'd just like to note I already saw this post before asking my question: C How to "draw" a Binary Tree to the console
Let's say I have the following tree. If my print function were to print only the numbers (in order traversal), I would have the following printed out: 1,3,4,6,7,8,10,13,14.
What would be the best approach to draw the tree like something below considering the tree gets printed in that order?
I feel that if 8 got printed first followed by 3,10 etc.. it would be easier but since it is in-order traversal 1 is getting printed first which would be the first print statement at the top.
I did this about 2 years ago for some coursework...
I created a node struct that contained its own data and 2 nodes, one left and one right, it looked like this (I couldn't find the final code, that might have used shared pointers):
struct node
{
int data;
node *left;
node *right;
};
I then created my tree by adding more nodes to it using recursion like so:
void insert(node **tree, int value)
{
if (*tree == nullptr)
{
*tree = new node;
(*tree)->data = value;
(*tree)->left = nullptr;
(*tree)->right = nullptr;
}
else if (value < (*tree)->data)
{
insert(&((*tree)->left), value);//memory location of the pointer to the node of the node
}
else if (value > (*tree)->data)
{
insert(&((*tree)->right), value);
}
else
return;
}
Side note: Looking back, I never accounted for adding a node with the same value as an existing node if that's even possible.
I assume you will be doing something similar. Now for the bit that answers your question, printing it out, also using recursion.
void inorder(node *tree)
{
if (!(tree == nullptr))
{
inorder((tree)->left);
cout << (tree->data) << endl;//Prints on new lines, you could comma separate them if you really wanted.
inorder((tree)->right);
}
}
Lastly, you'll want to clean up your tree after you've used it so you'll need to delete it... recursively.
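A rough sketch of that cleanup, written here in C and therefore assuming the nodes were allocated with malloc. With the new-based insert shown above, the same post-order shape applies but each node would be released with delete instead of free. The name destroy and the returned count are illustrative only.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *left;
    struct node *right;
};

/* Post-order cleanup: release both subtrees before the node itself.
   Returns the number of nodes released (handy for sanity checks). */
size_t destroy(struct node *tree) {
    if (tree == NULL)
        return 0;
    size_t released = destroy(tree->left) + destroy(tree->right);
    free(tree);   /* assumption: node came from malloc */
    return released + 1;
}
```

After the call the caller's root pointer is dangling, so it should be set to NULL.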
To be honest it's been a while and this recursion thing is still a little confusing to me so I have probably forgotten something, but the theory's there!
Edit, headers used: <iostream> and <memory>. Also, this was C++, not C, but they're very similar.
I wrote the following function which returns the middle element of a linked list, which uses the double pointer method
struct node
{
int data;
struct node *next;
}*start;
void middleelement()
{
struct node *x=start,*y=start;
int n=0;
if(start==NULL)
{
printf("\nThere are no elments in the list");
}
else
{
while((x->next)!=NULL)
{
x=x->next->next;
y=y->next;
n++;
}
printf("\nMiddle element is %d",y->data);
}
}
However, whenever I run the functions, the Windows explorer stops working
What is the flaw in the code?
Is there any better algorithm than this to find the middle element?
If the number of entries is even, your x will end up being NULL, so when the next loop test dereferences it, your program is going to crash. You should modify your condition to account for that:
while(x && x->next) {
...
}
Comparing with NULL is optional in C, so you can skip the != NULL to shorten the condition.
Of course passing the start parameter through a global variable is unorthodox, to say the least. It would be much better to pass it as a regular function parameter.
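Putting both points together, a version that takes the list head as a parameter might look like the sketch below. The name middle_node is hypothetical, and returning the node (rather than printing inside the function) is a design choice made here so the caller decides what to do with it.

```c
#include <stddef.h>

struct node {
    int data;
    struct node *next;
};

/* Two-pointer walk: fast advances two links per step, slow advances one.
   Returns NULL for an empty list. For an even number of entries it
   returns the second of the two middle nodes. */
const struct node *middle_node(const struct node *head) {
    const struct node *fast = head;
    const struct node *slow = head;
    while (fast && fast->next) {
        fast = fast->next->next;
        slow = slow->next;
    }
    return slow;
}
```

A caller would check the result for NULL before printing its data field.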
I was asked to write the iterative version, but I wrote the recursive version i.e.
void inorderTraverse(BinaryTree root)
{
if(root==NULL)
return;
else
{
inorderTraverse(root->left);
printf("%d",root->id);
inorderTraverse(root->right);
}
}
I'm not looking for the code, I want to understand how this can be done. Had it been just the last recursive call, I would have done
void inorderTraverse(BinaryTree root)
{
while(root!=NULL)
{
printf("%d",root->id);
root=root->right;
}
}
But how do I convert to an iterative program when there are two recursive calls?
Here are the type definitions.
struct element{
struct element* parent;
int id;
char* name;
struct element* left;
struct element* right;
};
typedef struct element* BinaryTree;
This is what I thought of, am I on the right track?
temp=root;
while(1)
{
while(temp!=NULL)
{
push(s,temp);
temp=temp->left;
continue;
}
temp=pop(s);
if(temp==NULL)
return;
printf("%d\t",temp->data);
temp=temp->right;
}
The problem you're seeing is that you need to "remember" the last place you were iterating at.
When doing recursion, the program internally uses "the stack" to remember where to go back to.
But when doing iteration, it doesn't.
Although... does that give you an idea?
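For readers who do want to see where the hint leads, here is one possible sketch with an explicit stack. It collects ids into an array instead of printing so that it can be checked easily. The name inorder_ids and the fixed MAX_DEPTH limit are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

struct element {
    struct element *parent;
    int id;
    char *name;
    struct element *left;
    struct element *right;
};

#define MAX_DEPTH 64   /* hypothetical upper bound on the tree height */

/* In-order traversal with an explicit stack. Writes at most out_cap ids
   into out and returns the total number of nodes visited. */
size_t inorder_ids(const struct element *root, int *out, size_t out_cap) {
    const struct element *stack[MAX_DEPTH];
    size_t top = 0;
    size_t count = 0;
    const struct element *cur = root;

    while (cur != NULL || top > 0) {
        while (cur != NULL) {          /* descend left, remembering the path */
            assert(top < MAX_DEPTH);
            stack[top++] = cur;
            cur = cur->left;
        }
        cur = stack[--top];            /* return to the most recent node */
        if (count < out_cap)
            out[count] = cur->id;
        count++;
        cur = cur->right;              /* then handle its right subtree */
    }
    return count;
}
```

The stack plays the role of the call stack in the recursive version: each pushed node is a place the traversal still has to come back to.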
I can't think of a really elegant way to do this iteratively off-hand.
One possibility might be using a 'mark algorithm', where you start out with all nodes 'unmarked' and 'mark' nodes as they're handled. The markers can be added to the object model or kept in a separate entity.
Pseudocode:
for (BinaryTree currentNode = leftmostNode(root); currentNode != null; currentNode = nextNode(currentNode)):
print currentNode;
currentNode.seen = true;
sub nextNode(BinaryTree node):
if (!node.left.seen):
return leftmostNode(node.left)
else if (!node.seen)
return node
else if (!node.right.seen)
return leftmostNode(node.right)
else
return nextUnseenParent(node)
sub leftmostNode(BinaryTree node):
while (node.left != null)
node = node.left
return node;
sub nextUnseenParent(BinaryTree node):
while (node.parent.seen)
node = node.parent
return node.parent
I take it for granted that iterating down from the parent nodes to the left nodes is not a problem. The problem is knowing what to do when going up from a node to its parent: should you take the right child node, or should you go up one more parent?
The following trick will help you:
Before going upwards, remember the current node. Then go upwards. Now you can compare: if you came from the left child, handle the parent and then take its right child. Otherwise, go up one more parent node.
You need only one reference/pointer for this.
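The trick above might be sketched as follows, using the parent field from the question's struct element. The function names are hypothetical, and the sketch assumes the root's parent pointer is NULL.

```c
#include <stdio.h>
#include <stddef.h>

struct element {
    struct element *parent;
    int id;
    char *name;
    struct element *left;
    struct element *right;
};

/* Returns the in-order successor of node, or NULL when node is the last.
   Assumption: the root's parent pointer is NULL. */
struct element *next_inorder(struct element *node) {
    if (node->right != NULL) {
        /* Successor is the leftmost node of the right subtree. */
        node = node->right;
        while (node->left != NULL)
            node = node->left;
        return node;
    }
    /* Remember where we came from, then go up. Keep climbing while we
       arrive from a right child; stop when we arrive from a left child. */
    struct element *child = node;
    node = node->parent;
    while (node != NULL && child == node->right) {
        child = node;
        node = node->parent;
    }
    return node;
}

/* Iterative in-order print with no stack and no markers. */
void inorderTraverseIter(struct element *root) {
    if (root == NULL)
        return;
    struct element *node = root;
    while (node->left != NULL)   /* start at the leftmost node */
        node = node->left;
    for (; node != NULL; node = next_inorder(node))
        printf("%d\n", node->id);
}
```

Only the single child pointer is needed as extra state, which matches the remark that one reference is enough.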
There is a general way of converting recursive traversal to iterator by using a lazy iterator which concatenates multiple iterator suppliers (lambda expression which returns an iterator). See my Converting Recursive Traversal to Iterator.