Time complexity of iterating over an AVL tree using nextInOrder (C)

Let's say we have a binary AVL tree in which each node holds a pointer to its parent.
We also have a function that returns the next item in order, called treeSuccessor.
We can assume that its time complexity is O(log N).
What will be the time complexity of iterating over the whole tree with it, starting from the lowest value and ending at the highest?
For the given AVL tree, what will be the time complexity of iterating from the node holding 17 to the node holding 85 using the treeSuccessor function?
Iteration algorithm:
while (L != L2)   // L2 is the ending node, 85 in the image
{
    L = treeSuccessor(L);
}

Traversing any binary tree this way can be done in O(n) time, since each link is passed at most twice: once going downwards and once going upwards. The work per node is constant.
The complexity is not O(n log n): even though finding the next node costs O(log n) in the worst case for an AVL tree (for a general binary tree it can even cost O(n)), the amortized work per node over the whole traversal is constant.
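To make the argument concrete, here is a minimal C sketch of a parent-pointer successor and the full sweep; the node layout and function names are my assumptions, not taken from the question.

#include <stddef.h>

struct node {
    int key;
    struct node *left, *right, *parent;
};

/* Leftmost node of a subtree, i.e. its minimum. */
static struct node *tree_minimum(struct node *n) {
    while (n->left != NULL)
        n = n->left;
    return n;
}

/* In-order successor: O(log n) worst case in an AVL tree,
 * but amortized O(1) over a full in-order sweep. */
static struct node *tree_successor(struct node *n) {
    if (n->right != NULL)
        return tree_minimum(n->right);   /* step down into the right subtree */
    struct node *p = n->parent;
    while (p != NULL && n == p->right) { /* climb while we come up from the right */
        n = p;
        p = p->parent;
    }
    return p;
}

/* Full sweep: every edge is crossed once downwards and once upwards,
 * so the whole loop costs O(n) even though a single call may cost O(log n). */
static void iterate(struct node *root) {
    if (root == NULL)
        return;
    for (struct node *n = tree_minimum(root); n != NULL; n = tree_successor(n)) {
        /* visit n->key here */
    }
}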

Related

Implement data structure with efficient insert/findKthSmallest operations

I was asked this question by an interviewer. I tried solving it using an array, making sure the array stays sorted as I insert. However, I don't think this is the best solution. What would be a good solution for this problem?
You can use a balanced binary search tree (e.g., a red-black BST) for this.
Simply insert the nodes into the balanced BST and have each node maintain its rank within its own subtree. Since we need to find the kth smallest element, we can maintain the count of elements in each node's left subtree.
Now, start traversing from the root and check:
If k = N + 1, where N is the number of nodes in the root's left subtree, then the root is the kth node.
Else if k ≤ N, continue the search in the left subtree of the root.
Else if k > N + 1, continue the search in the right subtree, now looking for the (k − N − 1)th smallest element there.
The time complexity of insertion into a balanced BST is O(log n), and searching for the kth smallest element is O(h), where h is the height of the tree.
You can use an order statistic tree. Essentially, it is a self-balancing binary search tree in which each node also stores the cardinality of its subtree. Maintaining the cardinality does not increase the complexity of any tree operation, but it allows your query to be performed in O(log n) time (counting k from 0):
If the left subtree has cardinality > k, recurse on the left subtree.
Otherwise, if the left subtree has cardinality < k, recurse on the right subtree, searching there for the element of rank k minus the left cardinality minus one.
Otherwise the left subtree's cardinality equals k, so return the current node's element. A C sketch of this selection step follows.
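Here is a minimal sketch of the selection step in C, assuming each node stores size, the number of nodes in its subtree; the names and layout are mine, not from the answer above.

#include <stddef.h>

struct os_node {
    int key;
    size_t size;                 /* number of nodes in this subtree */
    struct os_node *left, *right;
};

static size_t subtree_size(const struct os_node *n) {
    return n == NULL ? 0 : n->size;
}

/* Return the node holding the k-th smallest key, counting from 0.
 * The caller must guarantee k < root->size. */
static const struct os_node *os_select(const struct os_node *root, size_t k) {
    size_t left = subtree_size(root->left);
    if (k < left)
        return os_select(root->left, k);         /* answer lies in the left subtree */
    if (k == left)
        return root;                             /* exactly `left` keys are smaller */
    return os_select(root->right, k - left - 1); /* skip the left subtree and the root */
}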

Kth smallest in stream of numbers

We are given a stream of numbers and Q queries.
At each query, we are given a number k.
We need to find the kth smallest number at that point of the stream.
How to approach this problem?
The total size of the stream is < 10^5, and each number satisfies 1 < number < 10^9.
I tried a linked list, but finding the right position is time-consuming; with an array, inserting is time-consuming.
You can use some kind of search tree. There are many different kinds of search trees, but all the common ones allow insertion in O(log n) and finding the kth element in O(log n) as well.
If the stream is too long to keep all the numbers in memory and you also know an upper bound on k, you can prune the tree by only keeping a number of elements equal to the upper bound.
You can use a max-heap with size = k.
Insert elements until the heap's size reaches k. After that, for each new element, insert it and pop the heap's root, so the size stays at k. Removing (extracting) the root is safe because it is the largest of k + 1 elements, so at least k elements are smaller than it and it cannot be the kth smallest.
When you have finished iterating over the stream, the root of the heap is the kth smallest element, because the heap holds the k smallest elements and the root is the largest among them.
As the heap's size is k, the time complexity is O(n log k), which can be a bit better than O(n log n). And the implementation is quite easy; a sketch follows.
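A minimal array-based sketch of this in C; the names are mine, and instead of a push followed by a pop it uses the equivalent shortcut of replacing the root only when the new element is smaller.

#include <stddef.h>

/* Max-heap of capacity k stored in a[0..n-1]. */
struct kheap { int *a; size_t n, k; };

static void swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

static void sift_up(struct kheap *h, size_t i) {
    while (i > 0 && h->a[(i - 1) / 2] < h->a[i]) {
        swap(&h->a[i], &h->a[(i - 1) / 2]);
        i = (i - 1) / 2;
    }
}

static void sift_down(struct kheap *h, size_t i) {
    for (;;) {
        size_t big = i, l = 2 * i + 1, r = 2 * i + 2;
        if (l < h->n && h->a[l] > h->a[big]) big = l;
        if (r < h->n && h->a[r] > h->a[big]) big = r;
        if (big == i) return;
        swap(&h->a[i], &h->a[big]);
        i = big;
    }
}

/* Feed one stream element; the heap keeps the k smallest seen so far. */
static void offer(struct kheap *h, int x) {
    if (h->n < h->k) {            /* heap not full yet: just insert */
        h->a[h->n++] = x;
        sift_up(h, h->n - 1);
    } else if (x < h->a[0]) {     /* smaller than the current k-th smallest */
        h->a[0] = x;              /* replace the root and restore the heap */
        sift_down(h, 0);
    }
}
/* After the stream ends (and at least k elements were seen),
 * h->a[0] is the k-th smallest element. */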

Binary vs Linear search in sorted doubly linked lists

I am doing a college project on library management using doubly linked lists. The doubly linked list holds the IDs of books and is kept sorted.
I tried to measure the time elapsed in the worst case for linear vs. binary search, with the following results:
Binary search: 0.311 ms
Linear search: 0.228 ms
[Number of inputs (IDs): 10,000,000]
MY QUESTION:
Even though binary search takes O(log n) comparisons, the elapsed time was higher because it takes O(n) pointer traversals to reach the middle node. Is there a better search algorithm for a sorted doubly linked list than cumbersome linear search?
My implementation of finding the middle node required for binary search:
struct node* middle(struct node* start, struct node* last)
{
    if (start == NULL)
        return NULL;
    struct node* slow = start;
    struct node* fast = start->next;
    /* Advance fast two steps for every step of slow;
       when fast reaches last, slow is at the middle. */
    while (fast != last)
    {
        fast = fast->next;
        if (fast != last)
        {
            slow = slow->next;
            fast = fast->next;
        }
    }
    return slow;
}
Your compare would have to be spectacularly slow to justify all that navigation. As it stands, I cannot think of a better way than a linear search. If you can alter the structures and the CRUD code, you can certainly index key points ("A" starts here, "B" starts here, etc.), which would let you make a better guess at the start and direction of your linear search; a sketch of this follows.
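As an illustration of that indexing idea, here is a hedged C sketch that buckets numeric IDs into a small table of entry points; the bucket scheme, the names, and the even spread of IDs are my assumptions, and the table must be maintained on every insert and delete.

#include <stddef.h>

#define NBUCKETS 26   /* e.g. one slot per leading letter or ID range */

struct node { long id; struct node *prev, *next; };

/* bucket_index[i] points at the first node of bucket i (NULL if empty). */
static struct node *bucket_index[NBUCKETS];

/* Map an ID to a bucket; assumes IDs spread roughly evenly over [0, max_id). */
static int bucket_of(long id, long max_id) {
    return (int)(id * NBUCKETS / max_id);
}

/* Start the linear scan at the bucket head instead of the list head,
 * cutting the expected scan length by roughly a factor of NBUCKETS. */
static struct node *find(long id, long max_id) {
    struct node *n = bucket_index[bucket_of(id, max_id)];
    while (n != NULL && n->id < id)
        n = n->next;
    return (n != NULL && n->id == id) ? n : NULL;
}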
I think you'll find that a linked list, doubly linked or otherwise, is not a great choice for random lookups or in-order updates. Use a B-tree, which seems to be a better fit for the situations you've outlined in your question and comments.
"Time elapsed was more due to the fact it took O(n) traversals until the middle value is found."
When you insert new elements into the linked list, you could also track the middle element, just as you track the first and last ones, although the insert function becomes more complex.
I would use a struct for the linked list with four fields (sketched after this list):
start node
middle node
last node
length
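A minimal C sketch of that layout and of the middle-pointer bookkeeping; the field names, the 0-based "middle = index length/2" convention, and the helper below are my assumptions.

#include <stddef.h>

struct node { long id; struct node *prev, *next; };

struct list {
    struct node *start;   /* first (smallest) node */
    struct node *middle;  /* node at index length / 2 (0-based) */
    struct node *last;    /* last (largest) node */
    size_t length;        /* number of nodes */
};

/* Call after an insert, with l->length already incremented;
 * before_middle says whether the new node landed before the middle node.
 * The middle pointer drifts by at most one step, so this is O(1). */
static void fix_middle_after_insert(struct list *l, int before_middle) {
    if (l->length == 1) {
        l->middle = l->start;
    } else if (before_middle && l->length % 2 == 1) {
        l->middle = l->middle->prev;   /* middle was pushed right; pull it back */
    } else if (!before_middle && l->length % 2 == 0) {
        l->middle = l->middle->next;   /* middle index grew; push it forward */
    }
}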
Binary search achieves O(log N) complexity in the number of comparisons. When it is used with an array, accessing the i-th element takes constant time, so it does not affect the overall time complexity.
With a list, singly or doubly linked, accessing the i-th element takes i steps. In your example, accessing the middle element takes a number of steps proportional to the length of the list. As a consequence, the search still costs O(log N) comparisons but O(N) steps to reach the items being compared, and the latter dominates.

Amortized time of insertion in sorted array is O(n) and deletion is O(1)?

I am learning how to analyze algorithms and I came across the notion of "amortized time". I found some stated bounds, such as:
- the amortized time of insertion into a sorted array is O(n), and
- the amortized time of deletion from a sorted array is O(1).
Can anyone explain this to me in detail, please?
The idea is to associate with each entry in the array a Boolean called deleted. Deleting an item consists of setting its deleted flag to true. When there are too many deleted items, compact them away. If you make the compaction threshold a fraction of the total size, you can pay for the compaction out of all the deletions required to reach that threshold.
Here's a sketch. It's incomplete but demonstrates the insertion and deletion algorithms.
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

class sorted_array
{
public:
    typedef std::vector<std::pair<int, bool>>::iterator iterator;

    iterator insert(int value)
    {
        // Pairs compare by value first, so lower_bound keeps the vector
        // sorted; the bool is the "deleted" flag, initially false.
        auto item = std::make_pair(value, false);
        return vec.insert(std::lower_bound(vec.begin(), vec.end(), item), item);
    }

    void erase(iterator pos)
    {
        pos->second = true; // deleted = true
        deleted_count++;
        if (deleted_count * 2 > vec.size())
        {
            // More than half the entries are tombstones: compact in O(n).
            vec.erase(std::remove_if(vec.begin(), vec.end(),
                          [](const std::pair<int, bool>& p) { return p.second; }),
                      vec.end());
            deleted_count = 0;
        }
    }

private:
    size_t deleted_count = 0;
    std::vector<std::pair<int, bool>> vec;
};
Insertion is O(n) as usual. When we insert the element, we also mark it as not deleted.
To delete an element, we merely mark it as deleted and bank two credits.
When more than half of the elements in the vector are deleted, we have at least as many credits as there are elements in the vector, which means we can afford to run the O(n) compaction.
To find an element, you run a traditional binary search and then skip over deleted elements. Since at most half of the elements are deleted, the binary search operates on at most 2n slots, which means it runs in O(log 2n) = O(log n) steps. There is a little extra cost in skipping past deleted items after the binary search completes, but some more cleverness in the data structure can reduce that to a constant. (Left as an exercise.)
Similarly, iterating over the collection takes at most 2n steps (because at most half of the elements are deleted), which is still O(n).
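To show the lookup step in isolation, here is a hedged plain-C sketch of the same idea over an array of value/deleted pairs; the layout and names are mine, not the class above.

#include <stdbool.h>
#include <stddef.h>

struct entry { int value; bool deleted; };

/* Find a live entry equal to key in a sorted array with lazy deletion.
 * Standard lower-bound binary search, then a linear skip over tombstones;
 * since at most half the slots are deleted, the search stays O(log n). */
static ptrdiff_t find(const struct entry *a, size_t n, int key) {
    size_t lo = 0, hi = n;               /* search the half-open range [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (a[mid].value < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    /* lo is the first entry with value >= key; skip deleted duplicates. */
    for (size_t i = lo; i < n && a[i].value == key; i++)
        if (!a[i].deleted)
            return (ptrdiff_t)i;
    return -1;                           /* absent, or all copies deleted */
}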

A recursive method to find the depth of an arbitrary (not necessarily complete) binary tree

I am trying to recursively compute the depth of an arbitrary (not necessarily complete) BST in O(log n) time.
This is the algorithm I came up with:
// max() - returns the larger of two numbers
int depth(struct node *root)
{
    if (root->left == NULL && root->right == NULL)          // leaf
        return 0;
    else if (root->left != NULL && root->right == NULL)     // only a left child
        return max(depth(root->left), 0) + 1;
    else if (root->left == NULL && root->right != NULL)     // only a right child
        return max(0, depth(root->right)) + 1;
    else if (root->left->left == NULL && root->left->right == NULL &&
             root->right->left == NULL && root->right->right == NULL) // parent of two leaves
        return 1;
    else                                                    // parent of two subtrees
        return max(depth(root->right), depth(root->left)) + 1;
}
Is this algorithm fine for calculating depth in O(log n) time?
Is there a better way?
That's O(n) time, since you may traverse every node doing that. You can search in a binary search tree in O(log n), but you cannot find the depth of a binary tree in anything less than O(n) unless you cache the depths as you build it, or do something similar.
There are two special cases you may want to be aware of (a sketch for the first follows below):
A perfect binary tree, meaning every leaf is at the same level, can have its depth determined in O(log n).
A complete and balanced binary tree can have its depth approximated in O(log n), or in O(1) if the number of nodes is known; the result is approximate, however (usually ±1).
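For the perfect-tree case, the depth is just the length of the leftmost path, so it can be read in O(log n). A minimal C sketch, assuming the usual node layout and the question's convention that a lone leaf has depth 0:

#include <stddef.h>

struct node { struct node *left, *right; };

/* Depth of a PERFECT binary tree: every root-to-leaf path has the same
 * length, so following left children alone is enough. O(log n). */
static int perfect_depth(const struct node *root) {
    int d = -1;                /* empty tree; a single node yields depth 0 */
    while (root != NULL) {
        d++;
        root = root->left;
    }
    return d;
}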
The only way you'll get O(log n) runtime is if you examine only one path, and the only way you can get away with examining one path is if you know the tree has uniform height, which is only the case for perfect binary trees, and your question specifically states that this is not the case.
Therefore, there is no O(log n) algorithm that determines the depth of a given binary tree.
You can only find the deepest node of an unknown, unbalanced tree by looking at every leaf node, which requires traversing the lot, as you are doing: O(n).
As for a "better" way, you can't lower the asymptotic order, but you don't need quite such complex code to achieve the result. Here is a marginally less efficient implementation (because it recurses one level deeper) that is much more readable and more robust (it won't blow up if you pass in a NULL root pointer):
int depth(struct node *root)
{
    if (root == NULL)
        return 0;
    return 1 + max(depth(root->left), depth(root->right));
}
A problem in C is that the call stack is not dynamically allocated on the heap, so at some point we will run out of space, especially when each recursive call spawns two further calls. If you instead iterate over the left branches and recurse only on the right ones, the stack grows with the number of right turns on a path rather than with the full height of the tree, so it does not grow as fast.
int
depth(struct bt *root, int dl)   /* dl = depth of root within the whole tree */
{
    int dr, dmr;

    /* Walk the left spine iteratively; recurse only on right children. */
    for (dmr = dr = dl; root != NULL; dl++) {
        if ((dr = depth(root->right, dl + 1)) > dmr)
            dmr = dr;
        root = root->left;
    }
    return dl > dmr ? dl : dmr;
}
This is, for example, the way quicksort is implemented in many C libraries:
http://www.openbsd.org/cgi-bin/cvsweb/src/lib/libc/stdlib/qsort.c?rev=1.10;content-type=text%2Fx-cvsweb-markup
int maxDepth(BST *root)
{
    int ldepth = 0, rdepth = 0, maxdepth = 0;

    if (root == NULL)
        return 0;
    ldepth = maxDepth(root->left);
    rdepth = maxDepth(root->right);
    maxdepth = MAX(ldepth, rdepth);
    return maxdepth + 1;
}
int TreeDepthRec(Tree *pt)
{
    if (!*pt)
        return 0;
    int a = TreeDepthRec(&(*pt)->left);
    int b = TreeDepthRec(&(*pt)->right);
    return (a > b) ? 1 + a : 1 + b;
}
I think comparing two integer variables takes less time than calling a function.
