I am a student of computer science, and I had an exam last week in C.
One of the questions was to search a specific word (string) in a binary tree, and count how many times it appears.
Every node in the tree contains a letter.
For example, if the word is "mom", and the tree looks like the attached image, the function should return 2.
Pay attention that if there is a word like this — "momom" — the function will count the "mom" only one time.
I have not been able to solve this question. Can you help?
a
/ \
b m
/ / \
v o o
/ \ \
m t m
Because the tree in your image does not appear to be ordered or balanced, you have to search every branch until you either hit a match or run past a leaf. Once you hit a match, you can ignore all the branches underneath because they're irrelevant. Beyond that, you don't know the depth of the tree, so you can't stop searching early based on depth.
So, your algorithm would be something to the effect of:
// returns the number of matches
// matchMask is a bitmap of the string sublengths that match so far:
// bit i set means substr[0..i] matched the path ending at the parent node
int search(const char *substr, int substrlen, uint32_t matchMask, node_t *node) {
    uint32_t newMatchMask = 0;
    int bit;
    ASSERT(substrlen < (int)(sizeof(matchMask)*8));
    if (node == NULL) {
        // ran past a leaf, stop
        return 0;
    }
    while ((bit = LSB(matchMask)) != -1)
    {
        // clear the bit we are handling so the loop terminates
        matchMask &= ~((uint32_t)1 << bit);
        if (node->ch == substr[bit+1])
            newMatchMask |= ((uint32_t)1 << (bit+1));
    }
    if (node->ch == substr[0])
        newMatchMask |= 1;
    if (newMatchMask & ((uint32_t)1 << (substrlen-1))) {
        // found a match, don't bother recursing
        return 1;
    } else {
        return
            search(substr, substrlen, newMatchMask, node->left) +
            search(substr, substrlen, newMatchMask, node->right);
    }
}
Note, that I had to do some funky bitmap stuff there to keep track of the depths matched so far, as you can match a partial substring along the way. LSB is assumed to be a least-significant-bit macro that returns -1 if no bits are set. Also, this is not tested, so there might be an off-by-one error in the bit masking, but the idea is still there.
-- EDIT --
oops, forgot to stop recursing if your node is blank... Fixing
You want to enumerate all words in the tree and check at each end of word if you have a match using strstr().
The keywords to search for would be "tree walking" and "depth-first traversal".
The semantics of your tree structure are unclear. To clarify your question, you should enumerate all words present in the tree by hand, then write a function that walks the tree and prints the same list. The final step is easy: instead of printing the words, check whether each one contains the string with strstr and count the matching words.
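The enumerate-and-strstr idea above can be sketched in C as follows. This is only a sketch under assumptions: the node layout (`node_t` with `ch`, `left`, `right`), the helper `mk`, and the `MAX_DEPTH` bound are hypothetical, since the question shows no declarations, and a "word" is taken to be the letters on one root-to-leaf path. With that reading, a path such as "momom" is counted once, but a match that sits above a branch point is counted once for every leaf below it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node layout; the question does not show one. */
typedef struct node {
    char ch;
    struct node *left, *right;
} node_t;

#define MAX_DEPTH 64 /* assumed upper bound on the tree height */

/* Depth-first walk. buf[0..depth-1] holds the letters from the root
 * down to the parent of n. At each leaf the path is terminated and
 * checked with strstr(). Paths deeper than MAX_DEPTH are ignored. */
static int count_words(const node_t *n, const char *word, char *buf, int depth)
{
    if (n == NULL || depth >= MAX_DEPTH)
        return 0;
    buf[depth] = n->ch;
    if (n->left == NULL && n->right == NULL) {
        buf[depth + 1] = '\0';
        return strstr(buf, word) != NULL;
    }
    return count_words(n->left, word, buf, depth + 1) +
           count_words(n->right, word, buf, depth + 1);
}

/* Number of root-to-leaf words that contain `word`. */
int count_in_tree(const node_t *root, const char *word)
{
    char buf[MAX_DEPTH + 1];
    return count_words(root, word, buf, 0);
}

/* Convenience constructor used only to build example trees. */
node_t *mk(char ch, node_t *left, node_t *right)
{
    node_t *n = malloc(sizeof *n);
    if (n != NULL) {
        n->ch = ch;
        n->left = left;
        n->right = right;
    }
    return n;
}
```

If the diagram in the question is read as a -> (b, m), b -> v, m -> (o, o), first o -> (m, t), second o -> right child m, the root-to-leaf words are "abv", "amom", "amot" and "amom", so two of them contain "mom".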
Suppose the question is we want to find whether a path is possible from Source S to Destination D in the graph given. Graph represented by characters '.','X','S','D' . '.' represents free space, 'X' represents blocked area, 'S' is Source 'D' is destination.
Suppose the given graph is represented as 2D array like this
...SXX..
XX...X..
X..X....
XX....X.
XXXX....
...D....
I know that we can use DFS or BFS, but the problem is how to perform these when the graph is given in the form of a 2D array. Is converting this matrix to an adjacency list the efficient way, or can we apply DFS or BFS directly? If so, how?
Converting this matrix to an adjacency list, or even an adjacency table, would take O(4e) where e represents the number of entries in the array. After that, finding if they are linked by BFS or DFS would just be around O(4e) since the number of edges is bounded by 4e, one edge per up, down, left, and right. Thus, conversion then BFS or DFS would take about O(8e).
An algorithm that does not do the conversion is as follows (being recursive, it is effectively a depth-first search rather than a BFS):
int x
int y
char givenGraph[x][y]
boolean pathExists = false
// sourceX and sourceY represent the location of the 'S'
start(int sourceX, int sourceY) {
    recursiveCheck(sourceX, sourceY)
    if(!pathExists) {
        print("Path does not exist!")
    }
}
recursiveCheck(int currentX, int currentY) {
    if(pathExists) { // destination already found, nothing left to do
        return
    }
    if(givenGraph[currentX][currentY] == 'D') { // if the destination then say so
        print("Path exists!")
        pathExists = true
        return
    }
    else if(givenGraph[currentX][currentY] == 'X') { // if a wall then return
        return
    }
    else if(givenGraph[currentX][currentY] == 'C') { // if already checked return
        return
    }
    else { // not checked yet, either 'S' or '.' so mark
        givenGraph[currentX][currentY] = 'C' // for checked
    }
    if(currentX + 1 < x) { // check the neighbor at currentX + 1
        recursiveCheck(currentX + 1, currentY)
    }
    if(currentX - 1 >= 0) { // check the neighbor at currentX - 1
        recursiveCheck(currentX - 1, currentY)
    }
    if(currentY + 1 < y) { // check the neighbor at currentY + 1
        recursiveCheck(currentX, currentY + 1)
    }
    if(currentY - 1 >= 0) { // check the neighbor at currentY - 1
        recursiveCheck(currentX, currentY - 1)
    }
}
This recursive algorithm checks up, down, left, and right for each entry, and it assumes that the "S" location is known. With the 'S' known, the complexity is about O(4e). Finding 'S' would take O(e) by just searching all the entries in the table. Therefore, the efficiency of this algorithm is O(5e).
The conversion can be optimized further, as can the above algorithm. This simplified non-conversion version was to show that it can be as efficient as, or more efficient than, a conversion.
On a side note, this recursive algorithm does overwrite the 'S'. It would have to be modified slightly to not overwrite 'S'.
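A compilable C sketch of the same idea, searching the grid directly without building an adjacency list, might look like this. The fixed `ROWS` and `COLS` sizes and the function names `dfs` and `path_exists` are assumptions made for the example; like the pseudocode, it marks visited cells with 'C' and therefore modifies the grid, including the 'S' cell.

```c
#include <stdbool.h>

#define ROWS 6 /* assumed grid size, matching the example in the question */
#define COLS 8

/* Recursive depth-first search from cell (r, c).
 * Visited cells are overwritten with 'C' so they are never revisited. */
static bool dfs(char g[ROWS][COLS + 1], int r, int c)
{
    if (r < 0 || r >= ROWS || c < 0 || c >= COLS)
        return false;                      /* outside the grid */
    if (g[r][c] == 'D')
        return true;                       /* reached the destination */
    if (g[r][c] == 'X' || g[r][c] == 'C')
        return false;                      /* wall or already checked */
    g[r][c] = 'C';                         /* mark 'S' or '.' as checked */
    return dfs(g, r + 1, c) || dfs(g, r - 1, c) ||
           dfs(g, r, c + 1) || dfs(g, r, c - 1);
}

/* Locates 'S' with a linear scan, then searches from it. */
bool path_exists(char g[ROWS][COLS + 1])
{
    for (int r = 0; r < ROWS; ++r)
        for (int c = 0; c < COLS; ++c)
            if (g[r][c] == 'S')
                return dfs(g, r, c);
    return false;                          /* no source in the grid */
}
```

Each row is stored as a string of `COLS` characters plus the terminating NUL, so a grid can be initialized directly from string literals. For very large grids the recursion depth can become a problem, in which case an explicit stack or a queue-based BFS would be the safer choice.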
I have an assignment to implement a binary heap. However, I'm not sure whether I should implement the binary heap as a binary tree data structure or a simple double linked list.
If I should implement as a binary tree, how should I keep track of the last element of the tree in order to insert a new element? In linked list that would be much easier.
So, does binary heap have to be a binary tree? If yes, how to track the last element?
Note: In my assignment there is a statement like this:
But you will implement the binary heap not as an array, but
as a tree.
To be more clear this is my node:
struct Word{
char * word;
int count;
struct Word * parent;
struct Word * left_child;
struct Word * right_child;
};
Solution taken from the question.
by #Yunus Eren Güzel
SOLVED:
After five hours of study I have found a way to implement heap as a pointer based tree.
The insertion algorithm is :
insert
    node = create_a_node
    parent = get_the_last_parent
    node->parent = parent
    if parent->left==NULL
        parent->left=node
    else
        parent->right=node
end insert
get_last_parent parent,&height
    height++
    if parent->left==NULL || parent->right==NULL
        return parent;
    else
        int left_height=0,right_height=0;
        left = get_last_parent(parent->left,&left_height)
        right = get_last_parent(parent->right,&right_height)
        if left_height == right_height
            height += right_height
            return right
        else if left_height > right_height
            height += left_height
            return left
end get_last_parent
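For comparison, here is a sketch of a different, commonly used way to find the insertion point in a pointer-based heap. It is not the poster's get_last_parent method: it keeps a node count and follows the bits of the new node's 1-based level-order position from the root (0 means left, 1 means right). The names `HeapNode`, `Heap` and `heap_insert` are hypothetical, the key is a plain `int` instead of the `Word` struct, and a max-heap is assumed.

```c
#include <stdlib.h>

typedef struct HeapNode {
    int key;
    struct HeapNode *parent, *left, *right;
} HeapNode;

typedef struct {
    HeapNode *root;
    unsigned count;   /* number of nodes currently in the heap */
} Heap;

/* Inserts key into a max-heap. Returns 0 on success, -1 if malloc fails. */
int heap_insert(Heap *h, int key)
{
    HeapNode *n = calloc(1, sizeof *n);
    if (n == NULL)
        return -1;
    n->key = key;

    unsigned pos = ++h->count;          /* 1-based level-order position */
    if (pos == 1) {
        h->root = n;
        return 0;
    }

    /* Find the highest set bit of pos. */
    unsigned bit = 1;
    while ((bit << 1) <= pos)
        bit <<= 1;

    /* Follow the bits below the highest one, except the last,
     * to reach the parent of the new position. */
    HeapNode *p = h->root;
    for (bit >>= 1; bit > 1; bit >>= 1)
        p = (pos & bit) ? p->right : p->left;

    /* The lowest bit selects which child slot is free. */
    if (pos & 1)
        p->right = n;
    else
        p->left = n;
    n->parent = p;

    /* Sift up by swapping keys until the heap property holds. */
    while (n->parent != NULL && n->parent->key < n->key) {
        int tmp = n->key;
        n->key = n->parent->key;
        n->parent->key = tmp;
        n = n->parent;
    }
    return 0;
}
```

Because the path is read off the count, locating the parent takes time proportional to the height of the tree, and no subtree has to be scanned. Removing the last node can use the same walk with the current count.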
A binary heap is, by definition, a binary tree. One way of implementing this in C is to store the tree elements in an array where the array index corresponds to the tree element (numbering the root node 0, its left child 1, its right child 2, and so on). You can then just store the size of the heap (initialized to 0 upon creation and incremented whenever an element is added) and use that to find the next open location.
For basic data structures questions like this, Wikipedia is your friend.
You should implement it as a tree. It will be easy and interesting. The only property a heap needs is that every node has a value less than or equal to its parent's value, if it is a max heap.
In array implementation we impose some more conditions.
If you need help about specific function implementation then you can ask it.
You need to travel down the tree to add a new node.
Call it with the root and the value to be inserted:
insert(node, x){
    if(node->value >= x){
        //insert below this node
        if(node->left == 0)
            node->left = new Node(x);
        else if(node->right == 0)
            node->right = new Node(x);
        else if(node->left->value >= x)
            insert(node->left, x);
        else if(node->right->value >= x)
            insert(node->right, x);
        else
            //insert between node and any one of its children
            insertBW(node, node->left, x);
    }
    else //if x is greater than node value
        //insert between node and its parent
        insertBW(node->parent, node, x);
}
insertBW(p, c, x) is a function which inserts a node containing value x between p and c.
(I haven't tested this code, so please check it for errors.)
insertBW(Node* p, Node* c, T x)
{
    Node* newnode = new Node(x);
    newnode->left = c;      // c becomes the child of the new node
    newnode->parent = p;
    c->parent = newnode;
    if(p == 0) //if node c is root
    {
        Tree.root = newnode;
    }
    else if(p->left == c)
    {
        p->left = newnode;
    }
    else
    {
        p->right = newnode;
    }
}
This to me really seems to be a homework question & it seems you have not done any R&D on your own before asking (sorry for bit harsh words):)
In computer science, a heap is a specialized tree-based data structure that satisfies the heap property: if B is a child node of A, then key(A) ≥ key(B).
I think your teacher wants you to implement a priority queue, which is why a linked list and a heap come up together in the same question. A priority queue can be implemented either as a heap or as a linked list. With a linked list, you have to keep the elements sorted so that the maximum or minimum element sits at the front, depending on whether you want max-first or min-first behaviour. Alternatively, the priority queue can simply be implemented as a heap.
Coming to the last point where you say "But you will implement the binary heap not as an array, but as a tree.", seems to be very irrelevant. Please do check again as to what is required or reproduce the exact question that has been asked in your assignment.
To put it simply, regarding your first question - no. A heap can be anything (array, linked list, tree, and when one must improvise a family of fluffy kittens). Note the definition of a heap: If "B" is a child of "A" then val(A) >= val(B) (or, in case of a min-heap, val(A) <= val(B)).
It's most common to refer to it as a tree (and also implement it as such) because it's easy to think of it as a tree. Also, the time-complexity & performance are good.
Regarding your second question, you gave no information, so as far as I know a solution that searches every node is as good as any other...
For any better answer, more information is required (what limitations do you have, what operations should you support, etc...)
A binary heap can be stored in anything: an array, a linked list, a tree, etc. We just have to keep the right algorithm for accessing the data. For example, with indices starting at 0, the left child of the node at index N is at 2N + 1 and the right child is at 2N + 2. For the last element, you can initialise the heap with a variable count and increment it by 1 every time you insert a new element; this way you can keep track of the last element (the same goes for delete, with some modification to the collection).
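The index arithmetic described above can be sketched in C like this. It illustrates the array layout only (the assignment in the question asks for a tree), and `ArrayHeap`, `HEAP_CAPACITY` and the function names are hypothetical. A max-heap of `int` values is assumed.

```c
#include <stddef.h>

#define HEAP_CAPACITY 64 /* assumed fixed capacity for the example */

typedef struct {
    int data[HEAP_CAPACITY];
    size_t count;         /* number of elements; also the next free index */
} ArrayHeap;

/* Index arithmetic for a 0-based array layout. */
size_t left_child(size_t n)  { return 2 * n + 1; }
size_t right_child(size_t n) { return 2 * n + 2; }
size_t parent_of(size_t n)   { return (n - 1) / 2; } /* only valid for n > 0 */

/* Appends value at index count, then sifts it up (max-heap).
 * Returns 0 on success, -1 if the heap is full. */
int array_heap_push(ArrayHeap *h, int value)
{
    if (h->count == HEAP_CAPACITY)
        return -1;
    size_t i = h->count++;
    h->data[i] = value;
    while (i > 0 && h->data[parent_of(i)] < h->data[i]) {
        int tmp = h->data[i];
        h->data[i] = h->data[parent_of(i)];
        h->data[parent_of(i)] = tmp;
        i = parent_of(i);
    }
    return 0;
}
```

Here `count` plays the role of the "last element" tracker: the last element is always at index `count - 1`, and the next insertion goes to index `count`.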
I need to write a program to find the mode, that is, the integer or integers that occur most often.
So,
1,2,3,4,1,10,4,23,12,4,1 would have mode of 1 and 4.
I'm not really sure what kind of algorithm I should use. I'm having a hard time trying to think of something that would work.
I was thinking of a frequency table of some sort, where I could go through the array and build a linked list. If the linked list doesn't contain the value, add it to the list; if it does, add 1 to its count.
So if i had the same thing from above. loop through
1,2,3,4,1,10,4,23,12,4,1
Then list is empty so add node with number = 1 and value = 1.
2 doesnt exist so add node with number = 2 and value = 1 and so on.
Get to the 1 and 1 already exists so value = 2 now.
I would have to loop through the array, and then loop through the linked list every time to find that value.
Once I am done, I go through the linked list and create a new linked list that will hold the modes. So I set the head to the first element, which is 1. Then I go through the linked list that contains the occurrences and compare the values. If the number of occurrences of the current node is greater than the current highest, then I set the head to this node. If it's equal to the highest, then I add the node to the mode linked list.
Once I am done, I loop through the mode list and print the values.
I'm not sure if this would work. Does anyone see anything wrong with this? Is there an easier way to do this? I was thinking of a hash table too, but I'm not really sure how to do that in C.
Thanks.
If you can keep the entire list of integers in memory, you could sort the list first, which will make repeated values adjacent to each other. Then you can do a single pass over the sorted list to look for the mode. That way, you only need to keep track of the best candidate(s) for the mode seen up until now, along with how many times the current value has been seen so far.
The algorithm you have is fine for a homework assignment. There are all sorts of things you could do to optimise the code, such as:
use a binary tree for efficiency,
use an array of counts where the index is the number (assuming the number range is limited).
But I think you'll find they're not necessary in this case. For homework, the intent is just to show that you understand how to program, not that you know all sorts of tricks for wringing out the last ounce of performance. Your educator will be looking far more for readable, structured, code than tricky optimisations.
I'll describe below what I'd do. You're obviously free to use my advice as much or as little as you wish, depending on how much satisfaction you want to gain at doing it yourself. I'll provide pseudo-code only, which is my standard practice for homework questions.
I would start with a structure holding a number, a count and next pointer (for your linked list) and the global pointer to the first one:
typedef struct sElement {
int number;
int count;
struct sElement *next;
} tElement;
tElement *first = NULL;
Then create some functions for creating and using the list:
tElement *incrementElement (int number);
tElement *getMaxCountElement (void);
tElement *getNextMatching (tElement *ptr, int count);
Those functions will, respectively:
Increment the count for an element (or create it and set count to 1).
Scan all the elements returning the maximum count.
Get the next element pointer matching the count, starting at a given point, or NULL if no more.
The pseudo-code for each:
def incrementElement (number):
    # Find matching number in list or NULL.
    set ptr to first
    while ptr is not NULL:
        if ptr->number is equal to number:
            exit while loop
        set ptr to ptr->next
    # If not found, add one at start with zero count.
    if ptr is NULL:
        set ptr to newly allocated element
        set ptr->number to number
        set ptr->count to 0
        set ptr->next to first
        set first to ptr
    # Increment count.
    set ptr->count to ptr->count + 1
    return ptr
def getMaxCountElement ():
    # List empty, no mode.
    if first is NULL:
        return NULL
    # Assume first element is mode to start with.
    set retptr to first
    # Process all other elements.
    set ptr to first->next
    while ptr is not NULL:
        # Save new mode if you find one.
        if ptr->count is greater than retptr->count:
            set retptr to ptr
        set ptr to ptr->next
    # Return actual mode element pointer.
    return retptr
def getNextMatching (ptr, count):
    # Process all elements.
    while ptr is not NULL:
        # If match on count, return it.
        if ptr->count is equal to count:
            return ptr
        set ptr to ptr->next
    # Went through whole list with no match, return NULL.
    return NULL
Then your main program becomes:
# Process all the numbers, adding to (or incrementing in) list.
for each n in numbers to process:
    incrementElement (n)
# Get the mode quantity, only look for modes if list was non-empty.
maxElem = getMaxCountElement ()
if maxElem is not NULL:
    # Find the first one; while one exists, print it and find the next one.
    ptr = getNextMatching (first, maxElem->count)
    while ptr is not NULL:
        print ptr->number
        ptr = getNextMatching (ptr->next, maxElem->count)
If the range of numbers is known in advance, and is a reasonable number, you can allocate a sufficiently large array for the counters and just do count[i] += 1.
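As a minimal sketch of the counting-array idea, assuming all values lie in the range 0 to `MAX_VALUE` (the bound and the name `find_modes` are made up for this example, and values outside the range are simply skipped):

```c
#include <stddef.h>

#define MAX_VALUE 100 /* assumed inclusive upper bound on the input values */

/* Writes the modes of v[0..n-1] to out in ascending order and returns
 * how many there are. out must have room for MAX_VALUE + 1 integers.
 * Values outside [0, MAX_VALUE] are ignored. */
size_t find_modes(const int *v, size_t n, int *out)
{
    size_t count[MAX_VALUE + 1] = {0};
    size_t best = 0;

    /* First pass: tally each value and remember the highest tally. */
    for (size_t i = 0; i < n; ++i) {
        if (v[i] < 0 || v[i] > MAX_VALUE)
            continue;
        if (++count[v[i]] > best)
            best = count[v[i]];
    }
    if (best == 0)
        return 0;               /* no usable input, no mode */

    /* Second pass: collect every value whose tally equals the best. */
    size_t m = 0;
    for (int x = 0; x <= MAX_VALUE; ++x)
        if (count[x] == best)
            out[m++] = x;
    return m;
}
```

For the input from the question, 1,2,3,4,1,10,4,23,12,4,1, both 1 and 4 occur three times, so the function reports two modes, 1 and 4.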
If the range of numbers is not known in advance, or is too large for the naive use of an array, you could instead maintain a binary tree of values to maintain your counters. This will give you far less searching than a linked list would. Either way you'd have to traverse the array or tree and build an ordering of highest to lowest counts. Again I'd recommend a tree for that, but your list solution could work as well.
Another interesting option could be the use of a priority queue for your extraction phase. Once you have your list of counters completed, walk your tree and insert each value at a priority equal to its count. Then you just pull values from the priority queue until the count goes down.
I would go for a simple hash table based solution.
Each hash table entry is a structure containing a number and its frequency, plus a pointer to the next element for chaining within the hash bucket.
struct ItemFreq {
struct ItemFreq * next_;
int number_;
int frequency_;
};
The processing starts with
max_freq_so_far = 0;
It goes through the list of numbers. For each number, the hash table is searched for an ItemFreq element x such that x.number_ == number.
If no such x is found, then an ItemFreq element is created as { number_ = number, frequency_ = 1 } and inserted into the hash table.
If some x was found, then its frequency_ is incremented.
If frequency_ > max_freq_so_far, then max_freq_so_far = frequency_.
Once the traversal of the list of numbers is complete, we traverse the hash table and print the ItemFreq items whose frequency_ == max_freq_so_far.
The expected complexity of the algorithm is O(N), where N is the number of items in the input list, assuming the hash function spreads the numbers evenly across the buckets.
For a simple and elegant construction of hash table, see section 6.6 of K&R (The C Programming Language).
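A compact sketch of the steps above, using the ItemFreq structure with separate chaining. The bucket count `NBUCKETS`, the modulo hash, and the function names `bump` and `hash_modes` are assumptions for the example, not something the answer specifies. The table is a file-scope array that is never freed, which is acceptable for a short demo but not for reusable code.

```c
#include <stdlib.h>

#define NBUCKETS 101 /* assumed table size; a small prime */

struct ItemFreq {
    struct ItemFreq *next_;
    int number_;
    int frequency_;
};

static struct ItemFreq *table[NBUCKETS];

/* Simple modulo hash; conversion to unsigned keeps negatives in range. */
static unsigned hash(int number)
{
    return (unsigned)number % NBUCKETS;
}

/* Increments the frequency of number, creating its entry if needed.
 * Returns the new frequency, or -1 if malloc fails. */
static int bump(int number)
{
    unsigned h = hash(number);
    struct ItemFreq *x;
    for (x = table[h]; x != NULL; x = x->next_)
        if (x->number_ == number)
            return ++x->frequency_;
    x = malloc(sizeof *x);
    if (x == NULL)
        return -1;
    x->number_ = number;
    x->frequency_ = 1;
    x->next_ = table[h];
    table[h] = x;
    return 1;
}

/* Writes the modes of v[0..n-1] to output (room for n integers needed)
 * and returns their number. Modes are emitted in bucket order. */
int hash_modes(const int *v, int n, int *output)
{
    int max_freq_so_far = 0;
    for (int i = 0; i < n; ++i) {
        int f = bump(v[i]);
        if (f > max_freq_so_far)
            max_freq_so_far = f;
    }
    int m = 0;
    for (int b = 0; b < NBUCKETS; ++b)
        for (struct ItemFreq *x = table[b]; x != NULL; x = x->next_)
            if (x->frequency_ == max_freq_so_far)
                output[m++] = x->number_;
    return m;
}
```

Because the table persists between calls, calling `hash_modes` twice in one program would accumulate counts; a fuller version would pass the table in and free it afterwards.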
This response is a sample for the idea of Paul Kuliniewicz:
#include <stdlib.h>

int CompInt(const void* ptr1, const void* ptr2) {
    const int a = *(const int*)ptr1;
    const int b = *(const int*)ptr2;
    if (a < b) return -1;
    if (a > b) return +1;
    return 0;
}
// This function leaves the modes in output and returns the number
// of modes in output. The output pointer should be able to
// hold at least n integers. Note that v is sorted in place.
int GetModes(int* v, int n, int* output) {
    // Sort the data and initialize the best result.
    qsort(v, n, sizeof *v, CompInt);
    int best = 0;
    int outputSize = 0;
    // Loop through the elements until they are exhausted.
    // (Note that there is no ++i after each iteration.)
    for (int i = 0; i < n;) {
        // This is the beginning of the new group.
        const int begin = i;
        // Move the index until there are no more equal elements.
        for (; i < n && v[i] == v[begin]; ++i);
        // This is one past the last element in the current group.
        const int end = i;
        // Update the best mode found until now.
        if (end - begin > best) {
            best = end - begin;
            outputSize = 0;
        }
        if (end - begin == best)
            output[outputSize++] = v[begin];
    }
    return outputSize;
}
If I have two binary trees, how would I check if the elements in all the nodes are equal.
Any ideas on how to solve this problem?
You would do a parallel tree traversal - choose your order (pre-order, post-order, in-order). If at any time the values stored in the current nodes differ, so do the two trees. If one left node is null and the other isn't, the trees are different; ditto for right nodes.
Does node order matter? I'm assuming for this answer that the two following trees:
  1         1
 / \       / \
3   2     2   3
are not equal, because node position and order is taken into account for the comparison.
A few hints
Do you agree that two empty trees are equal?
Do you agree that two trees that only have a root node, with identical node values, are equal?
Can't you generalize this approach?
Being a bit more precise
Consider this generic tree:
        rootnode(value=V)
           /       \
          /         \
   ---------     ---------
  |  left   |   |  right  |
  | subtree |   | subtree |
   ---------     ---------
rootnode is a single node. The two children are more generic, and represent binary trees. The children can either be empty, or a single node, or a fully-grown binary tree.
Do you agree that this representation is generic enough to represent any kind of non-empty binary tree? Are you able to decompose, say, this simple tree into my representation?
If you understand this concept, then this decomposition can help you to solve the problem. If you do understand the concept, but can't go any further with the algorithm, please comment here and I'll be a bit more specific :)
you could use something like Tree Traversal to check each value.
If the trees are binary search trees, so that an in-order walk will produce a reliable, repeatable ordering of items, the existing answers will work. If they're arbitrary binary trees, you have a much more interesting problem, and should look into hash tables.
My solution would be to flatten the two trees into 2 arrays (using level order), and then iterate through each item and compare. You know both arrays are the same order. You can do simple pre-checks such as if the array sizes differ then the two trees aren't the same.
Level Order is fairly easy to implement, the Wikipedia article on tree traversal basically gives you everything you need, including code. If efficiency is being asked for in the question, then a non-recursive solution is best, and done using a FIFO list (a Queue in C# parlance - I'm not a C programmer).
Run both trees through the same tree traversal logic and compare the outputs. If even a single node's data does not match, the trees don't match.
Or you could just create a simple tree traversal logic and compare the node values at each recursion.
You can use pointers and recursion to check whether the nodes are equal, then check the subtrees. The code can be written as follows in Java.
public boolean sameTree(Node root1, Node root2){
    //base case: both are empty
    if(root1==null && root2==null)
        return true;
    //exactly one is empty, so the shapes differ
    if(root1==null || root2==null)
        return false;
    //equals() is assumed to compare the values stored in the nodes
    if(root1.equals(root2)) {
        //subtrees
        boolean left=sameTree(root1.left,root2.left);
        boolean right=sameTree(root1.right,root2.right);
        return (left && right);
    }//end if
    else{
        return false;
    }//end else
}//end sameTree()
Here is a C version, since the question is tagged C.
int is_same(node* T1,node* T2)
{
if(!T1 && !T2)
return 1;
if(!T1 || !T2)
return 0;
if(T1->data == T2->data)
{
int left = is_same(T1->left,T2->left);
int right = is_same(T1->right,T2->right);
return (left && right);
}
else
return 0;
}
Takes care of structure as well as values.
One line of code is enough to check whether two binary trees are equal (same values and same structure).
bool isEqual(BinaryTreeNode *a, BinaryTreeNode *b)
{
return (a && b) ? (a->m_nValue==b->m_nValue && isEqual(a->m_pLeft,b->m_pLeft) && isEqual(a->m_pRight,b->m_pRight)) : (a == b);
}
If your values are integers in a known range (say, with maximum value n), you can use an array of n + 1 flags. Traverse the first tree using whatever method you want, setting the flag at the index given by each node's data. Then traverse the second tree and check, for every node in it, that array[node.data] is set. If every node passes this check and both trees have the same number of nodes, the trees contain the same elements.
**This assumes that all nodes within each tree are unique.
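A rough C sketch of this flag-array approach follows. It compares only the sets of values, not the shapes of the trees, and it relies on the stated assumptions: values are unique within each tree and lie in 0 to `MAX_VALUE`. The `node` layout, the bound and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_VALUE 1000 /* assumed inclusive upper bound on node values */

typedef struct node {
    int data;
    struct node *left, *right;
} node;

/* Sets seen[data] for every node of t and returns the node count. */
static size_t mark(const node *t, bool *seen)
{
    if (t == NULL)
        return 0;
    seen[t->data] = true;
    return 1 + mark(t->left, seen) + mark(t->right, seen);
}

/* Counts the nodes of t. */
static size_t tree_size(const node *t)
{
    return t == NULL ? 0 : 1 + tree_size(t->left) + tree_size(t->right);
}

/* True if every value in t has its flag set. */
static bool all_seen(const node *t, const bool *seen)
{
    if (t == NULL)
        return true;
    return seen[t->data] && all_seen(t->left, seen) && all_seen(t->right, seen);
}

/* True if both trees hold the same set of values, regardless of shape.
 * Assumes unique values per tree, all within [0, MAX_VALUE]. */
bool same_elements(const node *a, const node *b)
{
    bool seen[MAX_VALUE + 1] = { false };
    size_t na = mark(a, seen);
    return na == tree_size(b) && all_seen(b, seen);
}
```

With unique values, equal node counts plus "every value of the second tree was seen in the first" is enough to conclude the two sets are equal. If the structure must match as well, the recursive comparison shown in the other answers is the right tool.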