## How to implement an A* search algorithm where h(n) = h*(n)? - artificial-intelligence

I'm interested in the steps/logic behind implementing an A* search algorithm if we wanted our h(n) value for every node n to be exactly the perfect heuristic value h*(n).
Am I correct in assuming that, for each node, we would have to perform one full A* search from that node to the goal node in order to calculate h*(n) for it? I know admissible heuristics aim to get as close to h*(n) as possible, to reduce the checking of nodes/paths that are not optimal.

If your heuristic is "perfect" (i.e. it equals the actual remaining distance), it is automatically consistent, so there's nothing special to do: the next node dequeued will always be the next node on an optimal path.
In fact, in this case you can skip A* altogether and just read off the best path directly.
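A minimal sketch of "reading off the best path": with the exact h* in hand, from each node you just step to the neighbor m minimizing c(n, m) + h*(m), no priority queue needed. The graph and the h* table below are made-up toy values, assumed only for illustration.

```python
# Sketch: with a perfect heuristic h*(n), the best path can be read off
# greedily. From node n, the neighbor m on an optimal path satisfies
# c(n, m) + h*(m) == h*(n), so we just take the minimizing neighbor.
# `graph` maps node -> {neighbor: edge_cost}; `h_star` is assumed precomputed.

def follow_perfect_heuristic(graph, h_star, start, goal):
    path = [start]
    node = start
    while node != goal:
        node = min(graph[node], key=lambda m: graph[node][m] + h_star[m])
        path.append(node)
    return path

# Toy graph: S->B->C->G (cost 5) is optimal, while S->A->G costs 6.
graph = {'S': {'A': 3, 'B': 1}, 'A': {'G': 3},
         'B': {'C': 2}, 'C': {'G': 2}, 'G': {}}
h_star = {'S': 5, 'A': 3, 'B': 4, 'C': 2, 'G': 0}
```

Note that this costs one min() per step along the path, so the walk itself is linear in the path length; the expensive part is computing h* in the first place, which is exactly the original question's point.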

## Related

### Breadth first search solution path

I had a question regarding BFS. After expanding nodes, whether in a graph or a tree, what path will BFS take as the solution from the starting point to the goal? Does it take into account the cost of moving from one node to the other and pick the lowest-cost route, or does it take the path with the fewest nodes needed to reach the goal?

The classical breadth-first search algorithm does not take the weight of the edges into account. In each iteration, you simply put the directly reachable neighbors of the current node into the queue, without any further checks. BFS therefore finds the shortest path between two nodes A and B in the sense of the minimal number of "steps" needed to reach node B from node A.
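A minimal sketch of that behavior (the graph below is an assumed toy example): BFS returns the path with the fewest edges, and edge weights never enter the picture.

```python
from collections import deque

# Classical BFS: finds the path with the fewest edges, ignoring weights.
def bfs_path(graph, start, goal):
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # reconstruct the path by walking the parent links back to start
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None  # goal unreachable

# A->B->D has fewer edges than A->C->E->D, so BFS returns it, even if
# weights (not modeled here at all) would have favored the longer route.
graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['E'], 'D': [], 'E': ['D']}
```

For weighted graphs you would switch the queue for a priority queue keyed on accumulated cost, which turns this into Dijkstra's algorithm.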

### tree-decomposition of a graph

I need a starting point to implement an algorithm in C to generate a tree decomposition of a graph given as input. What I'm looking for is an algorithm to do this; I would like pseudocode of the algorithm, and I don't care about the programming language or about complexity. On the web there is a lot of theory but nothing practical. I've tried to work out an algorithm that can be implemented in C, but it's too hard. I've tried to use the following material: Algorithm for generating a tree decomposition https://math.mit.edu/~apost/courses/18.204-2016/18.204_Gerrod_Voigt_final_paper.pdf and a lot of other material, but nothing in that link was useful. Can anyone help me?

So, here is the algorithm to find the centroid of a tree:

1. Select an arbitrary node v.
2. Start a DFS from v, and set up subtree sizes.
3. Re-position to node v (or start at any arbitrary v that belongs to the tree).
4. Check the mathematical condition of a centroid for v.
5. If the condition passes, return the current node as the centroid.
6. Otherwise, move to the adjacent node with the greatest subtree size, and go back to step 4.

And the algorithm for the decomposition:

1. Make the centroid the root of a new tree (which we will call the "centroid tree").
2. Recursively decompose the trees in the resulting forest.
3. Make the centroids of these trees children of the centroid which last split them.

And here is example code: https://www.geeksforgeeks.org/centroid-decomposition-of-tree/amp/
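The steps above can be sketched compactly; Python is used here for readability since the question accepts any language. One caveat worth flagging: this builds a *centroid tree*, which is a different object from a tree decomposition in the treewidth sense described in the linked paper.

```python
# Centroid decomposition sketch. `adj` is the adjacency list of an
# unrooted tree with nodes 0..n-1.
def build_centroid_tree(adj):
    n = len(adj)
    removed = [False] * n          # centroids already cut out of the tree
    size = [0] * n
    centroid_parent = [-1] * n

    def compute_size(v, p):
        # DFS setting up subtree sizes (step 2 above)
        size[v] = 1
        for u in adj[v]:
            if u != p and not removed[u]:
                size[v] += compute_size(u, v)
        return size[v]

    def find_centroid(v, p, total):
        # walk toward the heaviest subtree until the centroid condition
        # holds: no remaining subtree has more than total/2 nodes (steps 4-6)
        for u in adj[v]:
            if u != p and not removed[u] and size[u] > total // 2:
                return find_centroid(u, v, total)
        return v

    def decompose(v, parent_centroid):
        c = find_centroid(v, -1, compute_size(v, -1))
        centroid_parent[c] = parent_centroid
        removed[c] = True
        for u in adj[c]:
            if not removed[u]:
                decompose(u, c)    # recursively decompose the forest
        return c

    root = decompose(0, -1)
    return root, centroid_parent

# Path 0-1-2-3-4: the middle node 2 is the centroid and becomes the root.
adj = [[1], [0, 2], [1, 3], [2, 4], [3]]
root, cp = build_centroid_tree(adj)
```

Translating this to C is mechanical: the recursion becomes explicit functions over fixed-size int arrays, and `removed` becomes a bool array.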

### Is best first search optimal and complete?

I have some doubts regarding the best-first search algorithm. The pseudocode that I have is the following: [best-first search pseudocode]
First doubt: is it complete? I have read that it is not, because it can enter a dead end, but I don't see when that can happen: if the algorithm chooses a node that has no more neighbours, it does not get stuck there, because that node is removed from the open list, and in the next iteration the next node of the open list is processed and the search continues.
Second doubt: is it optimal? I thought that if it visits the nodes closest to the goal during the search, the solution would be the shortest, but that is not the case, and I do not understand why, and therefore what makes this algorithm non-optimal. The heuristic I was using is the straight-line distance between two points. Thanks for your help!

Of course, if the heuristic function underestimates the costs, best-first search is not optimal. In fact, even if your heuristic function is exactly right, greedy best-first search is not guaranteed to be optimal. Here is a counterexample. Consider the following graph (the green numbers are the actual costs and the red numbers are the exact heuristic values). Let's try to find a path from node S to node G. Best-first search would give you S->A->G, following the heuristic function. However, if you look at the graph more closely, you will see that the path S->B->C->G has a lower cost of 5 instead of 6. Thus, this is an example of best-first search performing suboptimally under a perfect heuristic function: it orders the frontier by h alone and ignores the cost already paid to reach each node.
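Since the figure is missing here, the numbers below are assumed values that reproduce the argument: h is the *exact* remaining cost, yet greedy best-first search (frontier ordered by h alone) still returns the suboptimal path S->A->G of cost 6 instead of S->B->C->G of cost 5.

```python
import heapq

# Assumed edge costs and exact heuristic reconstructing the counterexample.
graph = {'S': {'A': 3, 'B': 1}, 'A': {'G': 3},
         'B': {'C': 2}, 'C': {'G': 2}, 'G': {}}
h = {'S': 5, 'A': 3, 'B': 4, 'C': 2, 'G': 0}

def greedy_best_first(graph, h, start, goal):
    # frontier entries are (h-value, node, path-so-far); h alone decides order
    frontier = [(h[start], start, [start])]
    visited = set()
    while frontier:
        _, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for m in graph[node]:
            heapq.heappush(frontier, (h[m], m, path + [m]))
    return None
```

From S the frontier holds A (h=3) and B (h=4); greedy picks A because its *remaining* cost is smaller, never noticing that the edge S->A was expensive.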

In the general case, the best-first search algorithm is complete, since in the worst case it will simply search the whole space. It is also optimal, provided the heuristic function is admissible, meaning it does not overestimate the cost of the path from any node to the goal. (For graph search it also needs to be consistent, i.e. it adheres to the triangle inequality; otherwise a version that never reopens closed nodes can return a suboptimal path.)

Checking your algorithm, I do not see how the heuristic function is calculated, nor where the cost of the path to reach a particular node is computed. It needs to calculate the actual cost of the path to reach a particular node and then add a heuristic estimate of the cost of the path from that node to the goal. The formula is f(n) = g(n) + h(n), where g(n) is the cost of the path to reach the node and h(n) is the heuristic estimating the cost of the cheapest path from n to the goal. Check the implementation of the A* algorithm, which is an example of best-first search applied to path planning.

TL;DR: in best-first search, you need to calculate the cost of a node as the sum of the cost of the path to reach that node and a heuristic estimate of the cost of the path from that node to the goal. If the heuristic function is admissible and consistent, the algorithm will be optimal and complete.
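A minimal sketch of the f(n) = g(n) + h(n) ordering described above, i.e. A* as the informed variant of best-first search. The graph and heuristic values are assumed toy data, chosen so that a greedy h-only ordering would go wrong while the g+h ordering does not.

```python
import heapq

# A*: the frontier is ordered by f(n) = g(n) + h(n), where g is the actual
# cost paid to reach n and h is an admissible estimate of n -> goal.
def a_star(graph, h, start, goal):
    frontier = [(h[start], 0, start, [start])]  # (f, g, node, path)
    best_g = {}                                 # cheapest g seen per node
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        if node in best_g and best_g[node] <= g:
            continue                            # already reached more cheaply
        best_g[node] = g
        for m, cost in graph[node].items():
            heapq.heappush(frontier,
                           (g + cost + h[m], g + cost, m, path + [m]))
    return None, float('inf')

# Toy data: greedy-by-h would pick S->A->G (cost 6); A* finds cost 5.
graph = {'S': {'A': 3, 'B': 1}, 'A': {'G': 3},
         'B': {'C': 2}, 'C': {'G': 2}, 'G': {}}
h = {'S': 5, 'A': 3, 'B': 4, 'C': 2, 'G': 0}
```

Because new entries are pushed rather than old ones mutated, a node reached again by a cheaper path simply gets a second, better-ranked queue entry; the stale entry is skipped by the `best_g` check when it surfaces.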

### A* Graph search

Given the heuristic values h(A)=5, h(B)=1, using A* graph search, it will put A and B on the frontier with f(A)=2+5=7 and f(B)=4+1=5, then select B for expansion, then put G on the frontier with f(G)=4+4=8. It will then select A for expansion, but will not do anything, since both S and B have already been expanded and are not on the frontier; therefore it will select G next and return a non-optimal solution. Is my argument correct?

There are two heuristic concepts here:

- Admissible heuristic: for each node n in the graph, h(n) never overestimates the cost of reaching the goal.
- Consistent heuristic: for each node n in the graph and each successor m of n, h(n) <= h(m) + c(n,m), where c(n,m) is the cost of the arc from n to m.

Your heuristic function is admissible but not consistent, since, as you have shown, h(A) > h(B) + c(A,B): 5 > 2. If the heuristic is consistent, the estimated final cost of a partial solution always grows along the path, i.e. f(n) <= f(m); since f(A) = g(A) + h(A) = 7 > f(B) = g(B) + h(B) = 5, this heuristic function does not satisfy that property.

With respect to A*:

- A* using an admissible heuristic is guaranteed to find the shortest path from the start to the goal.
- A* using a consistent heuristic, in addition to finding the shortest path, also guarantees that once a node is expanded we have already found the shortest path to that node, and therefore no node needs to be re-expanded.

So, answering your question: A* has to be implemented to reopen nodes when a shorter path to a node is found (also updating the new path cost), and this new path will be added to the open set or frontier. Therefore your argument is not correct, since B has to be added to the frontier again (now with the path S->A->B and cost 3). Only if you can restrict A* to consistent heuristic functions can you discard paths to nodes that have already been expanded.

You maintain an ordered priority queue of objects on the frontier. You then take the best candidate, expand it in all available directions, and put the new nodes in the priority queue. So it's possible for A to be pushed to the back of the queue even though the optimal path actually goes through it. It's also possible for A to be hemmed in by neighbours that were reached through suboptimal paths, in which case most implementations won't try to expand it again, as you say. Note, though, that A* with an admissible heuristic is guaranteed to find the globally optimal path; it is only a variant that permanently closes nodes under an inconsistent (or inadmissible) heuristic that may settle for a merely reasonable path rather than the optimal one.

### Best and easiest algorithm to search for a vertex on a Graph?

After implementing most of the common and needed functions for my graph implementation, I realized that a couple of functions (remove vertex, search vertex, and get vertex) don't have the "best" implementation. I'm using adjacency lists with linked lists for my graph, and I was searching one vertex after the other until I found the one I wanted. Like I said, I realized I was not using the "best" implementation. I can have 10000 vertices and need to search for the last one, but that vertex could have a link to the first one, which would speed things up considerably. But that's just a hypothetical case; it may or may not happen. So, what algorithm do you recommend for the search lookup? Our teachers talked mostly about breadth-first and depth-first (and Dijkstra's algorithm, but that's a completely different subject). Between those two, which one do you recommend? It would be perfect if I could implement both, but I don't have time for that; I need to pick one and implement it, as the first-phase deadline is approaching... My guess is to go with depth-first: it seems easier to implement, and looking at the way they work, it seems the better bet. But that really depends on the input. What do you guys suggest?

If you’ve got an adjacency list, searching for a vertex simply means traversing that list. You could perhaps even order the list to decrease the needed lookup operations. A graph traversal (such as DFS or BFS) won’t improve this from a performance point of view.

Finding and deleting nodes in a graph is a "search" problem, not a graph problem, so to do better than the O(n) of linear search, BFS, or DFS, you need to store your nodes in a different data structure optimized for searching, or keep them sorted. That gives you O(log n) find and delete operations with tree structures such as B-trees, or expected O(1) with hash tables. If you want to code the structure yourself, I would go for a hash table, which normally gives very good performance and is reasonably easy to implement.
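A minimal sketch of the hash-table idea (Python dicts and sets are hash tables under the hood, so they stand in for the hand-rolled C structure the question would need; the class and method names are made up for illustration):

```python
# Adjacency structure keyed by a hash table: vertex membership tests and
# vertex removal are average O(1) instead of a linear scan of a list.
class Graph:
    def __init__(self):
        self.adj = {}                  # vertex -> set of neighbors

    def add_vertex(self, v):
        self.adj.setdefault(v, set())

    def add_edge(self, u, v):
        self.add_vertex(u)
        self.add_vertex(v)
        self.adj[u].add(v)
        self.adj[v].add(u)

    def has_vertex(self, v):           # the "search vertex" operation
        return v in self.adj

    def remove_vertex(self, v):        # the "remove vertex" operation
        for u in self.adj.pop(v, ()):  # also unlink v from its neighbors
            self.adj[u].discard(v)
```

The graph traversals (BFS/DFS) are then only needed for questions about *paths*, not for locating a vertex by its key.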

I think BFS would usually be faster on average. Read the wiki pages for DFS and BFS. The reason I say BFS is faster is that it has the property of reaching nodes in order of their distance from your starting node. So if your graph has N nodes and you want to search for node N, and node 1 (the node you start your search from) is linked to N, then you will find it immediately. DFS might expand the whole graph before this happens. DFS will only be faster if you get lucky, while BFS will be faster if the nodes you search for are close to your starting node. In short, they both depend on the input, but I would choose BFS. DFS is also harder to code without recursion, which makes BFS a bit faster in practice, since it is naturally an iterative algorithm. If you can normalize your nodes (number them from 1 to 10000 and access them by number), then you can easily keep Exists[i] = true if node i is in the graph and false otherwise, giving you O(1) lookup time. Otherwise, consider using a hash table if normalization is not possible or you don't want to do it.

Depth-first search is best because:

- it uses much less memory
- it's easier to implement

The depth-first and breadth-first algorithms are almost identical, except for the use of a stack in one (DFS) and a queue in the other (BFS), plus a few required member variables. Implementing them both shouldn't take you much extra time. Additionally, if you have an adjacency list of the vertices, then your lookup will be O(V) anyway, so little to nothing will be gained by using one of the two graph searches.
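The "almost identical" point in code: one traversal routine where the only difference is which end of the deque we pop. The toy graph is assumed for illustration.

```python
from collections import deque

# One frontier structure, two traversals: pop() from the right gives DFS
# (stack behavior), popleft() gives BFS (queue behavior).
def traverse(graph, start, mode='bfs'):
    frontier, seen, order = deque([start]), {start}, []
    while frontier:
        node = frontier.pop() if mode == 'dfs' else frontier.popleft()
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in seen:   # mark at push time to avoid duplicates
                seen.add(neighbor)
                frontier.append(neighbor)
    return order

graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['E'], 'D': [], 'E': []}
```

BFS visits level by level (A, B, C, D, E), while DFS dives down one branch before backtracking; the data structure choice alone accounts for the difference.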

I'd comment on Konrad's post but I can't comment yet so... I'd like to second that it doesn't make a difference in performance if you implement DFS or BFS over a simple linear search through your list. Your search for a particular node in the graph doesn't depend on the structure of the graph, hence it's not necessary to confine yourself to graph algorithms. In terms of coding time, the linear search is the best choice; if you want to brush up your skills in graph algorithms, implement DFS or BFS, whichever you feel like.

If you are searching for a specific vertex and terminating when you find it, I would recommend using A*, which is a best-first search. The idea is that you calculate the distance from the source vertex to the current vertex you are processing, and then "guess" the distance from the current vertex to the target. You start at the source, calculate the distance (0) plus the guess (whatever that might be) and add it to a priority queue where the priority is distance + guess. At each step, you remove the element with the smallest distance + guess, do the calculation for each vertex in its adjacency list and stick those in the priority queue. Stop when you find the target vertex.

If your heuristic (your "guess") is admissible, that is, if it's always an under-estimate, then you are guaranteed to find the shortest path to your target vertex the first time you visit it. If your heuristic is not admissible, then you will have to run the algorithm to completion to find the shortest path (although it sounds like you don't care about the shortest path, just any path).

It's not really any more difficult to implement than a breadth-first search (you just have to add the heuristic, really) but it will probably yield faster results. The only hard part is figuring out your heuristic. For vertices that represent geographical locations, a common heuristic is to use an "as-the-crow-flies" (direct distance) heuristic.
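A sketch of the "as-the-crow-flies" heuristic mentioned above, assuming each vertex carries (x, y) coordinates (the `coords` table and names are made up for illustration). Straight-line distance never overestimates the travel distance through the graph, so it is admissible for such geometric graphs.

```python
import math

# Build an admissible heuristic for a fixed goal from vertex coordinates.
# `coords` maps vertex -> (x, y).
def straight_line(coords, goal):
    gx, gy = coords[goal]
    def h(n):
        x, y = coords[n]
        return math.hypot(x - gx, y - gy)  # Euclidean distance to goal
    return h

coords = {'S': (0, 0), 'T': (3, 4)}
h = straight_line(coords, 'T')
```

Any lower bound on the true distance works here; the tighter the bound, the fewer vertices A* expands.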

Linear search is faster than BFS and DFS. But faster still would be A* with the step cost set to zero. When the step cost is zero, every node's path cost g is zero, so A* won't prioritize nodes with a shorter path; it will only expand the nodes that appear closest to a goal node. That's what you want, since you don't need the shortest path. A* is faster than linear search because linear search will most likely complete after about n/2 iterations (each node has an equal chance of being the goal node), while A* prioritizes nodes that have a higher chance of being the goal node.