Artificial Intelligence Uniform-cost search - artificial-intelligence

I have some questions about the search functions in artificial intelligence that I cannot understand. I know that Uniform-cost search is a special case of the A* search algorithm if its heuristic is a constant function.
Also I know that Breadth-first search (BFS) is a special case of A* when all edge costs are positive and identical.
Best-first search is also a special case of A* search.
But ho can I show that? How can I prove that all the above are correct?

Erm, I don't really know how to state it elegantly, but everything you said is true by... definition!
In A*, you've got a heuristic function, and you greedily explore your tree selecting the most promising branches.
If the cost for each edge is identical, then A* only starts with the nodes that are at "distance 1" because they all have the smallest cost : 1. Then, A* explores the nodes at "distance 2" from the root node, because their cost is now the smallest from all possible nodes : 2. Recursively, this leads to BFS.
This is identical for Uniform-cost. For Best-First search, it's a bit different, A* is a special case of Best-first search, not the other way round =).

Related

How to implement an A* search algorithm where h(n) = h*(n)?

I'm interested in the steps/logic behind implementing an A* search algorithm if we wanted our h(n) value for every n to be exactly the perfect heuristic value (h*(n)).
Am I correct in the assumption that for each node, we would have to perform 1 A* traversal of the tree from that node till the end node in order to calculate h*(n) for it? I know admissible heuristics aim to get as close to h*(n) as possible to reduce checking nodes/paths that are not optimal.
If your heuristic is "perfect" (ie. it represents the actual distance), that will always be consistent so there's nothing special to do. The next node dequeued will always be the next node in the path.
In fact in this case you can skip A* altogether and just immediately calculate the best path.

Is best first search optimal and complete?

I have some doubts regarding best first search algorithm. The pseudocode that I have is the following:
best first search pseudocode
First doubt: is it complete? I have read that it is not because it can enter in a dead end, but I don't know when can happen, because if the algorithm chooses a node that has not more neighbours it does not get stucked in it because this node is remove from the open list and in the next iteration the following node of the open list is treated and the search continues.
Second doubt: is it optimal? I thought that if it is visiting the nodes closer to the goal along the search process, then the solution would be the shortest, but it is not in that way and I do not know the reason for that and therefore, the reason that makes this algorithm not optimal.
The heuristic I was using is the straight line distance between two points.
Thanks for your help!!
Of course, if heuristic function underestimates the costs, best first search is not optimal. In fact, even if your heuristic function is exactly right, best first search is never guaranteed to be optimal. Here is a counter example. Consider the following graph:
The green numbers are the actual costs and the red numbers are the exact heuristic function. Let's try to find a path from node S to node G.
Best first search would give you S->A->G following the heuristic function. However, if you look at the graph closer, you would see that the path S->B->C->G has lower cost of 5 instead of 6. Thus, this is an example of best first search performing suboptimal under perfect heuristic function.
In general case best first search algorithm is complete as in worst case scenario it will search the whole space (worst option). Now, it should be also optimal - given the heuristic function is admissible - meaning it does not overestimate the cost of the path from any of the nodes to goal. (It also needs to be consistent - that means that it adheres to triangle inequality, if it is not then the algorithm would not be complete - as it could enter a cycle)
Checking your algorithm I do not see how the heuristic function is calculated. Also I do not see there is calculated the cost of the path to get to the particular node.
So, it needs to calculate the actual cost of the path to reach a particular node and then it needs to add a heuristics estimate of the cost of the path from the node towards goal.
The formula is f(n)=g(n)+h(n) where g(n) is the cost of the path to reach the node and h(n) is the heuristics estimating the cost of the cheapest path from n to the goal.
Check the implementation of A* algorithm which is an example of best first search on path planning.
TLDR In best first search, you need to calculate the cost of a node as a sum of the cost of the path to get to that node and the heuristic function that estimate the cost of the path from that node to the goal. If the heuristic function will be admissible and consistent the algorithm will be optimal and complete.

best-first Vs. breadth-first

What is the difference between best-first-search and the breadth-first-search ? and which one do we call "BFS" ?
To answer your second question first:
which one do we call "BFS" ?
Typically when we refer to BFS, we are talking Breadth-first Search.
What is the difference between best-first-search and the breadth-first-search
The analogy that I like to consult when comparing such algorithms is robots digging for gold.
Given a hill, our goal is to simply find gold.
Breadth-first search has no prior knowledge of the whereabouts of the gold so the robot simply digs 1 foot deep along the 10-foot strip if it doesn't find any gold, it digs 1 foot deeper.
Best-first search, however, has a built-in metal detector, thus meaning it has prior knowledge. There is, of course, the cost in having a metal detector, and cost in turning it on and seeing which place would be the best to start digging.
Best-first search is informed whereas Breadth-first search is uninformed, as in one has a metal detector and the other doesn't!
Breadth-first search is complete, meaning it'll find a solution if one exists, and given enough resources will find the optimal solution.
Best-first search is also complete provided the heuristic — estimator of the cost/ so the prior knowledge — is admissible — meaning it overestimates the cost of getting to the solution)
I got the BFS image from http://slideplayer.com/slide/9063462/ the Best-first search is my failed attempt at photoshop!
Thats 2 algorithms to search a graph (tree).
Breadth first looks at all elements(nodes) of a certain depth, trying to find a solutuion (searched value or whatever) then continous one level deeper and looks at every node and so on.
Best first looks at the "best" node defined mostly by a heuristic, checks the best subnode of that node and so on.
A* would be an example for heursitic (best first search) and its way faster. But you need a heuristic what you wouldn't need for breadth search.
Creating a heuristic needs some own effort. Breadth first is out of the box.

how search algorithm exactly behaves with this function: f(n)=3h(n)

Imagine I am performing a best-first search based on the function f(n)=cg(n)+(3-c)h(n) for selecting the next node to expand. If I use c=0 I will get f(n)=3h(n). Can I say that with c=0 the search algorithm behaves exactly like Best first search or greedy best first search?
(I am in doubt between the two. My answer is yes because it just looks ahead and do not consider g(n) and also my feeling is best first search because it overestimate by multiplying by 3 so it is not greedy but I am not sure if I am right.)
You are referring to algorithms like A* which perform a best-first search based on f-cost, where f(n) = g(n) + h(n) and the g-cost of a node is the cost to reach that node, while h-cost is the estimated cost to reach the goal.
Dijkstra's algorithm uses f(n) = g(n).
Pure heuristic search or greedy best-first search uses f(n) = h(n).
Your question is what happens if I have:
f(n) = c*g(n) + (3-c)*h(n)
When c = 0 this reduces to:
f(n) = (3)*h(n)
The constant of 3 here has no influence on the search order, because all nodes are weighted equally in the same way. So, this is closest to a greedy best-first search.

Best and easiest algorithm to search for a vertex on a Graph?

After implementing most of the common and needed functions for my Graph implementation, I realized that a couple of functions (remove vertex, search vertex and get vertex) don't have the "best" implementation.
I'm using adjacency lists with linked lists for my Graph implementation and I was searching one vertex after the other until it finds the one I want. Like I said, I realized I was not using the "best" implementation. I can have 10000 vertices and need to search for the last one, but that vertex could have a link to the first one, which would speed up things considerably. But that's just an hypothetical case, it may or may not happen.
So, what algorithm do you recommend for search lookup? Our teachers talked about Breadth-first and Depth-first mostly (and Dikjstra' algorithm, but that's a completely different subject). Between those two, which one do you recommend?
It would be perfect if I could implement both but I don't have time for that, I need to pick up one and implement it has the first phase deadline is approaching...
My guess, is to go with Depth-first, seems easier to implement and looking at the way they work, it seems a best bet. But that really depends on the input.
But what do you guys suggest?
If you’ve got an adjacency list, searching for a vertex simply means traversing that list. You could perhaps even order the list to decrease the needed lookup operations.
A graph traversal (such as DFS or BFS) won’t improve this from a performance point of view.
Finding and deleting nodes in a graph is a "search" problem not a graph problem, so to make it better than O(n) = linear search, BFS, DFS, you need to store your nodes in a different data structure optimized for searching or sort them. This gives you O(log n) for find and delete operations. Candidatas are tree structures like b-trees or hash tables. If you want to code the stuff yourself I would go for a hash table which normally gives very good performance and is reasonably easy to implement.
I think BFS would usually be faster an average. Read the wiki pages for DFS and BFS.
The reason I say BFS is faster is because it has the property of reaching nodes in order of their distance from your starting node. So if your graph has N nodes and you want to search for node N and node 1, which is the node you start your search form, is linked to N, then you will find it immediately. DFS might expand the whole graph before this happens however. DFS will only be faster if you get lucky, while BFS will be faster if the nodes you search for are close to your starting node. In short, they both depend on the input, but I would choose BFS.
DFS is also harder to code without recursion, which makes BFS a bit faster in practice, since it is an iterative algorithm.
If you can normalize your nodes (number them from 1 to 10 000 and access them by number), then you can easily keep Exists[i] = true if node i is in the graph and false otherwise, giving you O(1) lookup time. Otherwise, consider using a hash table if normalization is not possible or you don't want to do it.
Depth-first search is best because
It uses much less memory
Easier to implement
the depth first and breadth first algorithms are almost identical, except for the use of a stack in one (DFS), a queue in the other (BFS), and a few required member variables. Implementing them both shouldn't take you much extra time.
Additionally if you have an adjacency list of the vertices then your look up with be O(V) anyway. So little to nothing will be gained via using one of the two other searches.
I'd comment on Konrad's post but I can't comment yet so... I'd like to second that it doesn't make a difference in performance if you implement DFS or BFS over a simple linear search through your list. Your search for a particular node in the graph doesn't depend on the structure of the graph, hence it's not necessary to confine yourself to graph algorithms. In terms of coding time, the linear search is the best choice; if you want to brush up your skills in graph algorithms, implement DFS or BFS, whichever you feel like.
If you are searching for a specific vertex and terminating when you find it, I would recommend using A*, which is a best-first search.
The idea is that you calculate the distance from the source vertex to the current vertex you are processing, and then "guess" the distance from the current vertex to the target.
You start at the source, calculate the distance (0) plus the guess (whatever that might be) and add it to a priority queue where the priority is distance + guess. At each step, you remove the element with the smallest distance + guess, do the calculation for each vertex in its adjacency list and stick those in the priority queue. Stop when you find the target vertex.
If your heuristic (your "guess") is admissible, that is, if it's always an under-estimate, then you are guaranteed to find the shortest path to your target vertex the first time you visit it. If your heuristic is not admissible, then you will have to run the algorithm to completion to find the shortest path (although it sounds like you don't care about the shortest path, just any path).
It's not really any more difficult to implement than a breadth-first search (you just have to add the heuristic, really) but it will probably yield faster results. The only hard part is figuring out your heuristic. For vertices that represent geographical locations, a common heuristic is to use an "as-the-crow-flies" (direct distance) heuristic.
Linear search is faster than BFS and DFS. But faster than linear search would be A* with the step cost set to zero. When the step cost is zero, A* will only expand the nodes that are closest to a goal node. If the step cost is zero then every node's path cost is zero and A* won't prioritize nodes with a shorter path. That's what you want since you don't need the shortest path.
A* is faster than linear search because linear search will most likely complete after O(n/2) iterations (each node has an equal chance of being a goal node) but A* prioritizes nodes that have a higher chance of being a goal node.

Resources