Why is DFS not optimal but BFS is optimal? - artificial-intelligence

I have had this question in my mind for a long time but never got a reasonable answer for it:
In artificial intelligence courses, when it comes to search, it is always said that BFS is optimal but DFS is not. Yet I can come up with many examples showing that DFS can even find the answer faster. So can anyone explain this? Am I missing something?

Optimal as in "produces the optimal path", not "is the fastest algorithm possible". When searching a state space for a path to a goal, DFS may produce a much longer path than BFS. Note that BFS is only optimal when actions are unweighted; if different actions have different weights, you need something like A*.
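To make that concrete, here is a minimal sketch (mine, not from the answer above) of an unweighted graph where BFS returns the shortest path from A to D while DFS, expanding one branch to the end before trying the others, returns a longer one. The graph and node names are made up for the illustration.

from collections import deque

graph = {            # hypothetical example graph
    'A': ['B', 'D'],
    'B': ['C'],
    'C': ['D'],
    'D': [],
}

def bfs_path(start, goal):
    # Breadth-first: expands nodes level by level, so the first time the goal
    # is reached, the path uses the fewest possible edges.
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph[node]:
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None

def dfs_path(start, goal, visited=None):
    # Depth-first: commits to one branch before trying the others, so it
    # returns the first path it stumbles on, not necessarily the shortest.
    if visited is None:
        visited = {start}
    if start == goal:
        return [start]
    for nxt in graph[start]:
        if nxt not in visited:
            visited.add(nxt)
            sub = dfs_path(nxt, goal, visited)
            if sub is not None:
                return [start] + sub
    return None

print(bfs_path('A', 'D'))  # ['A', 'D']            -> optimal (1 edge)
print(dfs_path('A', 'D'))  # ['A', 'B', 'C', 'D']  -> found first, but longer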

This is because by optimal strategy they mean the one whose returned solution maximizes the utility.
Regarding this, nothing guarantees that the first solution found by DFS is optimal. Also, BFS is not optimal in a general sense, so your statement as-is is wrong.
The main point here is about being guaranteed that a certain search strategy will always return the optimal result. Any strategy might be the best in a given case, but often (especially in AI), you don't know in what specific case you are, at most you have some hypothesis on that case.
However, DFS is optimal when the search tree is finite, all action costs are identical, and all solutions have the same length. However limiting this may sound, there is an important class of problems that satisfies these conditions: CSPs (constraint satisfaction problems). Maybe all the examples you thought of fall into this (rather common) category.

You can refer to my answer to this question in which I explain why DFS is not optimal and why BFS is not the best choice for solving uninformed state-space search problems.

You can refer to the link below; it considers an example tree and solves it with both approaches.
link

Related

Optimizing functions with GAs

Firstly, sorry if this is not the right Stack Exchange for this question; it might fit better on Math.
I've been working on a project to maximize a function's output using a GA. However, from the limited calculus I know, I thought there were methods to find the maximum of a mathematical function analytically. I'd assume the reason GAs are sometimes used to maximize functions is that there are functions where the mathematical methods don't work. What conditions would those be? Maybe that the function is not continuous or differentiable?
A superficial explanation
For simple™ mathematical functions, the solution would be to use your calculus and find the derivative f'(x). If it's not mathematically possible to differentiate the error function f(x), you need to break out the other tools from your math-box. If the error function's solution space is convex, you could use a numerical approach to find your optimum, such as gradient descent or the conjugate gradient algorithm.
The genetic algorithm (and other search algorithms) comes in handy if the function you are trying to optimize has many unknown variables. This would make calculating the optimum using calculus very difficult. If you are familiar with neural networks: the genetic algorithm has been applied to find optimal weight configurations for neural networks. In these problem instances, there might be thousands of unknown variables (weights).
A mathematical approach would have to search the solution space in some incremental fashion; the genetic algorithm, by contrast, is a bit "all over the place"™. By adjusting the mutation frequency, the GA is able to jump around in the search space.
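To illustrate the point, here is a minimal GA sketch (hypothetical code, not from the answer): it maximizes a rugged objective with many local optima, the kind of function where a purely calculus-based approach tends to get stuck. The objective f, the population size, and the mutation scheme are all assumptions chosen just for this example.

import math
import random

def f(x):
    # Rugged objective with many local optima (chosen only for illustration).
    return math.sin(5 * x) - 0.1 * abs(x)

def genetic_maximize(pop_size=50, generations=200, mutation_rate=0.3):
    population = [random.uniform(-10, 10) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half of the population.
        population.sort(key=f, reverse=True)
        survivors = population[:pop_size // 2]
        # Crossover: each child is the average of two random parents.
        children = [(random.choice(survivors) + random.choice(survivors)) / 2
                    for _ in range(pop_size - len(survivors))]
        # Mutation: occasional random jumps let the search escape local optima.
        children = [c + random.gauss(0, 1) if random.random() < mutation_rate else c
                    for c in children]
        population = survivors + children
    return max(population, key=f)

best = genetic_maximize()
print(best, f(best))  # prints the best candidate found and its fitness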
[Image: an (oversimplified) difficult solution space. Credit: Ciumac Sergiu]
Well, for starters, you don't always have easily differentiable functions. You may especially have very high dimensional functions, which can be very difficult to differentiate.
Moreover, even if you do have a function you can differentiate, what you find are local optima, not global optima, and you may end up finding a great many of them (potentially infinitely many), with no clear way to decide which are better than others.
While you may know enough about some particular function to be able to optimize it with calculus, there is no method guaranteed to get you the global optimum for any possible function. Hence we rely on a number of probabilistic techniques and heuristics, of which genetic algorithms are only one.

Is an optimal algorithm a complete algorithm?

I understand that a complete algorithm is one that finds a solution whenever one exists, and that an optimal algorithm is one that finds a least-cost solution.
But is an optimal algorithm also a complete algorithm? Can you please briefly explain?
Thanks.
Yes, by definition. Finding the optimal solution entails proving optimality. This can be done by finding all solutions or by proving that no solution can have better cost than the one found already. In either case, at least one solution has to be found.
If there is no solution, neither an optimal nor a complete algorithm would find one of course.
The notion of completeness refers to the ability of the algorithm to find a solution if one exists, and if not, to report that no solution is possible.
If an algorithm can find a solution when one exists but is not able to report that no solution exists when that is the case, then it is not complete.
Yes. In simple terms:
Completeness: if a solution exists, the algorithm is guaranteed to find it.
Optimality: the algorithm is guaranteed to find the best (least-cost) solution.
Therefore, for your question: if an algorithm is optimal, it guarantees that the best solution is found, and that automatically ensures completeness, because finding the best solution means a solution has been found (as guaranteed).

Cache Oblivious Search

Please forgive this stupid question, but I didn't find any hint by googling it.
If I have an array (contiguous memory) and I search it sequentially for a given pattern (for example, building the list of all even numbers), am I using a cache-oblivious algorithm? Yes, it's quite a stupid algorithm, but I'm trying to understand here :)
Yes, you are using a cache-oblivious algorithm: the number of memory (block) transfers is O(N/B), which depends on the block size B, but the algorithm itself doesn't depend on any particular value of B. Additionally, this means the scan is both cache-oblivious and cache-efficient.
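For concreteness, here is a minimal sketch of the scan being discussed (illustrative code, not from the answer): it touches the array exactly once, front to back, so it incurs O(N/B) block transfers for any block size B, even though B never appears in the code, which is exactly what "cache-oblivious" means.

def even_numbers(a):
    # One sequential pass over the array: O(N) work, O(N/B) block transfers.
    return [x for x in a if x % 2 == 0]

print(even_numbers([3, 8, 5, 12, 7, 4]))  # [8, 12, 4]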

PACMAN: a short path for eating all the dots

I am trying to find a solution for the PACMAN problem of finding a short path (not the shortest, but a good one) that eats all the dots in a big maze. I've seen a lot of people talking about TSP, Dijkstra, BFS, A*. I don't think this is a TSP since I don't have to go back to where I started and I can repeat nodes if I want. And I don't think Dijkstra, BFS and A* would help, because I'm not looking for the shortest path, and even if that were the case, they wouldn't give an answer in a reasonable time.
Could anyone give me hints on this? What kind of problem is this? Is it a kind of TSP? What kind of algorithms approach this problem in an efficient way? I'd appreciate any hints on implementation.
I take it you're trying to do the contest where you find the shortest path in the big maze in under 30 seconds?
I actually did this last year for fun (my college class didn't do the contest). After weeks of research, I was able to compute an exact solution for the maze in under 30 seconds.
The heuristic I used was actually an exact heuristic. I wrote a bunch of code to find the minimal path length using a much more efficient algorithm based on graph decomposition and dynamic programming, and then fed the results back into A* as the 'heuristic' value.
The key thing to realize is that while the graph is very big (273 nodes), it has a low carving width (5), meaning that it can be solved efficiently using a fixed parameter tractable algorithm.
Hopefully that's enough keywords to get you on the right track.
Update: I wrote a blog post explaining the solution
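For reference, below is a rough, hypothetical sketch of the overall pattern this answer describes (A* driven by a separately computed heuristic); it is not the author's code. The state is (Pac-Man position, set of remaining dots), and the deliberately simple admissible heuristic "number of dots left" stands in for the exact decomposition/dynamic-programming heuristic; neighbours(pos) is assumed to be supplied by the maze.

import heapq, itertools

def astar_eat_all(start, dots, neighbours):
    # neighbours(pos) must yield the positions adjacent to pos in the maze
    # (assumed to be provided by the caller); every move costs 1.
    def h(remaining):
        return len(remaining)   # admissible: each move eats at most one dot
    tie = itertools.count()     # tie-breaker so the heap never compares states
    start_state = (start, frozenset(dots) - {start})
    frontier = [(h(start_state[1]), next(tie), 0, start_state, [start])]
    best_g = {start_state: 0}
    while frontier:
        _, _, g, (pos, remaining), path = heapq.heappop(frontier)
        if not remaining:
            return path                              # every dot has been eaten
        if g > best_g.get((pos, remaining), float('inf')):
            continue                                 # stale heap entry
        for nxt in neighbours(pos):
            state = (nxt, remaining - {nxt})
            ng = g + 1
            if ng < best_g.get(state, float('inf')):
                best_g[state] = ng
                heapq.heappush(frontier,
                               (ng + h(state[1]), next(tie), ng, state, path + [nxt]))
    return None

# Usage (hypothetical): astar_eat_all((1, 1), dot_positions, maze_neighbours)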

Did you apply computational complexity theory in real life?

I'm taking a course in computational complexity and have so far had the impression that it won't be of much help to a developer.
I might be wrong, but if you have gone down this path before, could you please provide an example of how complexity theory helped you in your work? Tons of thanks.
O(1): Plain code without loops. Just flows through. Lookups in a lookup table are O(1), too.
O(log(n)): efficiently optimized algorithms. Example: binary tree algorithms and binary search. Usually doesn't hurt. You're lucky if you have such an algorithm at hand.
O(n): a single loop over data. Hurts for very large n.
O(n*log(n)): an algorithm that does some sort of divide and conquer strategy. Hurts for large n. Typical example: merge sort
O(n*n): a nested loop of some sort. Hurts even with small n. Common with naive matrix calculations. You want to avoid this sort of algorithm if you can.
O(n^x) for x>2: a wicked construction with multiple nested loops. Hurts for very small n.
O(x^n), O(n!) and worse: freaky (and often recursive) algorithms you don't want to have in production code except in very controlled cases, for very small n, and only if there really is no better alternative. Computation time may explode just going from n to n+1.
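As a small illustration of the O(n*n) and O(n) rows above (my example, not the answer's): checking a collection for duplicates with a nested loop versus with a set.

def has_duplicates_quadratic(items):
    for i in range(len(items)):              # O(n) iterations...
        for j in range(i + 1, len(items)):   # ...each doing O(n) work -> O(n*n)
            if items[i] == items[j]:
                return True
    return False

def has_duplicates_linear(items):
    seen = set()
    for x in items:          # one pass with O(1) average set lookups -> O(n)
        if x in seen:
            return True
        seen.add(x)
    return False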
Moving your algorithm down from a higher complexity class can make your algorithm fly. Think of the Fourier transform, which has a naive O(n*n) algorithm that was unusable with 1960s hardware except in rare cases. Then Cooley and Tukey made some clever complexity reductions (down to O(n*log(n))) by re-using already calculated values. That led to the widespread adoption of the FFT in signal processing. And in the end it's also why Steve Jobs made a fortune with the iPod.
Simple example: Naive C programmers write this sort of loop:
for (int cnt = 0; cnt < strlen(s); cnt++) {
    /* some code */
}
That's an O(n*n) algorithm, because strlen() is itself O(n) and gets re-evaluated on every iteration of the loop. Nesting loops leads to multiplication of complexities inside the big-O: O(n) inside O(n) gives O(n*n), and O(n^3) inside O(n) gives O(n^4). In the example, precalculating the string length before the loop immediately turns it into O(n). Joel has also written about this.
Yet the complexity class is not everything. You have to keep an eye on the size of n. Reworking an O(n*log(n)) algorithm into an O(n) one won't help if the number of (now linear) instructions grows massively due to the reworking. And if n is small anyway, optimizing won't give you much bang either.
While it is true that one can get really far in software development without the slightest understanding of algorithmic complexity, I find I use my knowledge of complexity all the time, though at this point it is often without realizing it. One thing that learning about complexity gives you as a software developer is a way to compare dissimilar algorithms that do the same thing (sorting algorithms are the classic example, but most people don't actually write their own sorts). The more useful thing it gives you is a way to quickly describe an algorithm.
For example, consider SQL. SQL is used every day by a very large number of programmers. If you were to see the following query, your understanding of the query is very different if you've studied complexity.
SELECT User.UserId, COUNT(Order.UserId) AS OrderCount FROM User JOIN Order ON User.UserId = Order.UserId GROUP BY User.UserId
If you have studied complexity, then you would understand if someone said it was O(n^2) for a certain DBMS. Without complexity theory, the person would have to explain about table scans and such. If we add an index to the Order table
CREATE INDEX ORDER_USERID ON Order(UserId)
Then the above query might be O(n log n), which would make a huge difference for a large DB, but for a small one, it is nothing at all.
One might argue that complexity theory is not needed to understand how databases work, and they would be correct, but complexity theory gives a language for thinking about and talking about algorithms working on data.
For most types of programming work, the theory part and the proofs may not be useful in themselves, but what they do is give you the intuition to immediately say "this algorithm is O(n^2), so we can't run it on these one million data points". Even in the most elementary processing of large amounts of data you'll run into this.
Off the top of my head, complexity theory has been important to me in business data processing, GIS, graphics programming and understanding algorithms in general. It's one of the most useful lessons you can get from CS studies compared to what you'd generally self-study otherwise.
Computers are not smart; they will do whatever you instruct them to do. Compilers can optimize code a bit for you, but they can't optimize algorithms. The human brain works differently, and that is why you need to understand Big O. Consider calculating Fibonacci numbers. We all know F(n) = F(n-1) + F(n-2), and starting with 1, 1 you can easily calculate the following numbers without much effort, in linear time. But if you tell a computer to calculate it with that formula (recursively), it won't be linear (at least, in imperative languages). Somehow our brain optimized the algorithm, but a compiler can't do this. So you have to work on the algorithm to make it better.
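A small sketch of that point (hypothetical code): the direct recursive translation of F(n) = F(n-1) + F(n-2) does exponential work, while the "how a human would do it" version is linear.

def fib_recursive(n):
    # Recomputes the same subproblems over and over: roughly O(1.6^n) calls.
    if n < 2:
        return n
    return fib_recursive(n - 1) + fib_recursive(n - 2)

def fib_iterative(n):
    # Keeps only the last two values, exactly like working it out by hand: O(n).
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(fib_iterative(30), fib_recursive(30))  # same answer (832040), wildly different cost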
And then you need training: to spot the brain's optimizations that look so obvious, to see when code might be inefficient, to know patterns for bad and good algorithms (in terms of computational complexity), and so on. Basically, those courses serve several purposes:
understand execution patterns and data structures and what effect they have on the time your program needs to finish;
train your mind to spot potential problems in algorithm, when it could be inefficient on large data sets. Or understand the results of profiling;
learn well-known ways to improve algorithms by reducing their computational complexity;
prepare yourself to pass an interview in the cool company :)
It's extremely important. If you don't understand how to estimate and figure out how long your algorithms will take to run, then you will end up writing some pretty slow code. I think about computational complexity all the time when writing algorithms. It's something that should always be on your mind when programming.
This is especially true in many cases because while your app may work fine on your desktop computer with a small test data set, it's important to understand how quickly your app will respond once you go live with it, and there are hundreds of thousands of people using it.
Yes, I frequently use Big-O notation, or rather, I use the thought processes behind it, not the notation itself. Largely because so few developers in the organization(s) I frequent understand it. I don't mean to be disrespectful to those people, but in my experience, knowledge of this stuff is one of those things that "sorts the men from the boys".
I wonder if this is one of those questions that can only receive "yes" answers? It strikes me that the set of people that understand computational complexity is roughly equivalent to the set of people that think it's important. So, anyone that might answer no perhaps doesn't understand the question and therefore would skip on to the next question rather than pause to respond. Just a thought ;-)
There are points in time when you will face problems that require real thought. There are many real-world problems that require manipulation of large sets of data...
Examples are:
Maps applications... like Google Maps - how would you process the worldwide road-line data and draw it? And you need to draw it fast!
Logistics applications... think traveling salesman on steroids
Data mining... every big enterprise requires it; how would you mine a database containing 100 tables and 10m+ rows and come up with useful results before the trends become outdated?
Taking a course in computational complexity will help you in analyzing and choosing/creating algorithms that are efficient for such scenarios.
Believe me, something as simple as reducing a coefficient, say from T(3n) down to T(2n), can make a HUGE difference when "n" is measured in days, if not months.
There's lots of good advice here, and I'm sure most programmers have used their complexity knowledge once in a while.
However, I should say that understanding computational complexity is of extreme importance in the field of games! Yes, you heard it: that "useless" stuff is the kind of stuff game programming lives on.
I'd bet very few professionals care about Big-O as much as game programmers do.
I use complexity calculations regularly, largely because I work in the geospatial domain with very large datasets, e.g. processes involving millions and occasionally billions of cartesian coordinates. Once you start hitting multi-dimensional problems, complexity can be a real issue, as greedy algorithms that would be O(n) in one dimension suddenly hop to O(n^3) in three dimensions and it doesn't take much data to create a serious bottleneck. As I mentioned in a similar post, you also see big O notation becoming cumbersome when you start dealing with groups of complex objects of varying size. The order of complexity can also be very data dependent, with typical cases performing much better than general cases for well designed ad hoc algorithms.
It is also worth testing your algorithms under a profiler to see if what you have designed is what you have achieved. I find most bottlenecks are resolved much better with algorithm tweaking than improved processor speed for all the obvious reasons.
For more reading on general algorithms and their complexities, I found Sedgewick's work both informative and accessible. For spatial algorithms, O'Rourke's book on computational geometry is excellent.
In your normal life, away from a computer, you should apply concepts of complexity and parallel processing. This will allow you to be more efficient. Cache coherency. That sort of thing.
Yes, my knowledge of sorting algorithms came in handy one day when I had to sort a stack of student exams. I used merge sort (but not quicksort or heapsort). When programming, I just employ whatever sorting routine the library offers. (I haven't had to sort a really large amount of data yet.)
I use complexity theory in programming all the time, mostly in deciding which data structures to use, but also when deciding whether or when to sort things, and for many other decisions.
'yes' and 'no'
yes) I frequently use Big-O notation when developing and implementing algorithms.
E.g. when you have to handle 10^3 items and the complexity of the first algorithm is O(n log(n)) and of the second one O(n^3), you can simply say that the first algorithm is almost real-time while the second requires considerable computation.
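A back-of-the-envelope check of that comparison (treating the big-O constants as 1, which is of course an assumption):

import math
n = 10**3
print(n * math.log2(n))   # ~10,000 steps        -> effectively instant
print(n ** 3)             # 1,000,000,000 steps  -> easily seconds or worse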
Sometimes knowledge of NP complexity classes can be useful. It can help you realize that you can stop trying to invent an efficient algorithm when some NP-complete problem can be reduced to the problem you are thinking about.
no) What I have described above is only a small part of complexity theory. As a result, it is difficult to say that I use it; I use only a minor part of it.
I should admit that there are many software development projects which don't involve algorithm development or use algorithms in any sophisticated way. In such cases complexity theory is useless. Ordinary users of algorithms frequently operate with words like "fast" and "slow", "x seconds", etc.
@Martin: Can you please elaborate on the thought processes behind it?
It might not be as explicit as sitting down and working out the Big-O notation for a solution, but it creates an awareness of the problem, and that steers you towards a more efficient answer and away from problems in the approaches you might take, e.g. O(n*n) versus something faster, such as searching for words stored in a list versus stored in a trie (a contrived example).
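A contrived sketch matching that example (my code, not the commenter's): membership in a plain list is O(n) per query, while a trie lookup is O(m), where m is the length of the word.

def build_trie(words):
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node['$'] = True          # end-of-word marker
    return root

def in_trie(root, word):
    node = root
    for ch in word:
        if ch not in node:
            return False
        node = node[ch]
    return '$' in node

words = ['cat', 'car', 'cart']
trie = build_trie(words)
print('cart' in words)        # O(n) scan of the list
print(in_trie(trie, 'cart'))  # O(len('cart')) walk down the trie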
I find that it makes a difference with what data structures I'll choose to use, and how I'll work on large numbers of records.
A good example could be when your boss tells you to write some program and you can demonstrate, using computational complexity theory, that what your boss is asking for is not feasible.
