I remember when I was in college we went over some problem where there was a smart agent that was on a grid of squares and it had to clean the squares. It was awarded points for cleaning. It also was deducted points for moving. It had to refuel every now and then and at the end it got a final score based on how many squares on the grid were dirty or clean.
I'm trying to study that problem since it was very interesting when I saw it in college, however I cannot find anything on wikipedia or anywhere online. Is there a specific name for that problem that you know about? Or maybe it was just something my teacher came up with for the class.
I'm searching for AI cleaning agent and similar things, but I don't find anything. I don't know, I'm thinking maybe it has some other name.
If you know where I can find more information about this problem I would appreciate it. Thanks.
Perhaps a "stigmergy" approach is closely related to your problem. There is a starting point here, and you can find something by searching for "dead ants" and "robots" on google scholar.
Basically: instead of modelling a precise strategy you work toward a probabilistic approach. Ants (probably) collect their deads by piling up according to a simple rule such as "if there is a pile of dead ants there, I bring this corpse hither; otherwise, I'll make a new pile". You can start by simplifying your 'cleaning' situation with that, and see where you go.
Also, I think (another?) suitable approach could be modelled with a Genetic Algorithm using a carefully chosen combination of fitness functions such as:
the end number of 'clean' tiles
the number of steps made by the robot
of course if the robots 'dies' out of starvation it automatically removes itself from the gene pool, a-la darwin awards :)
You could start by modelling a very, very simple genotype that will be 'computed' into a behaviour. Consider using a simple GA such as this one by Inman Harvey, then to each gene assign either a part of the strategy, or a complete behaviour. E.g.: if gene A is turned to 1 then the robot will try to wander randomly; if gene B is also turned to 1, then it will give priority to self-charging unless there are dirty tiles at distance X. Or use floats and model probability. Your mileage may vary but I can assure it will be fun :)
The problem is reminiscent of Shakey, although there's cleaning involved (which is like the Roomba -- a device that can also be programmed to perform these very tasks).
If the "problem space" (or room) is small enough, you can solve for an optimal solution using a simple A*-based search, but likely it won't be, since that won't leave for very interesting problems.
The machine learning approach suggested here using genetic algorithms is an interesting approach. Given the problem domain you would only have one "rule" (a move-to action, since clean could be eliminated by implicitly cleaning any square you move to that is dirty) so your learner would essentially be learning how to move around an environment. The problem there would be to build a learner that would be adaptable to any given floor plan, instead of just becoming proficient at cleaning a very specific space.
Whatever approach you have, I'd also consider doing a further meta-reasoning step if the problem sets are big enough, and use a partition approach to divide the floor up into separate areas and then conquering them one at a time.
Can you use techniques to create data to use "offline"? In that case, I'd even consider creating a "database" of optimal routes to take to clean certain floor spaces (1x1 up to, say, 5x5) that include all possible start and end squares. This is similar to "endgame databases" that game AIs use to effectively "solve" games once they reach a certain depth (c.f. Chinook).
This problem reminds me of this. A similar problem is briefly mentioned in the book Complexity as an example of a genetic algorithm. These versions are simplified though, they don't take into account fuel consumption.
Related
In backpropagation training, during gradient descent down the error surface, network with large amount of neurons in hidden layer can get stuck in local minimum. I have read that reinitializing weights to random numbers in all cases will eventually avoid this problem.
This means that there always IS a set of "correct" initial weight values. (Is this safe to assume?)
I need to find or make an algorithm that finds them.
I have tried googling the algorithm, tried devising it myself but to no avail. Can anyone propose a solution? Perhaps a name of algorithm that I can search for?
Note: this is a regular feed-forward 3-layer burrito :)
Note: I know attempts have been made to use GAs for that purpose, but that requires re-training the network on each iteration which is time costly when it gets large enough.
Thanks in advance.
There is never a guarantee that you will not get stuck in a local optimum, sadly. Unless you can prove certain properties about the function you are trying to optimize, local optima exist and hill-climbing methods will fall prey to them. (And typically, if you can prove the things you need to prove, you can also select a better tool than a neural network.)
One classic technique is to gradually reduce the learning rate, then increase it and slowly draw it down, again, several times. Raising the learning rate reduces the stability of the algorithm, but gives the algorithm the ability to jump out of a local optimum. This is closely related to simulated annealing.
I am surprised that Google has not helped you, here, as this is a topic with many published papers: Try terms like, "local minima" and "local minima problem" in conjunction with neural networks and backpropagation. You should see many references to improved backprop methods.
For my bachelor's thesis I want to write a genetic algorithm that learns to play the game of Stratego (if you don't know this game, it's probably safe to assume I said chess). I haven't ever before done actual AI projects, so it's an eye-opener to see how little I actually know of implementing things.
The thing I'm stuck with is coming up with a good representation for an actual strategy. I'm probably making some thinking error, but some problems I encounter:
I don't assume you would have a representation containing a lot of
transitions between board positions, since that would just be
bruteforcing it, right?
What could branches of a decision tree look
like? Any representation I come up with don't have interchangeable
branches... If I were to use a bit string, which is apparently also
common, what would the bits represent?
Do I assign scores to the distance between certain pieces? How would I represent that?
I think I ought to know these things after three+ years of study, so I feel pretty stupid - this must look likeI have no clue at all. Still, any help or tips on what to Google would be appreciated!
I think, you could define a decision model and then try to optimize the parameters of that model. You can create multi-stage decision models also. I once did something similar for solving a dynamic dial-a-ride problem (paper here) by modeling it as a two stage linear decision problem. To give you an example, you could:
For each of your figures decide which one is to move next. Each figure is characterized by certain features derived from its position on the board, e.g. ability to make a score, danger, protecting x other figures, and so on. Each of these features can be combined (e.g. in a linear model, through a neural network, through a symbolic expression tree, a decision tree, ...) and give you a rank on which figure to act next with.
Acting with the figure you selected. Again there are a certain number of actions that can be taken, each has certain features. Again you can combine and rank them and one action will have the highest priority. This is the one you choose to perform.
The features you extract can be very simple or insanely complex, it's up to what you think will work best vs what takes how long to compute.
To evaluate and improve the quality of your decision model you can then simulate these decisions in several games against opponents and train the parameters of the model that combines these features to rank the moves (e.g. using a GA). This way you tune the model to win as many games as possible against the specified opponents. You can test the generality of that model by playing against opponents it has not seen before.
As Mathew Hall just said, you can use GP for this (if your model is a complex rule), but this is just one kind of model. In my case a linear combination of the weights did very well.
Btw, if you're interested we've also got a software on heuristic optimization which provides you with GA, GP and that stuff. It's called HeuristicLab. It's GPL and open source, but comes with a GUI (Windows). We've some Howto on how to evaluate the fitness function in an external program (data exchange using protocol buffers), so you can work on your simulation and your decision model and let the algorithms present in HeuristicLab optimize your parameters.
Vincent,
First, don't feel stupid. You've been (I infer) studying basic computer science for three years; now you're applying those basic techniques to something pretty specialized-- a particular application (Stratego) in a narrow field (artificial intelligence.)
Second, make sure your advisor fully understands the rules of Stratego. Stratego is played on a larger board, with more pieces (and more types of pieces) than chess. This gives it a vastly larger space of legal positions, and a vastly larger space of legal moves. It is also a game of hidden information, increasing the difficulty yet again. Your advisor may want to limit the scope of the project, e.g., concentrate on a variant with full observation. I don't know why you think this is simpler, except that the moves of the pieces are a little simpler.
Third, I think the right thing to do at first is to take a look at how games in general are handled in the field of AI. Russell and Norvig, chapters 3 (for general background) and 5 (for two player games) are pretty accessible and well-written. You'll see two basic ideas: One, that you're basically performing a huge search in a tree looking for a win, and two, that for any non-trivial game, the trees are too large, so you search to a certain depth and then cop out with a "board evaluation function" and look for one of those. I think your third bullet point is in this vein.
The board evaluation function is the magic, and probably a good candidate for using either a genetic algorithm, or a genetic program, either of which might be used in conjunction with a neural network. The basic idea is that you are trying to design (or evolve, actually) a function that takes as input a board position, and outputs a single number. Large numbers correspond to strong positions, and small numbers to weak positions. There is a famous paper by Chellapilla and Fogel showing how to do this for a game of Checkers:
http://library.natural-selection.com/Library/1999/Evolving_NN_Checkers.pdf
I think that's a great paper, tying three great strands of AI together: Adversarial search, genetic algorithms, and neural networks. It should give you some inspiration about how to represent your board, how to think about board evaluations, etc.
Be warned, though, that what you're trying to do is substantially more complex than Chellapilla and Fogel's work. That's okay-- it's 13 years later, after all, and you'll be at this for a while. You're still going to have a problem representing the board, because the AI player has imperfect knowledge of its opponent's state; initially, nothing is known but positions, but eventually as pieces are eliminated in conflict, one can start using First Order Logic or related techniques to start narrowing down individual pieces, and possibly even probabilistic methods to infer information about the whole set. (Some of these may be beyond the scope of an undergrad project.)
The fact you are having problems coming up with a representation for an actual strategy is not that surprising. In fact I would argue that it is the most challenging part of what you are attempting. Unfortunately, I haven't heard of Stratego so being a bit lazy I am going to assume you said chess.
The trouble is that a chess strategy is rather a complex thing. You suggest in your answer containing lots of transitions between board positions in the GA, but a chess board has more possible positions than the number of atoms in the universe this is clearly not going to work very well. What you will likely need to do is encode in the GA a series of weights/parameters that are attached to something that takes in the board position and fires out a move, I believe this is what you are hinting at in your second suggestion.
Probably the simplest suggestion would be to use some sort of generic function approximation like a neural network; Perceptrons or Radial Basis Functions are two possibilities. You can encode weights for the various nodes into the GA, although there are other fairly sound ways to train a neural network, see Backpropagation. You could perhaps encode the network structure instead/as well, this also has the advantage that I am pretty sure a fair amount of research has been done into developing neural networks with a genetic algorithm so you wouldn't be starting completely from scratch.
You still need to come up with how you are going to present the board to the neural network and interpret the result from it. Especially, with chess you would have to take note that a lot of moves will be illegal. It would be very beneficial if you could encode the board and interpret the result such that only legal moves are presented. I would suggest implementing the mechanics of the system and then playing around with different board representations to see what gives good results. A few ideas top of the head ideas to get you started could be, although I am not really convinced any of them are especially great ways to do this:
A bit string with all 64 squares one after another with a number presenting what is present in each square. Most obvious, but probably a rather bad representation as a lot of work will be required to filter out illegal moves.
A bit string with all 64 squares one after another with a number presenting what can move to each square. This has the advantage of embodying the covering concept of chess where you what to gain as much coverage of the board with your pieces as possible, but still has problems with illegal moves and dealing with friendly/enemy pieces.
A bit string with all 32 pieces one after another with a number presenting the location of that piece in each square.
In general though I would suggest that chess is rather a complex game to start with, I think it will be rather hard to get something playing to standard which is noticeably better than random. I don't know if Stratego is any simpler, but I would strongly suggest you opt for a fairly simple game. This will let you focus on getting the mechanics of the implementation correct and the representation of the game state.
Anyway hope that is of some help to you.
EDIT: As a quick addition it is worth looking into how standard chess AI's work, I believe most use some sort of Minimax system.
When you say "tactic", do you mean you want the GA to give you a general algorithm to play the game (i.e. evolve an AI) or do you want the game to use a GA to search the space of possible moves to generate a move at each turn?
If you want to do the former, then look into using Genetic programming (GP). You could try to use it to produce the best AI you can for a fixed tree size. JGAP already comes with support for GP as well. See the JGAP Robocode example for an instance of this. This approach does mean you need a domain specific language for a Stratego AI, so you'll need to think carefully how you expose the board and pieces to it.
Using GP means your fitness function can just be how well the AI does at a fixed number of pre-programmed games, but that requires a good AI player to start with (or a very patient human).
#DonAndre's answer is absolutely correct for movement. In general, problems involving state-based decisions are hard to model with GAs, requiring some form of GP (either explicit or, as #DonAndre suggested, trees that are essentially declarative programs).
A general Stratego player seems to me quite challenging, but if you have a reasonable Stratego playing program, "Setting up your Stratego board" would be an excellent GA problem. The initial positions of your pieces would be the phenotype and the outcome of the external Stratego-playing code would be the fitness. It is intuitively likely that random setups would be disadvantaged versus setups that have a few "good ideas" and that small "good ideas" could be combined into fitter-and-fitter setups.
...
On the general problem of what a decision tree, even trying to come up with a simple example, I kept finding it hard to come up with a small enough example, but maybe in the case where you are evaluation whether to attack a same-ranked piece (which, IIRC destroys both you and the other piece?):
double locationNeed = aVeryComplexDecisionTree();
if(thatRank == thisRank){
double sacrificeWillingness = SACRIFICE_GENETIC_BASE; //Assume range 0.0 - 1.0
double sacrificeNeed = anotherComplexTree(); //0.0 - 1.0
double sacrificeInContext = sacrificeNeed * SACRIFICE_NEED_GENETIC_DISCOUNT; //0.0 - 1.0
if(sacrificeInContext > sacrificeNeed){
...OK, this piece is "willing" to sacrifice itself
One way or the other, the basic idea is that you'd still have a lot of coding of Stratego-play, you'd just be seeking places where you could insert parameters that would change the outcome. Here I had the idea of a "base" disposition to sacrifice itself (presumably higher in common pieces) and a "discount" genetically-determined parameter that would weight whether the piece would "accept or reject" the need for a sacrifice.
Questions
I want to classify/categorize/cluster/group together a set of several thousand websites. There's data that we can train on, so we can do supervised learning, but it's not data that we've gathered and we're not adamant about using it -- so we're also considering unsupervised learning.
What features can I use in a machine learning algorithm to deal with multilingual data? Note that some of these languages might not have been dealt with in the Natural Language Processing field.
If I were to use an unsupervised learning algorithm, should I just partition the data by language and deal with each language differently? Different languages might have different relevant categories (or not, depending on your psycholinguistic theoretical tendencies), which might affect the decision to partition.
I was thinking of using decision trees, or maybe Support Vector Machines (SVMs) to allow for more features (from my understanding of them). This post suggests random forests instead of SVMs. Any thoughts?
Pragmatical approaches are welcome! (Theoretical ones, too, but those might be saved for later fun.)
Some context
We are trying to classify a corpus of many thousands of websites in 3 to 5 languages (maybe up to 10, but we're not sure).
We have training data in the form of hundreds of websites already classified. However, we may choose to use that data set or not -- if other categories make more sense, we're open to not using the training data that we have, since it is not something we gathered in the first place. We are on the final stages of scraping data/text from websites.
Now we must decide on the issues above. I have done some work with the Brown Corpus and the Brill tagger, but this will not work because of the multiple-languages issue.
We intend to use the Orange machine learning package.
According to the context you have provided, this is a supervised learning problem.
Therefore, you are doing classification, not clustering. If I misunderstood, please update your question to say so.
I would start with the simplest features, namely tokenize the unicode text of the pages, and use a dictionary to translate every new token to a number, and simply consider the existence of a token as a feature.
Next, I would use the simplest algorithm I can - I tend to go with Naive Bayes, but if you have an easy way to run SVM this is also nice.
Compare your results with some baseline - say assigning the most frequent class to all the pages.
Is the simplest approach good enough? If not, start iterating over algorithms and features.
If you go the supervised route, then the fact that the web pages are in multiple languages shouldn't make a difference. If you go with, say lexical features (bag-o'-words style) then each language will end up yielding disjoint sets of features, but that's okay. All of the standard algorithms will likely give comparable results, so just pick one and go with it. I agree with Yuval that Naive Bayes is a good place to start, and only if that doesn't meet your needs that try something like SVMs or random forests.
If you go the unsupervised route, though, the fact that the texts aren't all in the same language might be a big problem. Any reasonable clustering algorithm will first group the texts by language, and then within each language cluster by something like topic (if you're using content words as features). Whether that's a bug or a feature will depend entirely on why you want to classify these texts. If the point is to group documents by topic, irrespective of language, then it's no good. But if you're okay with having different categories for each language, then yeah, you've just got as many separate classification problems as you have languages.
If you do want a unified set of classes, then you'll need some way to link similar documents across languages. Are there any documents in more that one language? If so, you could use them as a kind of statistical Rosetta Stone, to link words in different languages. Then, using something like Latent Semantic Analysis, you could extend that to second-order relations: words in different languages that don't ever occur in the same document, but which tend to co-occur with words which do. Or maybe you could use something like anchor text or properties of the URLs to assign a rough classification to documents in a language-independent manner and use that as a way to get started.
But, honestly, it seems strange to go into a classification problem without a clear idea of what the classes are (or at least what would count as a good classification). Coming up with the classes is the hard part, and it's the part that'll determine whether the project is a success or failure. The actual algorithmic part is fairly rote.
Main answer is: try different approaches. Without actual testing it's very hard to predict what method will give best results. So, I'll just suggest some methods that I would try first and describe their pros and cons.
First of all, I would recommend supervised learning. Even if the data classification is not very accurate, it may still give better results than unsupervised clustering. One of the reasons for it is a number of random factors that are used during clustering. For example, k-means algorithm relies on randomly selected points when starting the process, which can lead to a very different results for different program runnings (though x-means modifications seems to normalize this behavior). Clustering will give good results only if underlying elements produce well separated areas in the feature space.
One of approaches to treating multilingual data is to use multilingual resources as support points. For example, you can index some Wikipedia's articles and create "bridges" between same topics in different languages. Alternatively, you can create multilingual association dictionary like this paper describes.
As for methods, the first thing that comes to mind is instance-based semantic methods like LSI. It uses vector space model to calculate distance between words and/or documents. In contrast to other methods it can efficiently treat synonymy and polysemy. Disadvantage of this method is a computational inefficiency and leak of implementations. One of the phases of LSI makes use of a very big cooccurrence matrix, which for large corpus of documents will require distributed computing and other special treatment. There's modification of LSA called Random Indexing which do not construct full coocurrence matrix, but you'll hardly find appropriate implementation for it. Some time ago I created library in Clojure for this method, but it is pre-alpha now, so I can't recommend using it. Nevertheless, if you decide to give it a try, you can find project 'Clinch' of a user 'faithlessfriend' on github (I'll not post direct link to avoid unnecessary advertisement).
Beyond special semantic methods the rule "simplicity first" must be used. From this point, Naive Bayes is a right point to start from. The only note here is that multinomial version of Naive Bayes is preferable: my experience tells that count of words really does matter.
SVM is a technique for classifying linearly separable data, and text data is almost always not linearly separable (at least several common words appear in any pair of documents). It doesn't mean, that SVM cannot be used for text classification - you still should try it, but results may be much lower than for other machine learning tasks.
I haven't enough experience with decision trees, but using it for efficient text classification seems strange to me. I have seen some examples where they gave excellent results, but when I tried to use C4.5 algorithm for this task, the results were terrible. I believe you should get some software where decision trees are implemented and test them by yourself. It is always better to know then to suggest.
There's much more to say on every topic, so feel free to ask more questions on specific topic.
I am running a physics simulation and applying a set of movement instructions to a simulated skeleton. I have a multiple sets of instructions for the skeleton consisting of force application to legs, arms, torso etc. and duration of force applied to their respective bone. Each set of instructions (behavior) is developed by testing its effectiveness performing the desired behavior, and then modifying the behavior with a genetic algorithm with other similar behaviors, and testing it again. The skeleton will have an array behaviors in its set list.
I have fitness functions which test for stability, speed, minimization of entropy and force on joints. The problem is that any given behavior will work for a specific context. One behavior works on flat ground, another works if there is a bump in front of the right foot, another if it's in front of the left, and so on. So the fitness of each behavior varies based on the context. Picking a behavior simply on its previous fitness level won't work because that fitness score doesn't apply to this context.
My question is, how do I program to have the skeleton pick the best behavior for the context? Such as picking the best walking behavior for a randomized bumpy terrain.
In a different answer I've given to this question, I assumed that the "terrain" information you have for your model was very approximate and large-grained, e.g., "smooth and flat", "rough", "rocky", etc. and perhaps only at a grid level. However, if the world model is in fact very detailed, such as from a simulated version of a 3-D laser range scanner, then algorithmic and computational path/motion planning approaches from robotics are likely to be more useful than a machine-learning classifier system.
PATH/MOTION PLANNING METHODS
There are a fairly large number of path and motion planning methods, including some perhaps more suited to walking/locomotion, but a few of the more general ones worth mentioning are:
Visibility graphs
Potential Fields
Sampling-based methods
The general solution approach would be use a path planning method to determine the walking trajectory that your skeleton should follow to avoid obstacles, and then use your GA-based controller to achieve the appropriate motion. This is very much at the core of robotics: sense the world and determine actions and motor control required to achieve some goal(s).
Also, a quick literature search turned up the following papers and a book as a source of ideas and starting points for further investigation. The paper on legged robot motion planning may be especially useful as it discusses several motion planning strategies.
Reading Suggestions
Steven Michael LaValle (2006). Planning Algorithms, Cambridge University Press.
Kris Hauser, Timothy Bretl, Jean-Claude Latombe, Kensuke Harada, Brian Wilcox (2008). "Motion Planning for Legged Robots on Varied Terrain", The International Journal of Robotics Research, Vol. 27, No. 11-12, 1325-1349,
DOI: 10.1177/0278364908098447
Guilherme N. DeSouza and Avinash C. Kak (2002). "Vision for Mobile Robot Navigation: A Survey", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 2, February, pp 237-267.
Why not test the behaviors against a randomized bumpy terrain? Just set the parameters of the GA so that it's a little forgiving, and won't condemn a behavior for one or two failures.
You have two problems:
Bipedal locomotion without senses is very difficult. I've seen good robotic locomotion over rough terrain without senses, but never with only two legs. So the best solution you can possibly find this way might not be very good.
Running a GA is as much art as science. There are a lot of knobs you can turn, and it's hard to find parameters that will allow novelty to grow without drowning it in noise.
Starting simple (e.g. crawling) will help with both of these.
EDIT:
Wait... you're training it over and over on the same randomized terrain? Well no wonder you're having trouble! It's optimizing for that particular layout of rocks and bumps, which is much easier than generalizing. Depending on how your GA works, you might get some benefit from making the course really long, but a better solution is to randomize the terrain for every pass. When it can no longer exploit specific features of the terrain, it will have an evolutionary incentive to generalize. Since this is a more difficult problem it will not learn as quickly as it did before, and it might not be able to get very good at all with its current parameters; be prepared to tinker.
There are three aspects to my answer: (1) control theory, (2) sensing, and (3) merging sensing and action.
CONTROL THEORY
The answer to your problem depends partially on what kind of control scheme you are using: is it feed-forward or feedback control? If the latter, what simulated real-time sensors do you have other than terrain information?
Simply having terrain information and incorporating it into your control strategy would not mean you are using feedback control. It is possible to use such information to select a feed-forward strategy, which seems closest to the problem that you have described.
SENSING
Whether you are using feed-forward or feedback control, you need to represent the terrain information and any other sensory data as an input space for your control system. Part of training your GA-based motion controller should be moving your skeleton through a broad range of random terrain in order to learn feature detectors. The feature detectors classify the terrain scenarios by segmenting the input space into regions critical to deciding what is the best action policy, i.e., what control behavior to employ.
How to best represent the input space depends on the level of granularity of the terrain information you have for your simulation. If it's just a discrete space of terrain type and/or obstacles in some grid space, you may be able to present it directly to your GA without transformation. If, however, the data is in a continuous space such as terrain type and obstacles at arbitrary range/direction, you may need to transform it into a space from which it may be easier to infer spatial relationships, such as coarse-coded range and direction, e.g., near, mid, far and forward, left-forward, left, etc. Gaussian and fuzzy classifiers can be useful for the latter approach, but discrete-valued coding can also work.
MERGING SENSING AND ACTION
Using one of the input-space-encoding approaches above, you have a few options for how to connect behavior selection search space and motion control search space:
Separate the two spaces into two learning problems and use a separate GA to evolve the parameters of a standard multi-layer perceptron neural network. The latter would have your sensor data (perhaps transformed) as inputs and your set of skeleton behaviors as outputs. Instead of using back-propagation or some other ANN-learning method to learn the network weights, your GA could use some fitness function to evolve the parameters over a series of simulated trials, e.g., fitness = distance traveled in a fixed time period toward point B starting from point A. This should evolve over successive generations from completely random selection of behaviors to something more coordinated and useful.
Merge the two search spaces (behavior selection and skeleton motor control) by linking a multi-layer perceptron network as described in (1) above into the existing GA-based controller framework that you have, using the skeleton behavior set as the linkage. The parameter space that will be evolved will be both the neural network weights and whatever your existing controller parameter space is. Assuming that you are using a multi-objective genetic algorithm, such as the NSGA-II algorithm, (since you have multiple fitness functions), the fitness functions would be stability, speed, minimization of entropy, force on joints, etc, plus some fitness function(s) targeted at learning the behavior-selection policy, e.g., distance moved toward point B starting from point A in a fixed time period.
The difference between this approach and (1) above is that you may be able to learn both better coordination of behaviors and finer-grain motor control since the parameter space is likely to be better explored when the two problems are merged as opposed to being separate. The downside is that it may take much longer to converge on reasonable parameter solutions(s), and not all aspects of motor control may be learned as well as they would if the two learning problems were kept separate.
Given that you already have working evolved solutions for the motor control problem, you are probably better off using approach (1) to learn the behavior-selection model with a separate GA. Also, there are many alternatives to the hybrid GA-ANN scheme I described above for learning the latter model, including not learning a model at all and instead using a path planning algorithm as described in a separate answer from me. I simply offered this approach since you are already familiar with GA-based machine learning.
The action selection problem is a robust area of research in both machine learning and autonomous robotics. It's probably well-worth reading up on this topic in itself to gain better perspective and insight into your current problem, and you may be able to devise a simpler strategy than anything I've suggested so far by viewing your problem through the lens of this paradigm.
You're using a genetic algorithm to modify the behavior, so that must mean you have devised a fitness function for each combination of factors. Is that your question?
If yes, the answer depends on what metrics you use to define best walking behavior:
Maximize stability
Maximize speed
Minimize forces on joints
Minimize energy or entropy production
Or do you just try a bunch of parameters, record the values, and then let the genetic algorithm drive you to the best solution?
If each behavior works well in one context and not another, I'd try quantifying how to sense and interpolate between contexts and blend the strategies to see if that would help.
It sounds like at this point you have just a classification problem. You want to map some knowledge about what you are currently walking on to one of a set of classes. Knowing the class of the terrain allows you to then invoke the proper subroutine. Is this correct?
If so, then there are a wide array of classification engines that you can use including neural networks, Bayesian networks, decision trees, nearest neighbor, etc. In order to pick the best fit, we will need more information about your problem.
First, what kind of input or sensory data do you have available to help you identify the behavior class you should invoke? Second, can you describe the circumstances in which you will be training this classifier and what the circumstances are during runtime when you deploy it, such as any limits on computational resources or requirements of robustness to noise?
EDIT: Since you have a fixed number of classes, and you have some parameterized model for generating all possible terrains, I would consider using k-means clustering. The principle is as follows. You cluster a whole bunch of terrains into k different classes, where each cluster is associated with one of your specialized subroutines that performs best for that cluster of terrains. Then when a new terrain comes in, it will probably fall near one of these clusters. You then invoke the corresponding specialized subroutine to navigate that terrain.
Do this offline: Generate enough random terrains to sufficiently sample the parameter space, map these terrains to your sensory space (but remember which points in sensory space correspond to which terrains), and then run k-means clustering on this sensory space corpus where k is the number of classes you want to learn. Your distance function between a class representative C and a point P in sensory space would be simply the fitness function of letting algorithm C navigate the terrain that generated P. You would then get a partitioning of your sensory space into k clusters, each cluster mapping to the best subroutine that you've got. Each cluster will have a representative point in sensory space.
Now during runtime: You will get some unlabeled point in sensory space. Use a different distance function to find the closest representative point to this new incoming point. That tells you what class the terrain is.
Note that the success of this method depends on the quality of the mapping from the parameter space of terrain generation to sensory space, from sensory space to your fitness functions, and the eventual distance function you use to compare points in sensory space.
Note also that if you had enough memory, instead of only using the k representative sensory points to tell you which class an unlabeled sensory point belongs to, you might go through your training set and label all points with the learned class. Then during runtime you pick the nearest neighbor, and conclude that your unlabeled point in sensory space is in the same class as that neighbor.
I want to embed 3 NP-Complete problems(2 of them are known to be NP-Complete, 1 of them is my own idea). I saw "this question" and got idea about reinterpreting embedding problems in theory:
The Waiter is The Thief.
Tables are store.
Foods are valued items which has different weight.
Thief know all the items' price and weight before the robbery.
His target is most efficient robbery(max capacity of knapsack used, most valued items got) with robbing(getting at least 1 item) all stores(shortest way to completing robbery tour, also visit each store 1 time).
This part is embedding 2 NP-Complete problems.
My idea is, that more items mean more bag weight. More bag weight slow downs the thief exponentially. So another target of the thief should be finishing the robbery as quickly as he/she can.
At this time, I'm not sure that my idea is actually NP-Complete. Maybe, "gravity" is not a NP-Complete Problem alone. But maybe it is in this context of the travelling salesman and knapsack problem.
So my questions are:
Is the slowing down of the thief NP-complete, too?
Is it possible to reduce those three embedded problems to a simple NP-complete problem?
Okay, that was just a bit tough to follow, but I think I'm getting the gist.
The XKCD cartoon is showing you how easy it is to make a real-life problem NP-complete. (Of course, since most menus have a small number of items and a uniform set of prices, it's also easy to show that there is a trivial answer.)
The idea of "embedding" an NP-complete problem I think you're referring to is finding a poly-time reduction; I've written that up pretty completely elsewhere on SO.
This is a bit confusing, but here are my answers to some possible questions.
The combination of two NP-complete problems is going to be NP-complete. In fact, the combination of an NP-complete problem with any other problem is going to be NP-complete.
I don't see how to evaluate whether the gravity problem is NP-complete on its own, because it isn't on its own. If the time between points depends on distance as well as backpack weight, then it's NP-complete because it's part of the Traveling Salesman problem. If it doesn't, then the right solution is to pick up objects lightest to heaviest.
The combined problem is a combination of two problems (which objects to steal, and which route to take), and doesn't look any more interesting to me than the two separately, since you can solve one without worrying about the other. Adding weight-dependent delays can couple the problems so they aren't independent, but you need an evaluation function other than how fast you can commit the optimum theft (the optimum theft is its own problem, and then it's just a modified TSP).
Nor are you going to be able to take problems, couple them, complicate them, and then make a simpler problem in general.
I honestly have no idea what you are asking. But it might be you are asking how you prove one problem NP-complete by converting it to another NP-complete problem.
The answer to this is that you write an algorithm which runs in polynomial time to convert the problem to a known NP-complete problem, then another polynomial algorithm to convert the solution back.
For more details read a decent textbook or see the Wikipedia page.
Thanks for helpful comments, Cartoon gave me an idea about embedding problems. I was bit hurry when I was writing, so I did many writing mistakes. My main language is not English too, so editors made my question more understandable. I also want another comments to more brainstorming.
Charlie Martin, thanks for your link.
The cost of carrying extra weight is not a problem by itself, but rather a parameterization of the edge-weights of your Traveling Salesman Problem.
The decision version of this problem is still NP-complete, because a) we can still check quickly if a given tour has cost less than k so it is in NP, and b) Hamiltonian Cycle still reduces to our TSP with carrying costs (since we just set all edge weights to 1 and all carrying costs to 0 in the reduction).
In other words, the carrying costs just made our TSP harder, so it is still NP-hard (and can be used to solve any NP-complete problem), but it did not make it hard enough that we cannot quickly check a proposed solution to the decision problem: "Does this tour cost less than c?", so it is still NP-complete.
The (NP-complete) decision version of the Knapsack problem is independent, and can be solved sequentially with the TSP problem.
So the entire problem is NP-complete, and if we reduce the TSP problem and the Knapsack
problems to SAT (reduction normally isn't done in this direction but it is theoretically possible), then we can encode the two together as one SAT instance.