Implementing Simulated Annealing - artificial-intelligence

I think I understand the basic concept of simulated annealing. It's basically adding random solutions to cover a better area of the search space at the beginning then slowly reducing the randomness as the algorithm continues running.
I'm a little confused on how I would implement this into my genetic algorithm.
Can anyone give me a simple explanation of what I need to do and clarify that my understand of how simulated annealing works is correct?

When constructing a new generation of individuals in a genetic algorithm, there are three random aspects to it:
Matching parent individuals to parent individuals, with preference according to their proportional fitness,
Choosing the crossover point, and,
Mutating the offspring.
There's not much you can do about the second one, since that's typically a uniform random distribution. You could conceivably try to add some random factor to the roulette wheel as you're selecting your parent individuals, and then slowly decrease that random function. But that goes against the spirit of the genetic algorithm and (more importantly) I don't think it will do much good. I think it would hurt, actually.
That leaves the third factor-- change the mutation rate from high mutation to low mutation as the generations go by.
It's really not any more complicated than that.

Related

What are the differences between Monte Carlo and Markov chains techniques?

I want to develop RISK board game, which will include an AI for computer players. Moreovor, I read two articles, this and this, about it, and I realised that I must learn about Monte Carlo simulation and Markov chains techniques. And I thought that I have to use these techniques together, but I guess they are different techniques relevant to calculate probabilities about transition states.
So, could anyone explain what are the important differences and advantages and disadvantages between them?
Finally, which way you will prefer if you would implement an AI for RISK game?
Here you can find simple determined probabilities about outcomes of a battle in risk board game, and the brute force algorithm used. There is a tree diagram to which specifies all possible states. Should I use Monte Carlo or Markov chain on this tree?
Okay, so I skimmed the articles to get a sense of what they were doing. Here is my overview of the terms you asked about:
A Markov Chain is simply a model of how your system moves from state to state. Developing a Markov model from scratch can sometimes be difficult, but once you have one in hand, they're relatively easy to use, and relatively easy to understand. The basic idea is that your game will have certain states associated with it; that as part of the game, you will move from state to state; and, critically, that this movement from state to state happens based on probability and that you know these probabilities.
Once you have that information, you can represent it all as a graph, where nodes are states and arcs between states (labelled with probabilities) are transitions. You can also represent as a matrix that satisfies certain constraints, or several other more exotic data structures.
The short article is actually all about the Markov Chain approach, but-- and this is important-- it is using that approach only as a quick means of estimating what will happen if army A attacks a territory with army B defending it. It's a nice use of the technique, but it is not an AI for Risk, it is merely a module in the AI helping to figure out the likely results of an attack.
Monte Carlo techniques, by contrast, are estimators. Once you have a model of something, whether a Markov model or any other, you often find yourself in the position of wanting to estimate something about it. (Often it happens to be something that you can, if you squint hard enough, put into the form of an integral of something that happens to be very hard to work with in that format.) Monte Carlo techniques just sample randomly and aggregate the results into an estimate of what's going to happen.
Monte Carlo techniques are not, in my opinion, AI techniques. They are a very general purpose technique that happen to be useful for AI, but they are not AI per se. (You can say the same thing about Markov models, but the statement is weaker because Markov models are so extremely useful for planning in AI, that entire philosophies of AI are built around the technique. Markov models are used elsewhere, though, as well.)
So that is what they are. You also asked which one I would use if I had to implement a Risk AI. Well, neither of those is going to be sufficient. Monte Carlo, as I said, is not an AI technique, it's a general math tool. And Markov Models, while they could in theory represent the entirety of a game of Risk, are going to end up being very unwieldy: You would need to represent every state of the game, meaning every possible configuration of armies in territories and every possible configuration of cards in hands, etc. (I am glossing over many details, here: There are a lot of other difficulties with this approach.)
The core of Wolf's thesis is neither the Markov approach nor the Monte Carlo approach, it is actually what he describes as the evaluation function. This is the heart of the AI problem: How to figure out what action is best. The Monte Carlo method in Blatt's paper describes a method to figure out what the expected result of an action is, but that is not the same as figuring out what action is best. Moreover, there's a low key statement in Wolf's thesis about look-ahead being hard to perform in Risk because the game trees become so very large, which is what led him (I think) to focus so much on the evaluation function.
So my real advice would be this: Read up on search-tree approaches, like minimax, alpha-beta pruning and especially expecti-minimax. You can find good treatments of these early in Russell and Norvig, or even on Wikipedia. Try to understand why these techniques work in general, but are cumbersome for Risk. That will lead you to some discussion of board evaluation techniques. Then go back and look at Wolf's thesis, focusing on his action evaluation function. And finally, focus on the way he tries to automatically learn a good evaluation function.
That is a lot of work. But Risk is not an easy game to develop an AI for.
(If you just want to figure out the expected results of a given attack, though, I'd say go for Monte Carlo. It's clean, very intuitive to understand, and very easy to implement. The only difficult-- and it's not a big one-- is making sure you run enough trials to get a good result.)
Markov chains are simply a set of transitions and their probabilities, assuming no memory of past events.
Monte Carlo simulations are repeated samplings of random walks over a set of probabilities.
You can use both together by using a Markov chain to model your probabilities and then a Monte Carlo simulation to examine the expected outcomes.
For Risk I don't think I would use Markov chains because I don't see an advantage. A Monte Carlo analysis of your possible moves could help you come up with a pretty good AI in conjunction with a suitable fitness function.
I know this glossed over a lot but I hope it helped.

How do I pick a good representation for a board game tactic for a genetic algorithm?

For my bachelor's thesis I want to write a genetic algorithm that learns to play the game of Stratego (if you don't know this game, it's probably safe to assume I said chess). I haven't ever before done actual AI projects, so it's an eye-opener to see how little I actually know of implementing things.
The thing I'm stuck with is coming up with a good representation for an actual strategy. I'm probably making some thinking error, but some problems I encounter:
I don't assume you would have a representation containing a lot of
transitions between board positions, since that would just be
bruteforcing it, right?
What could branches of a decision tree look
like? Any representation I come up with don't have interchangeable
branches... If I were to use a bit string, which is apparently also
common, what would the bits represent?
Do I assign scores to the distance between certain pieces? How would I represent that?
I think I ought to know these things after three+ years of study, so I feel pretty stupid - this must look likeI have no clue at all. Still, any help or tips on what to Google would be appreciated!
I think, you could define a decision model and then try to optimize the parameters of that model. You can create multi-stage decision models also. I once did something similar for solving a dynamic dial-a-ride problem (paper here) by modeling it as a two stage linear decision problem. To give you an example, you could:
For each of your figures decide which one is to move next. Each figure is characterized by certain features derived from its position on the board, e.g. ability to make a score, danger, protecting x other figures, and so on. Each of these features can be combined (e.g. in a linear model, through a neural network, through a symbolic expression tree, a decision tree, ...) and give you a rank on which figure to act next with.
Acting with the figure you selected. Again there are a certain number of actions that can be taken, each has certain features. Again you can combine and rank them and one action will have the highest priority. This is the one you choose to perform.
The features you extract can be very simple or insanely complex, it's up to what you think will work best vs what takes how long to compute.
To evaluate and improve the quality of your decision model you can then simulate these decisions in several games against opponents and train the parameters of the model that combines these features to rank the moves (e.g. using a GA). This way you tune the model to win as many games as possible against the specified opponents. You can test the generality of that model by playing against opponents it has not seen before.
As Mathew Hall just said, you can use GP for this (if your model is a complex rule), but this is just one kind of model. In my case a linear combination of the weights did very well.
Btw, if you're interested we've also got a software on heuristic optimization which provides you with GA, GP and that stuff. It's called HeuristicLab. It's GPL and open source, but comes with a GUI (Windows). We've some Howto on how to evaluate the fitness function in an external program (data exchange using protocol buffers), so you can work on your simulation and your decision model and let the algorithms present in HeuristicLab optimize your parameters.
Vincent,
First, don't feel stupid. You've been (I infer) studying basic computer science for three years; now you're applying those basic techniques to something pretty specialized-- a particular application (Stratego) in a narrow field (artificial intelligence.)
Second, make sure your advisor fully understands the rules of Stratego. Stratego is played on a larger board, with more pieces (and more types of pieces) than chess. This gives it a vastly larger space of legal positions, and a vastly larger space of legal moves. It is also a game of hidden information, increasing the difficulty yet again. Your advisor may want to limit the scope of the project, e.g., concentrate on a variant with full observation. I don't know why you think this is simpler, except that the moves of the pieces are a little simpler.
Third, I think the right thing to do at first is to take a look at how games in general are handled in the field of AI. Russell and Norvig, chapters 3 (for general background) and 5 (for two player games) are pretty accessible and well-written. You'll see two basic ideas: One, that you're basically performing a huge search in a tree looking for a win, and two, that for any non-trivial game, the trees are too large, so you search to a certain depth and then cop out with a "board evaluation function" and look for one of those. I think your third bullet point is in this vein.
The board evaluation function is the magic, and probably a good candidate for using either a genetic algorithm, or a genetic program, either of which might be used in conjunction with a neural network. The basic idea is that you are trying to design (or evolve, actually) a function that takes as input a board position, and outputs a single number. Large numbers correspond to strong positions, and small numbers to weak positions. There is a famous paper by Chellapilla and Fogel showing how to do this for a game of Checkers:
http://library.natural-selection.com/Library/1999/Evolving_NN_Checkers.pdf
I think that's a great paper, tying three great strands of AI together: Adversarial search, genetic algorithms, and neural networks. It should give you some inspiration about how to represent your board, how to think about board evaluations, etc.
Be warned, though, that what you're trying to do is substantially more complex than Chellapilla and Fogel's work. That's okay-- it's 13 years later, after all, and you'll be at this for a while. You're still going to have a problem representing the board, because the AI player has imperfect knowledge of its opponent's state; initially, nothing is known but positions, but eventually as pieces are eliminated in conflict, one can start using First Order Logic or related techniques to start narrowing down individual pieces, and possibly even probabilistic methods to infer information about the whole set. (Some of these may be beyond the scope of an undergrad project.)
The fact you are having problems coming up with a representation for an actual strategy is not that surprising. In fact I would argue that it is the most challenging part of what you are attempting. Unfortunately, I haven't heard of Stratego so being a bit lazy I am going to assume you said chess.
The trouble is that a chess strategy is rather a complex thing. You suggest in your answer containing lots of transitions between board positions in the GA, but a chess board has more possible positions than the number of atoms in the universe this is clearly not going to work very well. What you will likely need to do is encode in the GA a series of weights/parameters that are attached to something that takes in the board position and fires out a move, I believe this is what you are hinting at in your second suggestion.
Probably the simplest suggestion would be to use some sort of generic function approximation like a neural network; Perceptrons or Radial Basis Functions are two possibilities. You can encode weights for the various nodes into the GA, although there are other fairly sound ways to train a neural network, see Backpropagation. You could perhaps encode the network structure instead/as well, this also has the advantage that I am pretty sure a fair amount of research has been done into developing neural networks with a genetic algorithm so you wouldn't be starting completely from scratch.
You still need to come up with how you are going to present the board to the neural network and interpret the result from it. Especially, with chess you would have to take note that a lot of moves will be illegal. It would be very beneficial if you could encode the board and interpret the result such that only legal moves are presented. I would suggest implementing the mechanics of the system and then playing around with different board representations to see what gives good results. A few ideas top of the head ideas to get you started could be, although I am not really convinced any of them are especially great ways to do this:
A bit string with all 64 squares one after another with a number presenting what is present in each square. Most obvious, but probably a rather bad representation as a lot of work will be required to filter out illegal moves.
A bit string with all 64 squares one after another with a number presenting what can move to each square. This has the advantage of embodying the covering concept of chess where you what to gain as much coverage of the board with your pieces as possible, but still has problems with illegal moves and dealing with friendly/enemy pieces.
A bit string with all 32 pieces one after another with a number presenting the location of that piece in each square.
In general though I would suggest that chess is rather a complex game to start with, I think it will be rather hard to get something playing to standard which is noticeably better than random. I don't know if Stratego is any simpler, but I would strongly suggest you opt for a fairly simple game. This will let you focus on getting the mechanics of the implementation correct and the representation of the game state.
Anyway hope that is of some help to you.
EDIT: As a quick addition it is worth looking into how standard chess AI's work, I believe most use some sort of Minimax system.
When you say "tactic", do you mean you want the GA to give you a general algorithm to play the game (i.e. evolve an AI) or do you want the game to use a GA to search the space of possible moves to generate a move at each turn?
If you want to do the former, then look into using Genetic programming (GP). You could try to use it to produce the best AI you can for a fixed tree size. JGAP already comes with support for GP as well. See the JGAP Robocode example for an instance of this. This approach does mean you need a domain specific language for a Stratego AI, so you'll need to think carefully how you expose the board and pieces to it.
Using GP means your fitness function can just be how well the AI does at a fixed number of pre-programmed games, but that requires a good AI player to start with (or a very patient human).
#DonAndre's answer is absolutely correct for movement. In general, problems involving state-based decisions are hard to model with GAs, requiring some form of GP (either explicit or, as #DonAndre suggested, trees that are essentially declarative programs).
A general Stratego player seems to me quite challenging, but if you have a reasonable Stratego playing program, "Setting up your Stratego board" would be an excellent GA problem. The initial positions of your pieces would be the phenotype and the outcome of the external Stratego-playing code would be the fitness. It is intuitively likely that random setups would be disadvantaged versus setups that have a few "good ideas" and that small "good ideas" could be combined into fitter-and-fitter setups.
...
On the general problem of what a decision tree, even trying to come up with a simple example, I kept finding it hard to come up with a small enough example, but maybe in the case where you are evaluation whether to attack a same-ranked piece (which, IIRC destroys both you and the other piece?):
double locationNeed = aVeryComplexDecisionTree();
if(thatRank == thisRank){
double sacrificeWillingness = SACRIFICE_GENETIC_BASE; //Assume range 0.0 - 1.0
double sacrificeNeed = anotherComplexTree(); //0.0 - 1.0
double sacrificeInContext = sacrificeNeed * SACRIFICE_NEED_GENETIC_DISCOUNT; //0.0 - 1.0
if(sacrificeInContext > sacrificeNeed){
...OK, this piece is "willing" to sacrifice itself
One way or the other, the basic idea is that you'd still have a lot of coding of Stratego-play, you'd just be seeking places where you could insert parameters that would change the outcome. Here I had the idea of a "base" disposition to sacrifice itself (presumably higher in common pieces) and a "discount" genetically-determined parameter that would weight whether the piece would "accept or reject" the need for a sacrifice.

Best way to automate testing of AI algorithms?

I'm wondering how people test artificial intelligence algorithms in an automated fashion.
One example would be for the Turing Test - say there were a number of submissions for a contest. Is there any conceivable way to score candidates in an automated fashion - other than just having humans test them out.
I've also seen some data sets (obscured images of numbers/letters, groups of photos, etc) that can be fed in and learned over time. What good resources are out there for this.
One challenge I see: you don't want an algorithm that tailors itself to the test data over time, since you are trying to see how well it does in the general case. Are there any techniques to ensure it doesn't do this? Such as giving it a random test each time, or averaging its results over a bunch of random tests.
Basically, given a bunch of algorithms, I want some automated process to feed it data and see how well it "learned" it or can predict new stuff it hasn't seen yet.
This is a complex topic - good AI algorithms are generally the ones which can generalize well to "unseen" data. The simplest method is to have two datasets: a training set and an evaluation set used for measuring the performances. But generally, you want to "tune" your algorithm so you may want 3 datasets, one for learning, one for tuning, and one for evaluation. What defines tuning depends on your algorithm, but a typical example is a model where you have a few hyper-parameters (for example parameters in your Bayesian prior under the Bayesian view of learning) that you would like to tune on a separate dataset. The learning procedure would already have set a value for it (or maybe you hardcoded their value), but having enough data may help so that you can tune them separately.
As for making those separate datasets, there are many ways to do so, for example by dividing the data you have available into subsets used for different purposes. There is a tradeoff to be made because you want as much data as possible for training, but you want enough data for evaluation too (assuming you are in the design phase of your new algorithm/product).
A standard method to do so in a systematic way from a known dataset is cross validation.
Generally when it comes to this sort of thing you have two datasets - one large "training set" which you use to build and tune the algorithm, and a separate smaller "probe set" that you use to evaluate its performance.
#Anon has the right of things - training and what I'll call validation sets. That noted, the bits and pieces I see about developments in this field point at two things:
Bayesian Classifiers: there's something like this probably filtering your email. In short you train the algorithm to make a probabilistic decision if a particular item is part of a group or not (e.g. spam and ham).
Multiple Classifiers: this is the approach that the winning group involved in the Netflix challenge took, whereby it's not about optimizing one particular algorithm (e.g. Bayesian, Genetic Programming, Neural Networks, etc..) by combining several to get a better result.
As for data sets Weka has several available. I haven't explored other libraries for data sets, but mloss.org appears to be a good resource. Finally data.gov offers a lot of sets that provide some interesting opportunities.
Training data sets and test sets are very common for K-means and other clustering algorithms, but to have something that's artificially intelligent without supervised learning (which means having a training set) you are building a "brain" so-to-speak based on:
In chess: all possible future states possible from the current gameState.
In most AI-learning (reinforcement learning) you have a problem where the "agent" is trained by doing the game over and over. Basically you ascribe a value to every state. Then you assign an expected value of each possible action at a state.
So say you have S states and a actions per state (although you might have more possible moves in one state, and not as many in another), then you want to figure out the most-valuable states from s to be in, and the most valuable actions to take.
In order to figure out the value of states and their corresponding actions, you have to iterate the game through. Probabilistically, a certain sequence of states will lead to victory or defeat, and basically you learn which states lead to failure and are "bad states". You also learn which ones are more likely to lead to victory, and these are subsequently "good" states. They each get a mathematical value associated, usually as an expected reward.
Reward from second-last state to a winning state: +10
Reward if entering a losing state: -10
So the states that give negative rewards then give negative rewards backwards, to the state that called the second-last state, and then the state that called the third-last state and so-on.
Eventually, you have a mapping of expected reward based on which state you're in, and based on which action you take. You eventually find the "optimal" sequence of steps to take. This is often referred to as an optimal policy.
It is true of the converse that normal courses of actions that you are stepping-through while deriving the optimal policy are called simply policies and you are always implementing a certain "policy" with respect to Q-Learning.
Usually the way of determining the reward is the interesting part. Suppose I reward you for each state-transition that does not lead to failure. Then the value of walking all the states until I terminated is however many increments I made, however many state transitions I had.
If certain states are extremely unvaluable, then loss is easy to avoid because almost all bad states are avoided.
However, you don't want to discourage discovery of new, potentially more-efficient paths that don't follow just this-one-works, so you want to reward and punish the agent in such a way as to ensure "victory" or "keeping the pole balanced" or whatever as long as possible, but you don't want to be stuck at local maxima and minima for efficiency if failure is too painful, so no new, unexplored routes will be tried. (Although there are many approaches in addition to this one).
So when you ask "how do you test AI algorithms" the best part is is that the testing itself is how many "algorithms" are constructed. The algorithm is designed to test a certain course-of-action (policy). It's much more complicated than
"turn left every half mile"
it's more like
"turn left every half mile if I have turned right 3 times and then turned left 2 times and had a quarter in my left pocket to pay fare... etc etc"
It's very precise.
So the testing is usually actually how the A.I. is being programmed. Most models are just probabilistic representations of what is probably good and probably bad. Calculating every possible state is easier for computers (we thought!) because they can focus on one task for very long periods of time and how much they remember is exactly how much RAM you have. However, we learn by affecting neurons in a probabilistic manner, which is why the memristor is such a great discovery -- it's just like a neuron!
You should look at Neural Networks, it's mindblowing. The first time I read about making a "brain" out of a matrix of fake-neuron synaptic connections... A brain that can "remember" basically rocked my universe.
A.I. research is mostly probabilistic because we don't know how to make "thinking" we just know how to imitate our own inner learning process of try, try again.

What are the differences between simulated annealing and genetic algorithms?

What are the relevant differences, in terms of performance and use cases, between simulated annealing (with bean search) and genetic algorithms?
I know that SA can be thought as GA where the population size is only one, but I don't know the key difference between the two.
Also, I am trying to think of a situation where SA will outperform GA or GA will outperform SA. Just one simple example which will help me understand will be enough.
Well strictly speaking, these two things--simulated annealing (SA) and genetic algorithms are neither algorithms nor is their purpose 'data mining'.
Both are meta-heuristics--a couple of levels above 'algorithm' on the abstraction scale. In other words, both terms refer to high-level metaphors--one borrowed from metallurgy and the other from evolutionary biology. In the meta-heuristic taxonomy, SA is a single-state method and GA is a population method (in a sub-class along with PSO, ACO, et al, usually referred to as biologically-inspired meta-heuristics).
These two meta-heuristics are used to solve optimization problems, particularly (though not exclusively) in combinatorial optimization (aka constraint-satisfaction programming). Combinatorial optimization refers to optimization by selecting from among a set of discrete items--in other words, there is no continuous function to minimize. The knapsack problem, traveling salesman problem, cutting stock problem--are all combinatorial optimization problems.
The connection to data mining is that the core of many (most?) supervised Machine Learning (ML) algorithms is the solution of an optimization problem--(Multi-Layer Perceptron and Support Vector Machines for instance).
Any solution technique to solve cap problems, regardless of the algorithm, will consist essentially of these steps (which are typically coded as a single block within a recursive loop):
encode the domain-specific details
in a cost function (it's the
step-wise minimization of the value
returned from this function that
constitutes a 'solution' to the c/o
problem);
evaluate the cost function passing
in an initial 'guess' (to begin
iteration);
based on the value returned from the
cost function, generate a subsequent
candidate solution (or more than
one, depending on the
meta-heuristic) to the cost
function;
evaluate each candidate solution by
passing it in an argument set, to
the cost function;
repeat steps (iii) and (iv) until
either some convergence criterion is
satisfied or a maximum number of
iterations is reached.
Meta-heuristics are directed to step (iii) above; hence, SA and GA differ in how they generate candidate solutions for evaluation by the cost function. In other words, that's the place to look to understand how these two meta-heuristics differ.
Informally, the essence of an algorithm directed to solution of combinatorial optimization is how it handles a candidate solution whose value returned from the cost function is worse than the current best candidate solution (the one that returns the lowest value from the cost function). The simplest way for an optimization algorithm to handle such a candidate solution is to reject it outright--that's what the hill climbing algorithm does. But by doing this, simple hill climbing will always miss a better solution separated from the current solution by a hill. Put another way, a sophisticated optimization algorithm has to include a technique for (temporarily) accepting a candidate solution worse than (i.e., uphill from) the current best solution because an even better solution than the current one might lie along a path through that worse solution.
So how do SA and GA generate candidate solutions?
The essence of SA is usually expressed in terms of the probability that a higher-cost candidate solution will be accepted (the entire expression inside the double parenthesis is an exponent:
p = e((-highCost - lowCost)/temperature)
Or in python:
p = pow(math.e, (-hiCost - loCost) / T)
The 'temperature' term is a variable whose value decays during progress of the optimization--and therefore, the probability that SA will accept a worse solution decreases as iteration number increases.
Put another way, when the algorithm begins iterating, T is very large, which as you can see, causes the algorithm to move to every newly created candidate solution, whether better or worse than the current best solution--i.e., it is doing a random walk in the solution space. As iteration number increases (i.e., as the temperature cools) the algorithm's search of the solution space becomes less permissive, until at T = 0, the behavior is identical to a simple hill-climbing algorithm (i.e., only solutions better than the current best solution are accepted).
Genetic Algorithms are very different. For one thing--and this is a big thing--it generates not a single candidate solution but an entire 'population of them'. It works like this: GA calls the cost function on each member (candidate solution) of the population. It then ranks them, from best to worse, ordered by the value returned from the cost function ('best' has the lowest value). From these ranked values (and their corresponding candidate solutions) the next population is created. New members of the population are created in essentially one of three ways. The first is usually referred to as 'elitism' and in practice usually refers to just taking the highest ranked candidate solutions and passing them straight through--unmodified--to the next generation. The other two ways that new members of the population are usually referred to as 'mutation' and 'crossover'. Mutation usually involves a change in one element in a candidate solution vector from the current population to create a solution vector in the new population, e.g., [4, 5, 1, 0, 2] => [4, 5, 2, 0, 2]. The result of the crossover operation is like what would happen if vectors could have sex--i.e., a new child vector whose elements are comprised of some from each of two parents.
So those are the algorithmic differences between GA and SA. What about the differences in performance?
In practice: (my observations are limited to combinatorial optimization problems) GA nearly always beats SA (returns a lower 'best' return value from the cost function--ie, a value close to the solution space's global minimum), but at a higher computation cost. As far as i am aware, the textbooks and technical publications recite the same conclusion on resolution.
but here's the thing: GA is inherently parallelizable; what's more, it's trivial to do so because the individual "search agents" comprising each population do not need to exchange messages--ie, they work independently of each other. Obviously that means GA computation can be distributed, which means in practice, you can get much better results (closer to the global minimum) and better performance (execution speed).
In what circumstances might SA outperform GA? The general scenario i think would be those optimization problems having a small solution space so that the result from SA and GA are practically the same, yet the execution context (e.g., hundreds of similar problems run in batch mode) favors the faster algorithm (which should always be SA).
It is really difficult to compare the two since they were inspired from different domains..
A Genetic Algorithm maintains a population of possible solutions, and at each step, selects pairs of possible solution, combines them (crossover), and applies some random changes (mutation). The algorithm is based the idea of "survival of the fittest" where the selection process is done according to a fitness criteria (usually in optimization problems it is simply the value of the objective function evaluated using the current solution). The crossover is done in hope that two good solutions, when combined, might give even better solution.
On the other hand, Simulated Annealing only tracks one solution in the space of possible solutions, and at each iteration considers whether to move to a neighboring solution or stay in the current one according to some probabilities (which decays over time). This is different from a heuristic search (say greedy search) in that it doesn't suffer from the problems of local optimum since it can get unstuck from cases where all neighboring solutions are worst the current one.
I'm far from an expert on these algorithms, but I'll try and help out.
I think the biggest difference between the two is the idea of crossover in GA and so any example of a learning task that is better suited to GA than SA is going to hinge on what crossover means in that situation and how it is implemented.
The idea of crossover is that you can meaningfully combine two solutions to produce a better one. I think this only makes sense if the solutions to a problem are structured in some way. I could imagine, for example, in multi-class classification taking two (or many) classifiers that are good at classifying a particular class and combining them by voting to make a much better classifier. Another example might be Genetic Programming, where the solution can be expressed as a tree, but I find it hard to come up with a good example where you could combine two programs to create a better one.
I think it's difficult to come up with a compelling case for one over the other because they really are quite similar algorithms, perhaps having been developed from very different starting points.

Selecting an best target algorithm in arcade/strategy game AI programming

I would just like to know the various AI algorithms or logics used in arcade/strategy games for finding/selecting best target to attack for individual unit.
Because, I had to write an small AI logic, where their will be group of unit were attacked by an various tankers, so i am stuck in getting the better logic or algorithm for selecting an best target for unit to attack onto the tankers.
Data available are:
Tanker position, range, hitpoints, damage.
please anybody know the best suitable algorithm/logic for solving this problem, respond early.
Thanks in advance,
Ramanand.
I'm going to express this in a perspective similar to RPG gamers:
What character would you bring down first in order to strike a crippling blow to the rest of your enemies? It would be common sense to bring down the healers of the party, as they can heal the rest of the team. Once the healers are gone, the team needs to use medicine - which is limited in supply - and once medicine is exhausted, the party is screwed.
Similar logic would apply to the tank program. In your AI, you need to figure out which tanks provide the most strength and support to the user's fleet, and eliminate them first. Don't focus on any other tanks unless they become critical in achieving their goal: Kill the strongest, most useful members of the group first.
So I'm going to break down what I feel is most likely pertains to the attributes of your tanks.
RANGE: Far range tanks can hit from a distance but have weak STRENGTH in their attacks.
TANKER POSITION: Closer tanks are faster tanks, but have less STRENGTH in their attacks. Also low HITPOINTS because they're meant for SPEED, and not for DAMAGE.
TANKER HP: Higher HP means a slower-moving tank, as they're stronger. But they won't be close to the front lines.
DAMAGE: Higher DAMAGE means a STRONGER tank with lots of HP, but SLOWER as well to move.
So if I were you, I'd focus first on the tanks that have the highest HP/strongest attacks, followed by the closest ones, and then worry about the ranged tanks - you can't do anything to them anyway until they move into your attack radius :P
And the algorithm would be pretty simple. if you have a list of tanks in a party, create a custom sort for them (using CompareTo) and sort the tanks by class with the highest possible HP to the top of the list, followed by tanks with their focus being speed, and then range.
And then go through each item in the list. If it is possible to attack Tank(0), attack. If not, go to Tank(1).
The goal is to attack only one opponent at a time and receive fire from at most one enemy at a time (though, preferably, none).
Ideally, you would attack the tanks by remaining behind cover and flanking them with surprise attacks. This allows you to destroy the tanks one at a time, while receiving no or little fire.
If you don't have cover, then you should use the enemy as cover. Move into a position that puts the enemy behind the enemy. This also improves your chance to hit.
You can also use range to reduce fire from multiple enemies. Retreat until you are only within range of one enemy.
If the enemies can all fire on you, you want to attack one target until it is no longer a threat, then move on to the next target. The goal is to reduce the amount of fire that you receive as quickly as possible.
If more than one enemy can fire on you at the same time, and you can choose your target, you should fire at the one that allows you to reduce the most amount of damage for the least cost. Simply divide the hit points by the damage, and attack the one with the smallest result. You should also figure in any other relevant stats. Range probably affects you and the enemy equally, but considering the ability to maneuver out of the way of fire, closer enemies are more harmful and should be given some weight in the calculation.
If moving decreases the likelihood of being hit, then you should keep moving, typically by circling your opponent to stay at their flank.
Team tactics would mostly include flanking and diversions.
What's the ammo situation, and is it possible to miss a stationary target?
Based on your comments it sounds like you already have some adhoc set of rules or heuristics to give you something around 70% success based on your own measures, and you want to optimize this further to get a higher win rate.
As a general solution method I would use a hill-climbing algorithm. Since I don't know the details of your current algorithm that is responsible for the 70% success rate, I can only describe in abstract terms how to adapt hill-climbing to optimize your algorithm.
The general principle of hill-climbing is as follows. Hopefully, a small change in some numeric parameter of your current algorithm would be responsible for a small (hopefully linear) change in the resulting success rate. If this is true then you would first parameterize your current set of rules -- meaning you must decide in your current algorithm which numeric parameters may be tweaked and optimized to achieve a higher success rate. Once you've decided what they are, the learning process is straight-forward. Start with your current algorithm. Generate a variety of new algorithms with slightly tweaked parameters than before, and run your simulations to evaluate the performance of this new set of algorithms. Pick the best one as your next starting point. Repeat this process until the algorithm can't get any better.
If your algorithm is a set of if-then rules (this includes rule-matching systems), and improving the performance involves reordering or restructuring those rules, then you may want to consider genetic algorithms, which is a little more complex. To apply genetic algorithms, it is essential that you define the mutation and crossover operators such that a single application of mutation or crossover results in a small change in the overall performance while a many applications of mutation and crossover results in a large change in the overall performance of your algorithm. I'm not an expert in this field but there should be much that comes up when you google for "genetic algorithms on decision trees". The pitfall to avoid is that if you simply consider swapping branches in a decision tree for the mutation operator, a single application might modify the root of your decision tree, generating a huge performance difference. This typically adds too much noise for a genetic algorithm, so my advice in this approach is to be very careful about the encoding of your operators.
Note that these two methods are very popular AI methods for learning or improving your current algorithm. You would do all of these simulations and learning offline. Then you would simply deploy the resulting, learned algorithm.

Resources