Related
I cannot realy say why, but once YouTube suggested a video about an Genetic Alogirthm to me, well it really flashed me, someone made the google chrome no internet jump&run play alone by an learning AI.
Well since i'm programing plugins for Minecraft i got an idea to make an PvE Based Gamemode with an self learning AI (The Genetic Alogrithm), but right now i'm confused where to start, i can make the Fitness depended on the Kills of the Zombie, or on the damage dealt, but i dont know how i can reproduce this again, somehow i have to control the movement, the shots and so on with the AI, and i got no clue how to do that, i hope someone can help me, and you understand my question.
I think what you're trying to do is far more complex than you might think.
If you really want to train autonomous AI for zombies, you're going to need neural networks. But I think this is far too complex for a PvE game.
If you don't want to use neural networks, you have to set up a handful of parameters that define how the zombie acts, like:
Damage
Speed
Health
But it's illogical to use a genetic algorithm for this - you already know that maxing out these values will return the best zombie, so you might need to create more distinct parameters like:
Speed after hitting a player
Potion effect after hitting a player
If you want to stay to the 3 points named above, then you should create a maximum value - and make the genetic algorithm find the optimal distribution of this value.
That's the main part sorted out, then you want to get started on the genetic algorithm
Generation, generate zombies with random properties
Evaluation, let the zombies play a game, determine their fitness on: damage dealt, kills made, distance travelled
Selection, select the individuals ripe for crossover
Crossover, create offspring
Mutation, modify some values with a chance of x
I'm quite interested in your project. I advise you to start training some zombies on a local server, and then use these trained zombies as a base for the online version - so the first waves of zombies aren't too easy :)
With regards to your comment:
Actually i want to improve the movement and fight skills ov Zombies, meqans that they go back when they attack delay is colding down when enemys are really defensive and so on, and the zombies try to catch some single players when they play aggressive etc, but not sure how to do something like this, i dont know how to control movement with an AI, and when to attack etc, I know its a lot to do, but i'm realy interested in this.
This definitely requires a neural network. A neural network can have x inputs, these must all be environment variables, like:
distance nearest player
speed nearest player
health nearest player
etc. nearest player
its own health
And will compute outputs, which could be:
movement direction
movement speed
hit (true/false)
And you have to evolve the neural network through neuroevolution. You can definitely do this, but heads up; it's hard. Especially with a lot of environment variables.
But read some articles on neural networks, then read some articles on genetic algorithms. Then implement neuroevolution, for example through NeuroEvolution of Augmenting Topologies
I suggest you do some research on genetic algorithms, it looks like you're trying to run before you've learned to walk.
Ideally, if you want the AI to learn how to move, shoot, and other activities, you need to create a fitness function that can score based on all of these things. You then need to figure out at what point you're going to evolve/mutate/mate your AI/s, the product of this should start with the initial score of 0, as you will need to rescore the AI, as there is a possibility it could have taken a step backwards, rather than forwards.
I want to develop a two player game with imperfect information - "Stratego".
The game is "somewhat" like chess but initially we don't know anything about the ranks of the opponent's pieces. When a piece attacks or is attacked by some opponent's piece, their ranks are revealed and the higher rank piece kills/captures the lower rank piece.
More detail on the game can be found here.
I did a little research. I read "Opponent Modeling in Stratego" by J.A. Stankiewicz. But I couldn't find a complete tutorial on how to develop the game. I have successfully developed before a two player game - "Othello" a.k.a. Reversi, and I'm familiar with MINIMAX algorithm and alpha-beta pruning.
I found somewhere that Monte-Carlo Tree Search is also used in developing zero-sum two player games. Can it be used for games like stratego? Can I get a complete tutorial for the same?
Any other tutorial not involving Monte-Carlo Tree Search would also be useful :)
I think MCTS would have a difficult time in Stratego since the initial spreading function is so large while the best play is very dependent on the ground-truth of the game. That is to say, MCTS would, in the best case, give you a play that's statistically good amongst all the possible variations of your opponent's pieces, but the best next move is highly dependent on which particular variation they've chosen.
I'm still developing a solid understanding of MCTS, but it seems to me that MCTS does not do well in games where multi-round deceptive play involving hidden information is important (poker, canonically, but stratego, I would say, also). In such games, you really need to develop a model of the other player(s) situation/strategy and MCTS by its nature is going to give you an answer that is statistically related to all trees, not just the ground-truth tree.
MCTS works fine with games involving large amounts of chance (backgammon and other board games involving dice and many card games) and seems to me an excellent general-purpose solution that could be rapidly adopted to a large number of modern "European-style" board games. (The interesting thing with those is that although they involve "deceptive strategy" they generally involve relatively little hidden information.)
I don't know of any MCTS for incomplete information off the top of my had, and it seems like it would take substantial modification to the algorithm to get it to work.
Even in a very restricted type of Stratego where there are only ten pieces on each side, only two types of piece, and only one of the "stronger" piece, you're still playing one of ten possible actual games. In a full game of stratego there is far more uncertainty than that because of the large number of combinations of starting position, which all look alike.
It seems like you would also have to augment the algorithm to capture "revealed knowledge," as it happens, e.g., in our toy example, every encounter between pieces reveals some information about the enemy position.
It seems like it would be interesting to try, but only for a very restricted Stratego-like problem at first, and with the understanding that off the shelf MCTS is not sufficient and that you'd have to think carefully and deeply about the right extensions to the algorithm.
For my bachelor's thesis I want to write a genetic algorithm that learns to play the game of Stratego (if you don't know this game, it's probably safe to assume I said chess). I haven't ever before done actual AI projects, so it's an eye-opener to see how little I actually know of implementing things.
The thing I'm stuck with is coming up with a good representation for an actual strategy. I'm probably making some thinking error, but some problems I encounter:
I don't assume you would have a representation containing a lot of
transitions between board positions, since that would just be
bruteforcing it, right?
What could branches of a decision tree look
like? Any representation I come up with don't have interchangeable
branches... If I were to use a bit string, which is apparently also
common, what would the bits represent?
Do I assign scores to the distance between certain pieces? How would I represent that?
I think I ought to know these things after three+ years of study, so I feel pretty stupid - this must look likeI have no clue at all. Still, any help or tips on what to Google would be appreciated!
I think, you could define a decision model and then try to optimize the parameters of that model. You can create multi-stage decision models also. I once did something similar for solving a dynamic dial-a-ride problem (paper here) by modeling it as a two stage linear decision problem. To give you an example, you could:
For each of your figures decide which one is to move next. Each figure is characterized by certain features derived from its position on the board, e.g. ability to make a score, danger, protecting x other figures, and so on. Each of these features can be combined (e.g. in a linear model, through a neural network, through a symbolic expression tree, a decision tree, ...) and give you a rank on which figure to act next with.
Acting with the figure you selected. Again there are a certain number of actions that can be taken, each has certain features. Again you can combine and rank them and one action will have the highest priority. This is the one you choose to perform.
The features you extract can be very simple or insanely complex, it's up to what you think will work best vs what takes how long to compute.
To evaluate and improve the quality of your decision model you can then simulate these decisions in several games against opponents and train the parameters of the model that combines these features to rank the moves (e.g. using a GA). This way you tune the model to win as many games as possible against the specified opponents. You can test the generality of that model by playing against opponents it has not seen before.
As Mathew Hall just said, you can use GP for this (if your model is a complex rule), but this is just one kind of model. In my case a linear combination of the weights did very well.
Btw, if you're interested we've also got a software on heuristic optimization which provides you with GA, GP and that stuff. It's called HeuristicLab. It's GPL and open source, but comes with a GUI (Windows). We've some Howto on how to evaluate the fitness function in an external program (data exchange using protocol buffers), so you can work on your simulation and your decision model and let the algorithms present in HeuristicLab optimize your parameters.
Vincent,
First, don't feel stupid. You've been (I infer) studying basic computer science for three years; now you're applying those basic techniques to something pretty specialized-- a particular application (Stratego) in a narrow field (artificial intelligence.)
Second, make sure your advisor fully understands the rules of Stratego. Stratego is played on a larger board, with more pieces (and more types of pieces) than chess. This gives it a vastly larger space of legal positions, and a vastly larger space of legal moves. It is also a game of hidden information, increasing the difficulty yet again. Your advisor may want to limit the scope of the project, e.g., concentrate on a variant with full observation. I don't know why you think this is simpler, except that the moves of the pieces are a little simpler.
Third, I think the right thing to do at first is to take a look at how games in general are handled in the field of AI. Russell and Norvig, chapters 3 (for general background) and 5 (for two player games) are pretty accessible and well-written. You'll see two basic ideas: One, that you're basically performing a huge search in a tree looking for a win, and two, that for any non-trivial game, the trees are too large, so you search to a certain depth and then cop out with a "board evaluation function" and look for one of those. I think your third bullet point is in this vein.
The board evaluation function is the magic, and probably a good candidate for using either a genetic algorithm, or a genetic program, either of which might be used in conjunction with a neural network. The basic idea is that you are trying to design (or evolve, actually) a function that takes as input a board position, and outputs a single number. Large numbers correspond to strong positions, and small numbers to weak positions. There is a famous paper by Chellapilla and Fogel showing how to do this for a game of Checkers:
http://library.natural-selection.com/Library/1999/Evolving_NN_Checkers.pdf
I think that's a great paper, tying three great strands of AI together: Adversarial search, genetic algorithms, and neural networks. It should give you some inspiration about how to represent your board, how to think about board evaluations, etc.
Be warned, though, that what you're trying to do is substantially more complex than Chellapilla and Fogel's work. That's okay-- it's 13 years later, after all, and you'll be at this for a while. You're still going to have a problem representing the board, because the AI player has imperfect knowledge of its opponent's state; initially, nothing is known but positions, but eventually as pieces are eliminated in conflict, one can start using First Order Logic or related techniques to start narrowing down individual pieces, and possibly even probabilistic methods to infer information about the whole set. (Some of these may be beyond the scope of an undergrad project.)
The fact you are having problems coming up with a representation for an actual strategy is not that surprising. In fact I would argue that it is the most challenging part of what you are attempting. Unfortunately, I haven't heard of Stratego so being a bit lazy I am going to assume you said chess.
The trouble is that a chess strategy is rather a complex thing. You suggest in your answer containing lots of transitions between board positions in the GA, but a chess board has more possible positions than the number of atoms in the universe this is clearly not going to work very well. What you will likely need to do is encode in the GA a series of weights/parameters that are attached to something that takes in the board position and fires out a move, I believe this is what you are hinting at in your second suggestion.
Probably the simplest suggestion would be to use some sort of generic function approximation like a neural network; Perceptrons or Radial Basis Functions are two possibilities. You can encode weights for the various nodes into the GA, although there are other fairly sound ways to train a neural network, see Backpropagation. You could perhaps encode the network structure instead/as well, this also has the advantage that I am pretty sure a fair amount of research has been done into developing neural networks with a genetic algorithm so you wouldn't be starting completely from scratch.
You still need to come up with how you are going to present the board to the neural network and interpret the result from it. Especially, with chess you would have to take note that a lot of moves will be illegal. It would be very beneficial if you could encode the board and interpret the result such that only legal moves are presented. I would suggest implementing the mechanics of the system and then playing around with different board representations to see what gives good results. A few ideas top of the head ideas to get you started could be, although I am not really convinced any of them are especially great ways to do this:
A bit string with all 64 squares one after another with a number presenting what is present in each square. Most obvious, but probably a rather bad representation as a lot of work will be required to filter out illegal moves.
A bit string with all 64 squares one after another with a number presenting what can move to each square. This has the advantage of embodying the covering concept of chess where you what to gain as much coverage of the board with your pieces as possible, but still has problems with illegal moves and dealing with friendly/enemy pieces.
A bit string with all 32 pieces one after another with a number presenting the location of that piece in each square.
In general though I would suggest that chess is rather a complex game to start with, I think it will be rather hard to get something playing to standard which is noticeably better than random. I don't know if Stratego is any simpler, but I would strongly suggest you opt for a fairly simple game. This will let you focus on getting the mechanics of the implementation correct and the representation of the game state.
Anyway hope that is of some help to you.
EDIT: As a quick addition it is worth looking into how standard chess AI's work, I believe most use some sort of Minimax system.
When you say "tactic", do you mean you want the GA to give you a general algorithm to play the game (i.e. evolve an AI) or do you want the game to use a GA to search the space of possible moves to generate a move at each turn?
If you want to do the former, then look into using Genetic programming (GP). You could try to use it to produce the best AI you can for a fixed tree size. JGAP already comes with support for GP as well. See the JGAP Robocode example for an instance of this. This approach does mean you need a domain specific language for a Stratego AI, so you'll need to think carefully how you expose the board and pieces to it.
Using GP means your fitness function can just be how well the AI does at a fixed number of pre-programmed games, but that requires a good AI player to start with (or a very patient human).
#DonAndre's answer is absolutely correct for movement. In general, problems involving state-based decisions are hard to model with GAs, requiring some form of GP (either explicit or, as #DonAndre suggested, trees that are essentially declarative programs).
A general Stratego player seems to me quite challenging, but if you have a reasonable Stratego playing program, "Setting up your Stratego board" would be an excellent GA problem. The initial positions of your pieces would be the phenotype and the outcome of the external Stratego-playing code would be the fitness. It is intuitively likely that random setups would be disadvantaged versus setups that have a few "good ideas" and that small "good ideas" could be combined into fitter-and-fitter setups.
...
On the general problem of what a decision tree, even trying to come up with a simple example, I kept finding it hard to come up with a small enough example, but maybe in the case where you are evaluation whether to attack a same-ranked piece (which, IIRC destroys both you and the other piece?):
double locationNeed = aVeryComplexDecisionTree();
if(thatRank == thisRank){
double sacrificeWillingness = SACRIFICE_GENETIC_BASE; //Assume range 0.0 - 1.0
double sacrificeNeed = anotherComplexTree(); //0.0 - 1.0
double sacrificeInContext = sacrificeNeed * SACRIFICE_NEED_GENETIC_DISCOUNT; //0.0 - 1.0
if(sacrificeInContext > sacrificeNeed){
...OK, this piece is "willing" to sacrifice itself
One way or the other, the basic idea is that you'd still have a lot of coding of Stratego-play, you'd just be seeking places where you could insert parameters that would change the outcome. Here I had the idea of a "base" disposition to sacrifice itself (presumably higher in common pieces) and a "discount" genetically-determined parameter that would weight whether the piece would "accept or reject" the need for a sacrifice.
So I was assigned the problem of writing a 5x5x5 tic-tac-toe player using a genetic algorithm. My approach was to start off with 3x3, get that working, and then extend to 5x5, and then to 5x5x5.
The way it works is this:
Simulate a whole bunch of games, and during each turn of each game, lookup in a corresponding table (X table or O table implemented as a c++ stdlib maps) for a response. If the board was not there, add the board to the table. Otherwise, make a random response.
After I have complete tables, I initialize a bunch of players (each with a copy of the board table, initialized with random responses), and let them play against each other.
Using their wins/losses to evaluate fitness, I keep a certain % of the best, and they move on. Rinse and repeat for X generations, and an optimal player should emerge.
For 3x3, discounting boards that were reflections/rotations of other boards, and boards where the move is either 'take the win' or 'block the win', the total number of boards I would encounter were either 53 or 38, depending on whether you go first or second. Fantastic! An optimal player was generated in under an hour. Very cool!
Using the same strategy for 5x5, I knew the size of the table would increase, but did not realize it would increase so drastically. Even discounting rotations/reflections and mandatory moves, my table is ~3.6 million entries, with no end in sight.
Okay, so that's clearly not going to work, I need a new plan. What if I don't enumerate all the boards, but just some boards. Well, it seems like this won't work either, because if each player has just a fraction of possible boards they might see, then they are going to be making a lot of random moves, clearly steering in the opposite direction of optimality.
What is a realistic way of going about this? Am I going to be stuck using board features? The goal is to hard-code as little game functionality as possible.
I've been doing research, but everything I read leads to min/max with A-B pruning as the only viable option. I can certainly do it that way, but the GA is really cool, my current method is just exceeding reality a bit here.
EDIT Problem has been pretty much solved:
Using a similarity function that combines hamming distance of open spaces, the possible win conditions, and a few other measures has brought the table down to a very manageable 2500 possibilities, which a std::map handles in a fraction of a second.
My knowledge of GA is pretty limited, but in modeling board configurations, aren't you asking the wrong question? Your task isn't to enumerate all the possible winning configurations -- what you're trying to do is to find a sequence of moves that leads to a winning configuration. Maybe the population you should be looking at isn't a set of boards, but a set of move sequences.
Edit: I wasn't thinking so much of starting from a particular board as starting from an empty board. It's obvious on a 3x3 board that move sequences starting with (1,1) work out best for X. The important thing isn't that the final board has an X in the middle, it's that the X was placed in the middle first. If there's one or more best first moves for X, maybe there's also a best second, third, or fourth move for X, too? After several rounds of fitness testing and recombining, will we find that X's second move is usually the same, or is one of a small set of values? And what about the third move?
This isn't minimax because you're not looking for the best moves one at a time based on the previous state of the board, you're looking for all the best moves at the same time, hoping to converge on a winning strategy.
I know this doesn't solve your problem, but if the idea is to evolve a winning strategy then it seems natural that you'd want to look at sequences of moves rather than board states.
This seems to be a very old conversation but attracted my attention. Thinking it might serve the public discussion, here is my input.
I think the aim in your assigned task needs to be defined more clearly:
Are you trying to find a set of winning boards? I don’t think so, because this is very straigtforward for a 3x3 board which can even be solved by hand, and it can be extrapolated to larger boards. GA could be utilized for larger boards, but it would only be a GA exercise.
Are you trying to utilize GA to train TicTacToe to AI players? I think this should be the case. In that case, your GA strings/chromosomes should not represent winning boards, but rather, they should represent ordered move sequences of players, for winning games. This is really a bit trickier to model though, as expected, and it would be a real AI training programming exercise.
I hope this perspective helps.
I remember when I was in college we went over some problem where there was a smart agent that was on a grid of squares and it had to clean the squares. It was awarded points for cleaning. It also was deducted points for moving. It had to refuel every now and then and at the end it got a final score based on how many squares on the grid were dirty or clean.
I'm trying to study that problem since it was very interesting when I saw it in college, however I cannot find anything on wikipedia or anywhere online. Is there a specific name for that problem that you know about? Or maybe it was just something my teacher came up with for the class.
I'm searching for AI cleaning agent and similar things, but I don't find anything. I don't know, I'm thinking maybe it has some other name.
If you know where I can find more information about this problem I would appreciate it. Thanks.
Perhaps a "stigmergy" approach is closely related to your problem. There is a starting point here, and you can find something by searching for "dead ants" and "robots" on google scholar.
Basically: instead of modelling a precise strategy you work toward a probabilistic approach. Ants (probably) collect their deads by piling up according to a simple rule such as "if there is a pile of dead ants there, I bring this corpse hither; otherwise, I'll make a new pile". You can start by simplifying your 'cleaning' situation with that, and see where you go.
Also, I think (another?) suitable approach could be modelled with a Genetic Algorithm using a carefully chosen combination of fitness functions such as:
the end number of 'clean' tiles
the number of steps made by the robot
of course if the robots 'dies' out of starvation it automatically removes itself from the gene pool, a-la darwin awards :)
You could start by modelling a very, very simple genotype that will be 'computed' into a behaviour. Consider using a simple GA such as this one by Inman Harvey, then to each gene assign either a part of the strategy, or a complete behaviour. E.g.: if gene A is turned to 1 then the robot will try to wander randomly; if gene B is also turned to 1, then it will give priority to self-charging unless there are dirty tiles at distance X. Or use floats and model probability. Your mileage may vary but I can assure it will be fun :)
The problem is reminiscent of Shakey, although there's cleaning involved (which is like the Roomba -- a device that can also be programmed to perform these very tasks).
If the "problem space" (or room) is small enough, you can solve for an optimal solution using a simple A*-based search, but likely it won't be, since that won't leave for very interesting problems.
The machine learning approach suggested here using genetic algorithms is an interesting approach. Given the problem domain you would only have one "rule" (a move-to action, since clean could be eliminated by implicitly cleaning any square you move to that is dirty) so your learner would essentially be learning how to move around an environment. The problem there would be to build a learner that would be adaptable to any given floor plan, instead of just becoming proficient at cleaning a very specific space.
Whatever approach you have, I'd also consider doing a further meta-reasoning step if the problem sets are big enough, and use a partition approach to divide the floor up into separate areas and then conquering them one at a time.
Can you use techniques to create data to use "offline"? In that case, I'd even consider creating a "database" of optimal routes to take to clean certain floor spaces (1x1 up to, say, 5x5) that include all possible start and end squares. This is similar to "endgame databases" that game AIs use to effectively "solve" games once they reach a certain depth (c.f. Chinook).
This problem reminds me of this. A similar problem is briefly mentioned in the book Complexity as an example of a genetic algorithm. These versions are simplified though, they don't take into account fuel consumption.