I have to create an AI which has to compete against other AIs.
Both AIs will run on the same hardware, have the same amount of processing time and memory. I know the opponent AI will be using the minimax algorithm with alpha beta pruning.
Now my question is - what are some approaches for beating such an opponent? If I use minimax myself - then both AI's perfectly predict each other's moves and the game resolves based on an inherent property of the game (first move wins etc).
The obvious solution is to somehow see further ahead into the possible moves which would allow for a better evaluation - since the processor time is the same I couldn't evaluate to a greater depth (assuming the opposing AI code is equally optimized). I could use a precomputed tree for an extra advantage but without a super computer I certainly couldn't "solve" any nontrivial game.
Is there some value in intentionally picking a non optimal node such as one that alpha beta would have pruned? This could potentially incur a CPU time penalty on the opponent as they'd have to go back and re-evaluate the tree. It would incur a penalty on me as well as I'd have to evaluate the minimax tree + alpha beta to see which nodes alpha beta would prune without reaping any direct benefits.
What are some other strategies for optimizing against such an opponent?
First, there isn't any value in choosing a non-optimal line of play. Assuming your opponent will play optimally (and that's a fundamental assumption of minimax search), your opponent will make a move that capitalizes on the mistake. A good game engine will have a hashed refutation table entry containing the countermove for your blunder, so you'll gain no time by making a wild move. Making bad moves allows a computer opponent to find good moves faster.
The key thing to realize with a game like Othello is that you can't be sure what the optimal move is until late in the game. That's because the search tree is almost always too large to be exhaustively searched for all won or lost positions, and so minimax can't tell you with certainty which moves will lead to victory or defeat. You can only heuristically decide where to stop searching, arbitrarily call those nodes "terminal", and then run an evaluation function that guesses the win/loss potential of a position.
The evaluation function's job is to assess the value of a position, typically using static metrics that can be computed without searching the game tree further. Piece counts, positional features, endgame tablebases, and even opponent psychology can play a role here. The more intelligence you put into your evaluation function, generally the better your engine will play. But the point of static evaluation is replace searches that would be too expensive. If your evaluation function does too much or does what it does too inefficiently, it can become slower than the game tree search needed to obtain the same information. Knowing what to put in an evaluation function and when to use static evaluation instead of search is a large part of the art of writing a good game engine.
There are a lot of ways to improve standard minimax with AB pruning. For example, the killer heuristic attempts to improve the order moves are looked at, since AB's efficiency is better with well-ordered moves.
A lot of information on different search enhancements and variations on AB can be found at chessprogramming.wikispaces.com.
Related
I have read in some articles on evolutionary computing that the algorithms generally converge to a single solution due to the phenomenon of genetic drift. There is a lot of content on the Internet, but I can't get a deep understanding of this concept. I need to know, simply and precisely:
What is genetic drift in the context of evolutionary computing?
How does it affect the convergence of an evolutionary algorithm?
To better understand the original concept of genetic drift (Biology), I suggest you read this Khan Academy's article. Simply put, you can think of it as an evolutionary phenomenon in which the frequency of one or more alleles (versions of a gene) in a population changes due to random factors (unrelated to the fitness of each individual). If the fittest individual of a population is struck, out of bad luck, by a lightning and dies before reproducing, he won't leave offspring (although he has the highest fitness!). This is an example (somewhat absurd, I know) of genetic drift.
Now, in the specific context of evolutionary algorithms, this paper provides a good summary on the subject:
EAs genetic drift can be as a result of a combination of factors,
primarily related to selection, fitness function and representation.
It happens by unintentional loss of genotypes. For example, random
chance that a good genotype solution never gets selected for
reproduction. Or, if there is a ‘lifespan’ to a solution and it dies
before it can reproduce. Normally such a genotype only resides in the
population for a limited number of generations.
(Sloss & Gustafson, 2019)
Finally, I will give you a real example of genetic drift acting on a genetic algorithm. Recently, I've used a simple neuroevolution algorithm to create an agent capable of playing the Snake game (GitHub repo). In my implementation of the game, the apples appear in random positions of the screen. When executing the evolutionary process for the first time, I noticed a big fluctuation in the population's best fitness between consecutive generations - overall, it wasn't improving much. Because of this, my algorithm was unable to converge to a good solution.
After some debugging, I found out that this was being caused by genetic drift. Because the apples spawned in random positions, some individuals, not necessarily the fittest, were lucky and got "easy apples", thus achieving a high fitness and leaving more offspring. Do you see the problem here?
Suppose that snake A is better at the game than snake B, because it can move towards the food, while B only moves randomly. Now, suppose that the first food that appeared for snake A was in a corner of the screen (a difficult position) and A died shortly after eating the apple. Now, suppose that snake B was lucky enough to have 3 apples spawn in a row, one after the other. Although B is "dumber" than A, it will leave more offspring, because it achieved a greater fitness. B's offspring will "pollute" the next generation, because they'll probably be "dumb" like B.
I solved the problem using a better apple positioning algorithm (I defined a minimum distance between the spawning position of two consecutive apples) and by calculating each individual's final fitness as the average of its fitness in several playing sessions. This greatly reduced (although it did not eliminate) the interference of genetic drift in my algorithm.
I hope this helps. You can also take a look at this video (it's in Portuguese, but English subtitles are available), where I explain some of the strategies I used to make the Snake AI.
I am building a 2048 AI, and it is leading to a rather peculiar observation (peculiar enough to me).
The optimizations are not up to the mark right now (coupled with the fact that the code is written in python), which is letting me reach till only a depth of 3 moves (plies).
As evident from the results, Expectimax is quite dominant over minimax (similar results can be seen without alpha-beta pruning in minimax) in terms of results produced. Both use the same evaluation function and do not proceed any further than 3 moves.
AFAIK, minimax should work optimally in such games, but that doesn't seem to be the case here. My question is, this observation is due to the fact that:
I am not going deep enough into the search tree?
2048 is a stochastic game, and that is hampering the performance of minimax (or boosting the performance of expectimax)?
The opponent (the 2048 game logic) is not playing optimally (90-10 % chance of putting a 2-4 tile, random adversary) (if yes, then why should this affect the performance of minimax)?
Anything else that is not apparent to me?
I am programming a Chess AI using an alpha-beta pruning algorithm that works at fixed depth. I was quite surprised to see that by setting the AI to a higher depth, it played even worse. But I think I figured it why so.
It currently works that way : All positions are listed, and for each of them, every other positions from that move is listed and so on... Until the fixed depth is reached : the board is evaluated by checking what pieces are present and by setting a value for every piece types. Then, the value bubbles up to the root using the minimax algorithm with alpha-beta.
But I need to account for the move order. For instance, there is two options, a checkmate in 2 moves, and another in 7 moves, then the first one has to be chosen. The same thing goes to taking a queen in whether 3 or 6 moves.
But since I only evaluate the board at the deepest nodes and that I only check the board as the evaluation result, it doesn't know what was the previous moves were.
I'm sure there is a better way to evaluate the game that can account for the way the pieces moved through the search.
EDIT: I figured out why it was playing weird. When I searched for moves (depth 5), it ended with a AI move (a MAX node level). By doing so, it counted moves such as taking a knight with a rook, even if it made the latter vulnerable (the algorithm cannot see it because it doesn't search deeper than that).
So I changed that and I set depth to 6, so it ends with a MIN node level.
Its moves now make more sense as it actually takes revenge when attacked (what it sometimes didn't do and instead played a dumb move).
However, it is now more defensive than ever and does not play : it moves its knight, then moves it back to the place it was before, and therefore, it ends up losing.
My evaluation is very standard, only the presence of pieces matters to the node value so it is free to pick the strategy it wants without forcing it to do stuff it doesn't need to.
Consedering that, is that a normal behaviour for my algorithm ? Is it a sign that my alpha-beta algorithm is badly implemented or is it perfectly normal with such an evaluation function ?
If you want to select the shortest path to a win, you probably also want to select the longest path to a loss. If you were to try to account for this in the evaluation function, you would have to the path length along with the score and have separate evaluation functions for min and max. It's a lot of complex and confusing overhead.
The standard way to solve this problem is with an iterative deepening approach to the evaluation. First you search deep enough for 1 move for all players, then you run the entire search again searching 2 moves for each player, etc until you run out of time. If you find a win in 2 moves, you stop searching and you'll never run into the 7 moves situation. This also solves your problem of searching odd depths and getting strange evaluations. It has many other benefits, like always having a move ready to go when you run out of time, and some significant algorithmic improvements because you won't need the overhead of tracking visited states.
As for the defensive play, that is a little bit of the horizon effect and a little bit of the evaluation function. If you have a perfect evaluation function, the algorithm only needs to see one move deep. If it's not perfect (and it's not), then you'll need to get much deeper into search. Last I checked, algorithms that can run on your laptop and see about 8 plys deep (a ply is 1 move for each player) can compete with strong humans.
In order to let the program choose the shortest checkmate, the standard approach is to give a higher value to mates that occur closer to the root. Of course, you must detect checkmates, and give them some score.
Also, from what you describe, you need a quiescence search.
All of this (and much more) is explained in the chess programming wiki. You should check it out:
https://chessprogramming.wikispaces.com/Checkmate#MateScore
https://chessprogramming.wikispaces.com/Quiescence+Search
I am making an expectimax AI, and the branching factor of this game is unpredictable, ranging from 6 - 20. I'm currently exploring the game tree for 1 second every turn, then making sure the whole game tree is explored to the same depth, but occasionally this results in a very large slowdown, if branching factor for a particular turn jumps up radically. Is if OK if I cut off exploration when parts of the game tree are not explored as deeply? Will this affect the mathematical properties of expectimax at all?
Short answer: I'm pretty sure you lose the mathematical guarantees, but the extent to which this affects your program's performance will probably depend on the game and your board evaluation function.
Here's an abstract scenario to give you some intuition for where having different branch lengths might create the most problems: say that, for player one, the best move is something that takes a few turns to set up. Let's say this set up is not something that your board evaluation function can pick up on. In this case, regardless of what player 2 does in the mean time, there will be a point a few moves in the future where the score of the board will swing in the direction that favors player 1. If one branch gets far enough to see that move and another doesn't, it will look like the first is a worse option for player 2, despite the fact that the same thing will happen on the other branch. If the move that player 2 made in the first branch was actually better than the move it made in the second branch, this would lead to a suboptimal choice.
On the other hand, a perfect board evaluator would make this impossible (it would recognize player 1 setting up their move). There are also games in which setting up moves in advance like this is not possible. But the existence of this case is a red flag.
Fundamentally, branches that didn't get evaluated as far have greater uncertainty in their estimations of how good a move is. This will sometimes result in them being chosen when they shouldn't be and other times it will result in them not being chosen when they should be. As a result, I would strongly suspect that you lose mathematical guarantees by doing this. That said, the practical impact of this problem on performance may or may not be substantial.
There might be some way around this if you incorporate the current turn number into the board evaluation function and adjust for it accordingly. Minimally, that would allow you to explicitly account for the added uncertainty in shorter branches.
There are plenty of Chess AI's around, and evidently some are good enough to beat some of the world's greatest players.
I've heard that many attempts have been made to write successful AI's for the board game Go, but so far nothing has been conceived beyond average amateur level.
Could it be that the task of mathematically calculating the optimal move at any given time in Go is an NP-complete problem?
Chess and Go are both EXPTIME complete. IIRC, Go has more possible moves, so I think it a higher multiple of that complexity class than chess. Wikipedia has a good article on the complexity of Go.
Even if Go is merely in P it could still be something horrendous like O(n^m) where n is the number of spaces and m is some (large) fixed number. Even being in P doesn't make something reasonable to compute.
Neither Chess or Go AIs completely evaluate all possibilities before deciding on a move.
Chess AIs use various heuristics to narrow down the search space, and to quantify how 'good' a given position on the board happens to be. This can be done recursively by evaluating possible board positions 14-15 moves ahead and choosing a path that leads to a good position.
There's a bit of 'magic' in how a board position is quantified so the that at the top level, the AI can simply go Move A > Move B therefore lets do Move A. But since there's a limited number of pieces and they all have quantifiable value a 'good enough' algorithm can be implemented.
But it turns out to be a lot harder for a program to evaluate two possible board positions in Go and make that A > B calculation. Without that critical piece its a little hard to make the rest of the AI work.