Better Heuristic function for a game (AI Minimax) - artificial-intelligence

There is a game that I've programmed in Java. The game is simple (refer to the figure below). There are 4 birds and 1 larva. It is a 2-player game (AI vs Human).
Larva can move diagonally forward AND diagonally backward
Birds can ONLY move diagonally forward
Larva wins if it can get to line 1 (fence)
Larva also wins if birds have no moves left
Birds CANNOT "eat" the larva.
Birds win if Larva has NO move left (cannot move at all)
When the game starts, Larva begins, then ONE bird can move (any one), then Larva, etc...
I have implemented a MiniMax (Alpha Beta Pruning) and I'm using the following evaluate() function (heuristic function).
Let us give the following numbers to each square on the board.
Therefore, our evaluation function will be
h(n) = value of position of larva - value of position of bird 1 - value of position of bird 2 - value of position of bird 3 - value of position of bird 4
The Larva will try to MAXIMIZE the heuristic value, whereas the Birds will try to MINIMIZE it.
Example:
However, this is a simple and naive heuristic. It does not act in a smart manner. I am a beginner in AI and I would like to know what I can do to IMPROVE this heuristic function.
What would be a good/informed heuristic?

How about this:
Maximum: larva
Minimum: birds
H(t)=max_distance(larva,line_8)+Σmin_distance(bird_n,larva)
or
H(t)=Σmin_distance(bird_n,larva) - min_distance(larva,line_1)
max_distance(larva,line_8): rewards the larva for being far from line 8, i.e. close to line 1.
Σmin_distance(bird_n,larva): reflects how close the birds are to the larva (a small sum means the birds are in position to block it).
There are still many things that could be considered. For example, the bird closest to the larva should have high priority when choosing which bird to move. But the general direction of the functions above makes sense, and the details can be refined from there.
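A minimal Java sketch of the second formula, under several assumptions that are not in the answer: positions are {row, col} pairs with rows numbered 1 to 8, the distance between two pieces is taken as the Chebyshev distance (the number of diagonal steps needed on an empty board), and the distance to line 1 is simply row - 1. The class and method names, and the positions in main, are made up for illustration.

```java
class DistanceHeuristic {
    // Assumed distance measure: Chebyshev distance between two squares.
    static int distance(int[] a, int[] b) {
        return Math.max(Math.abs(a[0] - b[0]), Math.abs(a[1] - b[1]));
    }

    // H(t) = sum of bird-to-larva distances - larva's distance to line 1.
    // Larger values are better for the larva (MAX).
    static int evaluate(int[] larva, int[][] birds) {
        int sum = 0;
        for (int[] bird : birds) {
            sum += distance(bird, larva);
        }
        int toFence = larva[0] - 1; // rows left until line 1
        return sum - toFence;
    }

    public static void main(String[] args) {
        // Illustrative positions only, not the game's real starting layout.
        int[] larva = {2, 4};
        int[][] birds = {{1, 1}, {1, 3}, {1, 5}, {1, 7}};
        System.out.println(evaluate(larva, birds));
    }
}
```

Whether the bird term should use distance, blocked escape squares, or something else is a design choice worth testing against the naive heuristic.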

There is one simple way to improve your heuristic considerably. In your current heuristic, the value of square A1 is 8 less than the value of square A8. This makes the Birds inclined to move towards the left side of the game board, as a move to the left will always be valued higher than a move to the right. This is not accurate. All squares on row 1 should have the same value. Thus assign all squares in row 1 a 1, in row 2 a 2, etc. This way the birds and larva won't be inclined to move to the left, and instead can focus on making a good move.
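As a rough sketch of this suggestion in Java, the evaluation below keeps the question's formula h(n) = larva value - sum of bird values but makes a square's value depend only on its row. The {row, col} representation and all names are assumptions for illustration; which row direction counts as good for the larva depends on the original numbering, which is not shown here.

```java
class RowHeuristic {
    // Every square in a row gets the same value: row 1 -> 1, row 2 -> 2, ...
    static int squareValue(int row, int col) {
        return row; // the column is deliberately ignored
    }

    // h(n) = value(larva) - sum of value(bird_i), as in the question.
    static int evaluate(int[] larva, int[][] birds) {
        int h = squareValue(larva[0], larva[1]);
        for (int[] bird : birds) {
            h -= squareValue(bird[0], bird[1]);
        }
        return h;
    }
}
```

With this table, moving a piece sideways within a row no longer changes the score, so the search stops preferring one side of the board.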

You could take into account the fact that birds will have a positional advantage over the larva when the larva is on the sides of the board, so if Larva is MAX then change the side tile values of the board to be smaller.

Related

Heuristic function for Pylos game

Pylos is a game constituted of a 4x4 pyramid board (4x4 below a 3x3 below a 2x2 below a 1). There are two players, one with White marbles and the other with Black marbles.
Each player has 15 marbles initially and takes turns placing a marble on the board on a free square (or if it is on a higher level, the 4 'support' squares of the lower level must be occupied).
The goal is that the opponent has no more marbles in his stock.
If you complete a square of marbles of the same color, you can remove two of your marbles from the board.
If you can move a marble to a higher level, you can do so (you save putting down a marble).
In short, my goal is to implement the best possible strategy for this game. For that, I have implemented a MinMax and I need a heuristic evaluation function. I can't go deeper than depth 4 in MinMax.
The naive heuristic returns the difference between my number of marbles in stock and that of my opponent.
I have tried to improve the heuristic by implementing same-color square detection, and upward movement detection, and also by giving importance to constraining an opponent's move, but the naive strategy sometimes still wins if it plays 2nd.
If you have any ideas for improvement, I would be grateful.

Proper Heuristic Mechanism For Hill Climbing

The following problem is an exam exercise I found from an Artificial Intelligence course.
"Suggest a heuristic mechanism that allows this problem to be solved, using the Hill-Climbing algorithm. (S=Start point, F=Final point/goal). No diagonal movement is allowed."
Since it's obvious that Manhattan Distance or Euclidean Distance will send the robot at (3,4) and no backtracking is allowed, what is a possible solution (heuristic mechanism) to this problem?
EDIT: To make the problem clearer, I've marked some of the Manhattan distances on the board:
It is obvious that, using Manhattan distance, the robot's next move would be to (3,4), since it has a heuristic value of 2. HC will choose that and get stuck forever. The aim is to never take that path, by finding a proper heuristic.
I thought of the obstructions as being hot, and that heat rises. I make the net cost of a cell the sum of the Manhattan metric distance to F plus a heat-penalty. Thus there is an attractive force drawing the robot towards F as well as a repelling force which forces it away from the obstructions.
There are two types of heat penalties:
1) It is very bad to touch an obstruction. Look at the 2 or 3 neighboring cells in the row immediately below a given cell. Add 15 for every obstruction cell which is directly below the given cell and 10 for every diagonal neighbor which is directly below.
2) For cells not in direct contact with the obstructions, the heat is more diffuse. I calculate it as 6 times the average number of obstruction blocks below the cell, both in its column and in its neighboring columns.
The following shows the result of combining this all, as well as the path taken from S to F:
A crucial point is the way that the averaging causes the robot to turn left rather than right when it hits the top row. The unheated columns towards the left make that the cooler direction. It is interesting to note how all cells (with the possible exception of the two at the upper-right corner) are drawn to F by this heuristic.

Does the min player in the minimax algorithm play optimally?

In the minimax algorithm, the first player plays optimally, which means it wants to maximise its score, and the second player tries to minimise the first player's chances of winning. Does this mean that the second player also plays optimally to win the game? Trying to choose some path in order to minimise the first player's chances of winning also means trying to win?
I am actually trying to solve this task from TopCoder: EllysCandyGame. I wonder whether we can apply the minimax algorithm here. The statement that "both play optimally" really confuses me, and I would like some advice on how to deal with this type of problem, if there is a general approach.
Yes, you can use the minimax algorithm here.
The problem statement says that the winner of the game is "the girl who has more candies at the end of the game." So one reasonable scoring function you could use is the difference in the number of candies held by the first and second player.
Does this mean that the second player also plays optimally to win the game?
Yes. When you are evaluating a MIN level, the MIN player will always choose the path with the lowest score for the MAX player.
Note: both the MIN and MAX levels can be implemented with the same code, if you evaluate every node from the perspective of the player making the move in that round, and convert scores between levels. If the score is a difference in number of candies, you could simply negate it between levels.
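The "same code for both levels" idea is usually called negamax. Below is a minimal Java sketch on a deliberately simplified stand-in game, not the actual TopCoder rules: players alternately pick any untaken box and keep its candies, and the score is the candy difference from the point of view of the player to move. The class name, the bitmask representation, and the game itself are assumptions for illustration.

```java
class NegamaxSketch {
    // Returns (candies of player to move) - (candies of opponent),
    // counting only candies taken from this point on, with both playing optimally.
    static int negamax(int[] boxes, int takenMask) {
        int best = Integer.MIN_VALUE;
        for (int i = 0; i < boxes.length; i++) {
            if ((takenMask & (1 << i)) != 0) {
                continue; // box already taken
            }
            // My gain now, minus the best the opponent can achieve afterwards.
            int score = boxes[i] - negamax(boxes, takenMask | (1 << i));
            best = Math.max(best, score);
        }
        // No boxes left: terminal position, nothing more to gain.
        return best == Integer.MIN_VALUE ? 0 : best;
    }

    public static void main(String[] args) {
        System.out.println(negamax(new int[]{3, 1, 2}, 0));
    }
}
```

The single Math.max covers both players because the recursive result is negated between levels, which is exactly the conversion described in the note above.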
Trying to choose some path in order to minimize the first player's chances of winning also means trying to win?
Yes. The second player is trying to minimize the first player's score. A reasonable scoring function will give the first player a lower score for a loss than a tie.
I wonder whether we can apply the minimax algorithm here.
Yes. If I've read the problem correctly, the number of levels will be equal to the number of boxes. If there's no limit on the number of boxes, you'll need to use an n-move lookahead, evaluating nodes in the minimax tree to a maximum depth.
Properties of the game:
At each point, there are a limited, well defined number of moves (picking one of the non-empty boxes)
The game ends after a finite number of moves (when all boxes are empty)
As a result, the search tree has a finite number of leaves. You are right that by applying Minimax, you can find the best move.
Note that you only have to evaluate the game at the final positions (when there are no more moves left). At that point, there are only three results: The first player won, the second player won, or it is a draw.
Note that the standard Minimax algorithm has nothing to do with probabilities. The result of the Minimax algorithm determines the perfect play for both sides (assuming that neither side makes a mistake).
By the way, if you need to improve the search algorithm, a safe and simple optimization is to apply Alpha Beta pruning.

A* search algorithm heuristic function

I am trying to find the optimal solution to a Sliding Block Puzzle of any length using the A* algorithm.
The Sliding Block Puzzle is a game with white (W) and black tiles (B) arranged on a linear game board with a single empty space(-). Given the initial state of the board, the aim of the game is to arrange the tiles into a target pattern.
For example my current state on the board is BBW-WWB and I have to achieve BBB-WWW state.
Tiles can move in these ways :
1. slide into an adjacent empty space with a cost of 1.
2. hop over another tile into the empty space with a cost of 1.
3. hop over 2 tiles into the empty space with a cost of 2.
I have everything implemented, but I am not sure about the heuristic function. It computes the shortest distance (minimal cost) possible for a misplaced tile in current state to a closest placed same color tile in goal state.
Considering the given problem for the current state BWB-W and goal state BB-WW the heuristic function gives me a result of 3. (according to minimal distance: B=0 + W=2 + B=1 + W=0). But the actual cost of reaching the goal is not 3 (moving the misplaced W => cost 1 then the misplaced B => cost 1) but 2.
My question is: should I compute the minimal distance this way and not care about the overestimation, or should I divide it by 2? Given the ways tiles can move, a tile can cover twice the distance for the same cost (see moves 1 and 2).
I tried both versions. While the divided distance gives better final path cost to the achieved goal, it visits more nodes => takes more time than the not divided one. What is the proper way to compute it? Which one should I use?
It is not obvious to me what an admissible heuristic function for this problem looks like, so I won't commit to saying, "Use the divided by two function." But I will tell you that the naive function you came up with is not admissible, and therefore will not give you good performance. In order for A* to work properly, the heuristic used must be admissible; in order to be admissible, the heuristic must absolutely always give an optimistic estimate. This one doesn't, for exactly the reason you highlight in your example.
(Although now that I think about it, dividing by two does seem like a reasonable way to force admissibility. I'm just not going to commit to it.)
Your heuristic is not admissible, so your A* is not guaranteed to find the optimal answer every time. An admissible heuristic must never overestimate the cost.
A better heuristic than dividing your heuristic cost by 3 would be: instead of adding the distance D of each letter to its final position, add ceil(D/2). This way, a letter 1 or 2 away gets a value of 1, a letter 3 or 4 away gets a value of 2, and so on.
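A small Java sketch of the ceil(D/2) idea, assuming boards are strings such as "BWB-W" and that D is measured, as in the question, to the nearest goal cell holding a tile of the same color. The class and method names are made up for illustration.

```java
class SlidingTileHeuristic {
    // Sum over all tiles of ceil(D / 2), where D is the distance from the tile
    // to the nearest goal cell of the same color. '-' marks the empty space.
    static int heuristic(String current, String goal) {
        int total = 0;
        for (int i = 0; i < current.length(); i++) {
            char tile = current.charAt(i);
            if (tile == '-') {
                continue; // the blank is not a tile
            }
            int nearest = -1;
            for (int j = 0; j < goal.length(); j++) {
                if (goal.charAt(j) == tile) {
                    int d = Math.abs(i - j);
                    if (nearest < 0 || d < nearest) {
                        nearest = d;
                    }
                }
            }
            if (nearest > 0) {
                total += (nearest + 1) / 2; // integer form of ceil(nearest / 2)
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Example from the question: the naive sum is 3, the true cost is 2.
        System.out.println(heuristic("BWB-W", "BB-WW"));
    }
}
```

On the question's example the per-tile values become 0 + 1 + 1 + 0 = 2, which no longer exceeds the actual cost of 2.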

utility functions minimax search

Hi
I'm confused about how you determine the utility function for a minimax search.
Please explain it using any game that minimax search can be applied to.
Basically, I am asking how you determine the utility function.
Cheers
The utility value is just some arbitrary value that the player receives when arriving at a certain state in the game. For instance, in Tic-tac-toe, your utility function could simply be 1 for a win, 0 for a tie, or -1 for a loss.
Running minmax on this would at best find a set of actions that result in 1 (a win).
Another example would be chess (not that you can feasibly run full minimax on a game of chess). Say your utility function comes from a number that is based on the value of the pieces you captured or lost.
Determining the utility value of a move at a certain state has to do with the experience of the programmer and his/her knowledge of the game.
Utility values on a terminal state are kind of easy to determine. In Tic-tac-toe, for instance, a terminal state for player X is when the Xs are aligned in diagonal, vertically, or horizontally. Any move that creates such a state is a terminal state and you can create a function that checks that. If it is a terminal state, the function returns a 1 or -1.
If your player agent is player X and after player X's move it determines that player O will win, then the function returns a -1. The function returns a 1 if it determines that it is its own winning move.
If all cells are occupied with the last possible move and nobody has won, then the function returns a zero.
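The terminal-state rules above can be sketched in Java as follows. The board representation (a 9-character array in row-major order, with ' ' for an empty cell) and all names are assumptions for illustration.

```java
class TicTacToeUtility {
    // The 8 winning lines: 3 rows, 3 columns, 2 diagonals.
    static final int[][] LINES = {
        {0, 1, 2}, {3, 4, 5}, {6, 7, 8},
        {0, 3, 6}, {1, 4, 7}, {2, 5, 8},
        {0, 4, 8}, {2, 4, 6}
    };

    // Returns 'X' or 'O' if that player has three in a line, otherwise ' '.
    static char winner(char[] board) {
        for (int[] line : LINES) {
            char c = board[line[0]];
            if (c != ' ' && c == board[line[1]] && c == board[line[2]]) {
                return c;
            }
        }
        return ' ';
    }

    // Utility from the agent's point of view: 1 win, -1 loss, 0 otherwise.
    // A 0 only means "draw" when the board is full; callers should check that.
    static int utility(char[] board, char agent) {
        char w = winner(board);
        if (w == ' ') {
            return 0;
        }
        return w == agent ? 1 : -1;
    }
}
```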
This is at terminal states only. It is critical to evaluate intermediate states because, even in a 3x3 game, there are lots of combinations to consider. If you count symmetrical positions separately, there are up to 9! = 362,880 move sequences in Tic-tac-toe. For those intermediate cases, you need to come up with an evaluation function that returns a score for each state relative to other states.
Suppose that I assign the terminal state values of 810, 0, and -810. For each move, the score would be 810 / (# of moves). So if I reach a terminal state in 6 moves, the score would be 810/6 = 135. In 9 moves, the score would be 90. An evaluation function fashioned this way would favor moves that reach a terminal state faster. However, it still evaluates to a leaf node. We need to evaluate before reaching a leaf node, though, but this could also be part of an evaluation function.
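The move-count scaling can be written as a one-line Java function. This is only a sketch of the arithmetic in the paragraph above; the names and the encoding of the outcome as 1, 0 or -1 are assumptions.

```java
class MoveCountScore {
    static final int TERMINAL = 810;

    // outcome: 1 for a win, 0 for a draw, -1 for a loss.
    // moves: number of moves played to reach the terminal state (1..9).
    static int score(int outcome, int moves) {
        return outcome * TERMINAL / moves;
    }

    public static void main(String[] args) {
        System.out.println(score(1, 6)); // win in 6 moves
        System.out.println(score(1, 9)); // win in 9 moves
    }
}
```

Because the divisor grows with the number of moves, a quicker win scores higher and a quicker loss scores lower, so the search prefers fast wins and slow losses.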
Suppose that, in the game below, player 1 is X, and X moves next. The following are the legal moves (row, column) for X:
(1) 0,0
(2) 0,2
(3) 2,0
(4) 2,1
(5) 2,2
| |O| |
|O|X|X|
| | | |
The utility value for each move should favor the best moves.
The best moves, in this case, are either (2) or (5). So an evaluation function will assign a utility value of, for instance, 81 to each of those. Move (4) is the worst possible move for the X player (it guarantees that you lose against an intelligent player), so the function would assign a value of -9 to that move. Moves (1) and (3), while not ideal, will not make you lose, so we might assign them a 1.
So when minimax evaluates those 5 moves, because your player, X, is MAX, the choice would be either (2) or (5).
If we focus on options (2) or (5), the game will be on a terminal state two moves after these. So, in reality, the evaluation function should look 2 moves ahead of the current legal moves to return the utility values. (This strategy follows the lines of depth limited search, where your function evaluates at a certain depth and produces a utility value without reaching a leaf node - or terminal state)
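For a board this small, the ranking of the five moves can be checked by searching all the way to the terminal states. The Java sketch below runs plain minimax on the example position and scores each move as 1, 0 or -1 from X's point of view, instead of the illustrative 81, 1 and -9 used above. The row-major char array and all names are assumptions for illustration.

```java
class ExampleMinimax {
    static final int[][] LINES = {
        {0, 1, 2}, {3, 4, 5}, {6, 7, 8},
        {0, 3, 6}, {1, 4, 7}, {2, 5, 8},
        {0, 4, 8}, {2, 4, 6}
    };

    static char winner(char[] b) {
        for (int[] l : LINES) {
            if (b[l[0]] != ' ' && b[l[0]] == b[l[1]] && b[l[1]] == b[l[2]]) {
                return b[l[0]];
            }
        }
        return ' ';
    }

    static boolean full(char[] b) {
        for (char c : b) {
            if (c == ' ') return false;
        }
        return true;
    }

    // Value of the position for X (MAX): 1 win, 0 draw, -1 loss.
    static int minimax(char[] b, char toMove) {
        char w = winner(b);
        if (w == 'X') return 1;
        if (w == 'O') return -1;
        if (full(b)) return 0;
        boolean max = toMove == 'X';
        int best = max ? -2 : 2;
        for (int i = 0; i < 9; i++) {
            if (b[i] != ' ') continue;
            b[i] = toMove;                       // try the move
            int v = minimax(b, max ? 'O' : 'X');
            b[i] = ' ';                          // undo it
            best = max ? Math.max(best, v) : Math.min(best, v);
        }
        return best;
    }

    // Value for X of playing at (row, col) in the given position.
    static int valueOfMove(char[] board, int row, int col) {
        char[] copy = board.clone();
        copy[row * 3 + col] = 'X';
        return minimax(copy, 'O');
    }

    public static void main(String[] args) {
        char[] board = " O OXX   ".toCharArray(); // the position shown above
        int[][] moves = {{0, 0}, {0, 2}, {2, 0}, {2, 1}, {2, 2}};
        for (int[] m : moves) {
            System.out.println(m[0] + "," + m[1] + " -> " + valueOfMove(board, m[0], m[1]));
        }
    }
}
```

A depth-limited version would stop the recursion at a chosen depth and call a heuristic evaluation there instead of continuing to a leaf.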
Now I'll circle back to my first statement. The utility value will be determined by an evaluation function coded per the programmer's knowledge of the game.
Hopefully, I'm not confusing you...

Resources