Optimizing difficulty with a level generator using Clingo

I’m making a puzzle level generator (sokoban) using Clingo and got stuck trying to create levels with a specific move limit. The idea is to use the move limit to control difficulty of a level. This is how I’ve created the generator:
Specify rules for a valid-looking level (all walkable tiles are connected, correct amount of pushable barrels and target platforms, etc.)
Specify rules for moving around and pushing barrels.
Specify winning conditions.
My first iteration was to simply require that a level is solvable within N moves, i.e. that the winning conditions are met somewhere in 0..N moves. This resulted in valid levels, but there was no control over the difficulty.
My second iteration was to minimize the moves required for a winning condition to be met. This resulted in levels that are completed in a single move.
My third iteration was to state that a level could not be completed in fewer than M moves. This resulted in the logic doing unnecessary moves just to reach the minimum; there was still no meaningful control over difficulty.
My fourth iteration was to prevent the same game state from recurring to avoid unnecessary "filler" moves, but this still resulted in the logic solving the levels very suboptimally just to reach the minimum requirement.
Now I don't really know what to do. I think what I want to do is maximize the minimum required amount of moves (up to a limit), but I'm not sure if that makes any sense. Preferably I'd like to be able to state that the minimum amount of moves for completing a level should be M.
How should this problem be approached?

So to put it in other terms: you are searching for a puzzle that can be solved in N moves, but solving it in N-1 moves would be unsatisfiable. Those two parts state complementary problems ("find at least one" vs. "find none/all"), are difficult to put in one program, and the combination raises the problem complexity (see the polynomial hierarchy). Here are three suggestions to solve it via enumeration:
Enumeration and difference: For N and N-1, enumerate all answers. A puzzle is solvable in exactly N steps when it appears for N but not for N-1. This requires some sort of post-processing, like a script or neat text editing.
Enumeration and unsatisfiable: In a script, enumerate all answer sets one by one for the current N. For the current puzzle, try to solve it for N-1: if it returns 0 answers, you have your puzzle. This requires data handling of some form; I would suggest using Python (a sketch follows at the end of this answer).
Enumeration avoiding doubles: In a script, generate all answer sets for N-1. Transform each answer set into a constraint and add it to the program. Example: the atoms m(1..4,l) and m(1..4,r) define your puzzle in general, and your current answer set is m(1,l). m(2,l). m(3,r). Add the following constraint to avoid generating this puzzle again, then solve the updated program for N:
:- m(1,l), m(2,l), m(3,r), not m(1,r), not m(2,r), not m(3,l), not m(4,l), not m(4,r).
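As an illustration of that transformation (a hypothetical helper, not from the post), assuming the candidate atoms are available as strings:

```python
def blocking_constraint(model_atoms, candidate_atoms):
    """Turn one answer set into a constraint that forbids exactly that set.

    model_atoms:     atoms true in the model, e.g. {"m(1,l)", "m(2,l)", "m(3,r)"}
    candidate_atoms: every atom that could define a puzzle, e.g. m(1..4,l) and m(1..4,r)
    """
    body = sorted(model_atoms)
    body += ["not " + a for a in sorted(candidate_atoms - model_atoms)]
    return ":- " + ", ".join(body) + "."

# Reproduces the constraint shown above:
print(blocking_constraint(
    {"m(1,l)", "m(2,l)", "m(3,r)"},
    {f"m({i},{d})" for i in range(1, 5) for d in ("l", "r")}))
```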
As you might see, all three methods would solve your problem, but they require additional effort outside of clingo.
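Here is a minimal sketch of the second suggestion using clingo's Python API. The file name puzzle.lp, the horizon constant n, and the m/2 atoms defining a level are all assumptions about how your encoding is parameterized:

```python
import clingo

def level_candidates(program, n):
    """Enumerate answer sets (candidate levels plus their solutions) for horizon n."""
    ctl = clingo.Control(["0", "-c", f"n={n}"])   # "0" = all models, -c sets the constant n
    ctl.add("base", [], program)
    ctl.ground([("base", [])])
    with ctl.solve(yield_=True) as handle:
        for model in handle:
            yield {str(s) for s in model.symbols(shown=True)}

def unsolvable_within(program, level_atoms, n):
    """True if the fixed level has no solution within n moves."""
    ctl = clingo.Control(["-c", f"n={n}"])
    ctl.add("base", [], program)
    # Pin the level as facts; depending on your choice rules you may also
    # need constraints forbidding the remaining candidate atoms.
    ctl.add("base", [], " ".join(a + "." for a in level_atoms))
    ctl.ground([("base", [])])
    return not ctl.solve().satisfiable

program = open("puzzle.lp").read()                # assumed encoding file
N = 10
for answer in level_candidates(program, N):
    level = {a for a in answer if a.startswith("m(")}   # assumed level-defining atoms
    if unsolvable_within(program, level, N - 1):
        print(f"Level needing exactly {N} moves:", sorted(level))
        break
```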

Related

What is the fastest way to check for threefold repetition of a chess position?

I am writing a chess AI as a project. If a position repeats 3 times, it is a draw. I can create an array with all previous positions, then after every move iterate over all previous positions and check whether the current one already occurs twice in the array, but this seems like a lot of work for the computer and it will make calculating moves for the AI hard. Is there a better way to do this?
I would suggest using Zobrist Hashing, which is designed to handle this kind of situation. Simply store a list of hash values for each position as you go along. You could also use a Bloom Filter.
It does not matter so much if you get some false positives if you keep track of the actual board configurations as well, so if you do get a collision, you can then quickly check if you have come across the current position before; this should not happen very often if you use sufficiently large hash values.
As Oliver mentioned in his answer, you should use something like Zobrist hashing; each position then has an (almost) unique number. Zobrist hashing is tricky to implement, so you need to make sure your implementation is bug-free, which is easier said than done. You then store this value in a list or similar that you loop over backwards to find how many times the same position has occurred.
To make the lookup faster, you only need to check every other entry, since you are looking for positions where the same color is to move. You can also break out of the loop immediately if you find a pawn move or a capture, since after such moves earlier positions can never be repeated again.
You can look at Chess Programming Wiki - Repetitions for more inspiration, especially the heading "List of Keys".
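A minimal Python sketch of the scheme both answers describe (the board representation and piece indexing are assumptions):

```python
import random

random.seed(42)  # fixed seed so the keys are reproducible across runs

# One random 64-bit key per (piece type, square): 12 piece types, 64 squares,
# plus one key for the side to move.
ZOBRIST = [[random.getrandbits(64) for _ in range(64)] for _ in range(12)]
SIDE_TO_MOVE = random.getrandbits(64)

def position_hash(board, white_to_move):
    """board: dict mapping square index (0..63) -> piece index (0..11)."""
    h = SIDE_TO_MOVE if white_to_move else 0
    for square, piece in board.items():
        h ^= ZOBRIST[piece][square]
    return h

def is_threefold(history, current_hash):
    """history: hashes of positions since the last capture or pawn move,
    oldest first, excluding the current position. Stepping backwards two
    at a time skips positions where the other side is to move."""
    count = 1
    for h in history[-2::-2]:
        if h == current_hash:
            count += 1
            if count == 3:
                return True
    return False
```

In a real engine the hash is maintained incrementally: when a move is made or unmade you XOR out the piece's old (piece, square) key and XOR in the new one, rather than rehashing the whole board.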

Tree search based Game AI: How to avoid AI 'wandering'/'procrastination' with sparse rewards?

My game AI makes use of an algorithm that searches all possible future states based on the moves I can make (minimax / Monte Carlo-esque). It evaluates these states using a scoring system, picks the highest-scored final state, and follows it.
This works well in most situations, but awfully when rewards are sparse. For example: there's a desirable collectable object that's 3 tiles to the right of me. The natural solution would be to go right->right->right.
But my algorithm searches 6 turns deep, and it will find many paths that eventually collect the object, including ones that take longer than 3 turns. It might, for example, find a path that's: up->right->down->right->right->down, collecting the object on turn 5 instead.
Since in both cases the final leaf nodes detect the object as collected, it doesn't naturally prefer one or the other. So, instead of going right on turn 1, it might go up, or down, or left. This behavior is repeated exactly on the next turn, so that it basically ends up dancing randomly in front of the collectable object; only luck will make it step on it.
That's clearly suboptimal and I want to fix it, but have run out of ideas how to handle this appropriately. Are there any solutions for this issue or is there any theoretical work that deals with handling this issue?
Solutions I've tried:
Make it value object collection more on earlier turns. While this works, to beat evaluator 'noise', the difference between turns must be quite high. Turn 1 must be rated higher than 2, turn 2 rated higher than 3, etc. The difference between turn 1 and 6 needs to be so high that it ends up making the behavior extremely greedy, which is not desirable in most situations. In an environment with multiple objects, it might end up choosing the path that grabs an object on turn 1, instead of the much better path that can grab objects on turn 5 and 6.
Assign the object as a target and value distance to it. If not done on a turn-by-turn basis, the original problem persists. If done on a turn-by-turn basis, the difference in importance required per turn once again makes it too greedy. This method also decreases flexibility and causes other issues. Target selection is not trivial and kind of ruins the point of a minimax-style algorithm.
Going much deeper in my searches so that it can always find a second object. This would cost so much computing power that I'd have to make concessions, like pruning paths much more aggressively. If I do so, I'll be back at the same problem since I won't know how to get it to prefer pruning the 5 turn version over the 3 turn version.
Give extra value to the plans laid out last turn. If it can at least follow the suboptimal path, there wouldn't be as much of an issue. Unfortunately, this once again has to be a pretty strong effect for it to work reliably, making it follow sub-optimal paths in all scenarios, hurting overall performance.
When weighting the outcome of the last step of your move, are you factoring in the number of moves needed to pick up an object?
I presume you are scoring each step of your move actions, giving a +1 if the step results in picking up an object. This means that in 3 steps I can pick up the object in your example above and get a +1 state of the play field, but I can also do this in 4, 5, 6, or x steps, getting the same +1 state. If only a single object is reachable within the depth you are searching, your algorithm will likely select one of the +1 states at random, giving you the above behaviour.
This could be solved by scoring each of the moves the AI must make with a negative value. Thus, getting the object in 3 moves will result in a net score of -2, but getting the object in 6 moves will result in -5. This way, the AI will clearly know that it is preferable to get the object in the least number of moves, i.e. 3.
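A tiny sketch of that scoring (the names and the path representation are illustrative, not from the post):

```python
def evaluate_path(path, pickup_reward=1.0, step_cost=1.0):
    """Score a candidate move sequence from the tree search.

    path: list of (move, picked_up) pairs.
    With a +1 pickup reward and a -1 cost per step, collecting the
    object in 3 moves scores 1 - 3 = -2, while collecting it in
    6 moves scores 1 - 6 = -5, so the shorter route always wins.
    """
    score = 0.0
    for move, picked_up in path:
        score -= step_cost
        if picked_up:
            score += pickup_reward
    return score
```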

Evenly distribute scent in a collaborative diffusion matrix

I am trying to implement collaborative diffusion behaviour for the first time and I am stuck with a problem. I understand how to make obstacles not diffuse scents, and how to dampen scent for other friendly agents if one of them already pursues it. What I cannot understand is how to make scents distribute evenly in the matrix. It seems to me that every way of iterating over the matrix makes the scent distribute faster and better in the tiles I check later in the iteration. I mean, if I iterate from i to maxRows and j to maxCols and apply the diffusion equation to every tile, then on the 'north' and 'west' side of the goal I will have only one tile with the correct potential, whereas on the 'east' and 'south' side I will have more of them, since their neighbours already have an assigned potential. How can I make the values distribute evenly? A double iteration from both extremities of the matrix and then combining the results seems like a memory-eater, as does a goal-oriented approach: if I try to start from the goals and work outwards, I will have to execute the calculations for every goal and every tile with an assigned potential, which means 4^(turns since the diffusion started)*nrOfGoals more calculations every turn, which seems inefficient in a large matrix with a lot of goals.
My question is how can I evenly distribute the values in the matrix in an efficient way. I'm using the AiChallenge Ants, if that helps in any way!
I thank you in anticipation and I'm sorry for the grammar mistakes I've made in this post.
There may be a better solution, but the easiest way to do it is to use something similar to how a simple implementation of the game of life is done.
You have two buffers. One has the current "generation" of scent (and, if you are multitasking, can be locked so only readers can look at it), and the other has the next generation of scent being calculated. You only "mix" scents from the current generation.
Once you are done, you swap the two buffers by simply changing the pointers / references.
Another way to think about it would be to have all the tiles calculate their new scent by asking their neighbors and averaging. When asked by their neighbors what their scent level is, they report their pre-calculated values from the previous pass. The new scent is only locked in once everyone has finished calculating.
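A minimal double-buffered diffusion pass in Python (the grid layout, obstacle handling, and diffusion rate are assumptions):

```python
def diffuse(current, obstacles, rate=0.25):
    """Compute the next scent generation from the current one.

    Every cell reads only from `current` and writes into a fresh
    buffer, so the iteration order cannot bias how scent spreads.
    """
    rows, cols = len(current), len(current[0])
    nxt = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if (i, j) in obstacles:
                continue  # obstacles neither hold nor diffuse scent
            neighbours = [current[x][y]
                          for x, y in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                          if 0 <= x < rows and 0 <= y < cols and (x, y) not in obstacles]
            if neighbours:
                # Relax each cell toward the average of its neighbours.
                nxt[i][j] = current[i][j] + rate * (
                    sum(neighbours) / len(neighbours) - current[i][j])
    return nxt

# The caller performs the buffer swap simply by rebinding:
# scent = diffuse(scent, obstacles)
```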

Implementing a basic predator-prey simulation

I am trying to implement a predator-prey simulation, but I am running into a problem.
A predator searches for nearby prey and eats it. If there is no nearby prey, it moves to a random vacant cell.
Basically the part I am having trouble with is when I advanced a "generation."
Say I have a grid that is 3x3, with each cell numbered from 0 to 8.
If I have 2 predators in cells 0 and 1, predator 0 is checked first and moves to either cell 3 or 4.
If, for example, it goes to cell 3, then predator 1 is checked. This may seem correct,
but it kind of "gives priority" to the organisms with lower index values. I've tried using 2 arrays, but that doesn't seem to work either, as it would check places where organisms appear to be but aren't.
Anyone have an idea of how to do this "fairly" and "correctly"?
I recently did a similar task in Java. Processing the predators from the top row to the bottom not only gives an "unfair advantage" to lower indices but also creates patterns in the movement of both the prey and the predators.
I overcame this problem by choosing both rows and columns in a random order. This way, every predator/prey has the same chance of being processed at the early stages of a generation.
A way to randomize would be to create a linked list of (row, column) pairs, then shuffle the linked list. At each generation, choose a random index to start from and keep processing.
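A small sketch of that idea in Python (a plain list stands in for the linked list; `update_cell` is a hypothetical per-organism rule):

```python
import random

def process_generation(grid, update_cell):
    """Visit every cell exactly once per generation in a fresh random
    order, so no row/column index gets a systematic head start."""
    rows, cols = len(grid), len(grid[0])
    order = [(r, c) for r in range(rows) for c in range(cols)]
    random.shuffle(order)
    for r, c in order:
        if grid[r][c] is not None:      # an organism occupies this cell
            update_cell(grid, r, c)     # assumed movement/eating rule
```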
More as a comment than anything else: if your prey are so dense that this is a common problem, I suspect you won't have a "population" that lives long. Also, as a comment: update your predators randomly. That is, instead of stepping through your array of locations, take your list of predators, randomize it, and then update them one by one. I think this is necessary, but I don't know if it is sufficient.
This problem is solved with a technique called double buffering, which is also used in computer graphics (in order to prevent the image currently being drawn from disturbing the image currently being displayed on the screen). Use two arrays. The first one holds the current state, and you make all decisions about movement based on the first array, but you perform the movement in the other array. Then, you swap their roles.
Edit: Looks like I didn't read your question thoroughly enough. Double buffering and randomization might both be needed, depending on how complex your rules are (but if there are no rules other than the ones you've described, randomization should suffice). They solve two distinct problems, though:
Double buffering solves the problem of correctness when you have rules where decisions about what will happen to a creature in a cell depends on the contents of neighbouring cells, and the decisions about neighbouring cells also depend on this cell. If you e.g. have a rule that says that if two predators are adjacent, they will both move away from each other, you need double buffering. Otherwise, after you've moved the first predator, the second one won't see any adjacent predator and will remain in place.
Randomization solves the problem of fairness when there are limited resources, such as when a prey only can be eaten by one predator (which seems to be the problem that concerned you).
How about some sort of round-robin method? Put your predators in a circular linked list and keep a pointer to the node that's currently "first". Then advance that pointer to the next place in the list each generation. You can insert new predators at either the front or the back of your circular list with ease.

How do you solve the 15-puzzle with A-Star or Dijkstra's Algorithm?

I've read in one of my AI books that popular algorithms (A-Star, Dijkstra) for path-finding in simulation or games is also used to solve the well-known "15-puzzle".
Can anyone give me some pointers on how I would reduce the 15-puzzle to a graph of nodes and edges so that I could apply one of these algorithms?
If I were to treat each node in the graph as a game state then wouldn't that tree become quite large? Or is that just the way to do it?
A good heuristic for A-Star with the 15-puzzle is the number of squares that are in the wrong location. Because you need at least 1 move per square that is out of place, the number of squares out of place is guaranteed to be less than or equal to the number of moves required to solve the puzzle, making it an admissible heuristic for A-Star.
A quick Google search turns up a couple papers that cover this in some detail: one on Parallel Combinatorial Search, and one on External-Memory Graph Search
General rule of thumb when it comes to algorithmic problems: someone has likely done it before you, and published their findings.
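A compact A* sketch in Python using the misplaced-tiles heuristic described above (the tuple state encoding is an assumption; set SIDE to 4 for the 15-puzzle):

```python
import heapq

SIDE = 3                                      # 3 for the 8-puzzle, 4 for the 15-puzzle
GOAL = tuple(range(1, SIDE * SIDE)) + (0,)    # 0 is the blank square

def misplaced(state):
    """Admissible heuristic: count of tiles not on their goal square."""
    return sum(1 for i, v in enumerate(state) if v != 0 and v != GOAL[i])

def neighbours(state):
    """All states reachable by sliding one tile into the blank."""
    i = state.index(0)
    r, c = divmod(i, SIDE)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < SIDE and 0 <= nc < SIDE:
            j = nr * SIDE + nc
            s = list(state)
            s[i], s[j] = s[j], s[i]
            yield tuple(s)

def astar(start):
    frontier = [(misplaced(start), 0, start)]
    best_g = {start: 0}                       # doubles as the visited set
    while frontier:
        f, g, state = heapq.heappop(frontier)
        if state == GOAL:
            return g                          # minimum number of moves
        if g > best_g.get(state, g):
            continue                          # stale queue entry
        for nxt in neighbours(state):
            if g + 1 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g + 1
                heapq.heappush(frontier, (g + 1 + misplaced(nxt), g + 1, nxt))
    return None                               # unsolvable configuration

print(astar((1, 2, 3, 4, 5, 6, 0, 7, 8)))     # prints 2
```

Each board configuration is a node of the implicit graph, exactly as the question asks; the dictionary of best-known costs serves as the visited-vertex marking discussed below.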
This assignment on the 8-puzzle problem covers the A* algorithm in some detail, but is also fairly straightforward:
http://www.cs.princeton.edu/courses/archive/spring09/cos226/assignments/8puzzle.html
The graph-theoretic way to solve the problem is to imagine every configuration of the board as a vertex of the graph, and then use a breadth-first search, with pruning based on something like the Manhattan distance of the board, to derive a shortest path from the starting configuration to the solution.
One problem with this approach is that for any n x n board where n > 3, the game space becomes so large that it is not clear how you can efficiently mark the visited vertices. In other words, there is no obvious way to assess whether the current configuration of the board is identical to one previously discovered through traversing some other path. Another problem is that the graph size grows so quickly with n (it's approximately (n^2)!) that it is just not suitable for a brute-force attack, as the number of paths becomes computationally infeasible to traverse.
This paper by Ian Parberry, A Real-Time Algorithm for the (n^2 − 1)-Puzzle, describes a simple greedy algorithm that iteratively arrives at a solution by completing the first row, then the first column, then the second row... It arrives at a solution almost immediately; however, the solution is far from optimal. Essentially, it solves the problem the way a human would, without leveraging any computational muscle.
This problem is closely related to that of solving the Rubik's cube. The graph of all game states is too large to solve by brute force, but there is a fairly simple 7-step method that can be used by a dexterous human to solve any cube in about 1-2 minutes. This path is of course non-optimal. By learning to recognise patterns that define sequences of moves, the speed can be brought down to 17 seconds. However, this feat by Jiri is somewhat superhuman!
The method Parberry describes moves only one tile at a time; one imagines that the algorithm could be sped up by employing Jiri's dexterity and moving multiple tiles at a time. This would not, as Parberry proves, reduce the path length below n^3, but it would reduce the coefficient of the leading term.
Remember that A* will search through the problem space proceeding down the most likely path to the goal as defined by your heuristic.
Only in the worst case will it end up having to flood-fill the entire problem space; this tends to happen when there is no actual solution to your problem.
Just use the game tree. Remember that a tree is a special form of graph.
In your case, the children of each node are the game positions reached by making one of the moves available at the current node.
Here you go: http://www.heyes-jones.com/astar.html
Also, be mindful that with the A-Star algorithm, at least, you will need to figure out an admissible heuristic to determine whether a possible next step is closer to the finished route than another step.
From my experience with solving an 8-puzzle: you need to create nodes and keep track of each step taken, compute the Manhattan distance for each possible following step, take the one with the shortest distance, update the nodes, and continue until the goal is reached.
