NetLogo: 2048 bot optimisation

I am trying to make a NetLogo simulation of the 2048 game. I have implemented three heuristic functions determined by weight parameters and want to use BehaviorSpace to run simulations and find out which strategy is best for winning the game.
The search procedure uses the export-world/import-world primitives to search over the possible moves and chooses the move for which the heuristic function has the highest value.
The problem is that this procedure is very slow, because import-world is called four times each turn. Do you have any ideas on how to implement this without exporting and importing the world so often?
This is a project for my Introduction to AI class. It is due in a couple of days and I can't seem to find any solutions.
The relevant part of the code is below. The move-(direction) procedures all work properly, and the variable moveable? is true if a square can move in the given direction and false otherwise; it is checked in the moveable-check procedure, which is called by move-(direction).
I would very much appreciate your help. :)
to search
  let x 0
  let direction "down"
  export-world "state.csv"
  move-up
  ifelse not any? squares with [moveable?]
    [ set h-value -5000 ]
    [ set x h-value
      set direction "up"
      import-world "state.csv" ]
  export-world "state.csv"
  move-down
  ifelse not any? squares with [moveable?]
    [ set h-value -5000 ]
    [ if h-value > x
        [ set x h-value
          set direction "down" ]
      import-world "state.csv" ]
  export-world "state.csv"
  move-left
  ifelse not any? squares with [moveable?]
    [ set h-value -5000 ]
    [ if h-value > x
        [ set x h-value
          set direction "left" ]
      import-world "state.csv" ]
  export-world "state.csv"
  move-right
  ifelse not any? squares with [moveable?]
    [ set h-value -5000 ]
    [ if h-value > x
        [ set x h-value
          set direction "right" ]
      import-world "state.csv" ]
  ifelse direction = "up"
    [ move-up
      print "up" ]
    [ ifelse direction = "down"
        [ move-down
          print "down" ]
        [ ifelse direction = "right"
            [ move-right
              print "right" ]
            [ move-left
              print "left" ] ] ]
  if not any? squares with [moveable?] [
    ask squares [ set heading heading + 90 ]
    moveable-check
    if not any? squares with [moveable?] [
      ask squares [ set heading heading + 90 ]
      moveable-check
      if not any? squares with [moveable?] [
        ask squares [ set heading heading + 90 ]
        moveable-check
        if not any? squares with [moveable?]
          [ stop ]
      ]
    ]
  ]
end

The most important, and difficult, information you need to be able to save and restore is the squares. This is pretty easy to do without import-world and export-world (note that the following uses NetLogo 6 syntax; if you're still on NetLogo 5, you'll need to use the old task syntax in the foreach):
to-report serialize-state
  report [ (list xcor ycor value) ] of squares
end

to restore-state [ state ]
  ask squares [ die ]  ;; remove the old squares (clear-squares is not a primitive)
  foreach state [ sq ->
    create-squares 1 [
      setxy (item 0 sq) (item 1 sq)
      set heading 0 ;; or whatever
      set value item 2 sq
    ]
  ]
end
value above just shows how to store arbitrary variables of your squares; I'm not sure what data you have associated with them or need to restore. The idea behind this code is that you store the information about the squares in a list of lists, where each inner list contains the data for one square. You then use it like this:
let state serialize-state
;; make the moves you want to investigate
restore-state state
You may need to store some globals and such as well. Those can be stored in local variables or in the state list (which is more general, but more difficult to implement).
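For example, search could then be restructured along these lines (an untested sketch: it assumes your move-up/move-down/move-left/move-right procedures and h-value behave exactly as in your code, and it uses run to dispatch on the direction string):
to search
  let state serialize-state
  let best-value -5000
  let best-direction "down"
  foreach ["up" "down" "left" "right"] [ dir ->
    run (word "move-" dir)
    ;; a move counts only if some square could actually move
    if any? squares with [moveable?] [
      if h-value > best-value [
        set best-value h-value
        set best-direction dir
      ]
    ]
    restore-state state  ;; undo the trial move
  ]
  run (word "move-" best-direction)
  print best-direction
end
This replaces the per-turn export-world/import-world file round-trips with cheap in-memory copies, which should remove almost all of the cost.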
A few other ideas:
Right now it looks like you're only looking one move ahead, and at only one possible position for the new square that's going to be placed (make sure you're not cheating by knowing exactly where the new square is going to appear). Eventually, you may want to do arbitrary look-ahead using a kind of tree search. This tree gets really big really fast; if you go that route, you'll want to use pruning strategies such as alpha-beta pruning: https://en.wikipedia.org/wiki/Alpha%E2%80%93beta_pruning . That also makes the state restoration more involved, but still doable: you'll be storing a stack of states rather than a single state, as sketched below.
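To give the flavor, a minimal depth-limited look-ahead might look like the following sketch (it reuses the serialize-state/restore-state reporters above; the recursion itself plays the role of the stack of states, and note it still ignores where the random new square appears):
to-report lookahead-value [depth]
  if depth = 0 [ report h-value ]
  let state serialize-state  ;; push the current state
  let best -5000
  foreach ["up" "down" "left" "right"] [ dir ->
    run (word "move-" dir)
    if any? squares with [moveable?] [
      let v lookahead-value (depth - 1)
      if v > best [ set best v ]
    ]
    restore-state state  ;; pop back before trying the next direction
  ]
  report best
end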
Instead of set heading heading + 90 you can just do right 90 or rt 90.

Related

What am I doing wrong with this AI?

I am creating a very naive AI (it maybe shouldn't even be called an AI, as it just tries out a lot of possibilities and picks the best one) for a board game I am making. This is to reduce the amount of manual testing I will need to do to balance the game.
The AI plays alone, doing the following: on each turn the AI, playing one of the heroes, attacks one of the (at most 9) monsters on the battlefield. Its goal is to finish the battle as fast as possible (in the fewest turns) and with the fewest monster activations.
To achieve this, I've implemented a think-ahead algorithm for the AI: instead of performing the best possible move at the moment, it selects a move based on the possible outcome of future moves of the other heroes. This is the code snippet where it does this, written in PHP:
/** Perform think-ahead moves
 *
 * @param int         $thinkAheadLeft      the number of think-ahead moves left
 * @param int         $innerIterator       the iterator for the move
 * @param array       $performedMoves      the moves performed so far
 * @param Battlefield $originalBattlefield the state of the battlefield before the first move
 * @param string      $tabs                indentation prefix for debug output
 */
public function performThinkAheadMoves($thinkAheadLeft, $innerIterator, $performedMoves, $originalBattlefield, $tabs) {
    if ($thinkAheadLeft == 0) return $this->quantify($originalBattlefield);
    $nextThinkAhead = $thinkAheadLeft - 1;
    $moves = $this->getPossibleHeroMoves($innerIterator, $performedMoves);
    $Hero = $this->getHero($innerIterator);
    $innerIterator++;
    $nextInnerIterator = $innerIterator;
    foreach ($moves as $moveid => $move) {
        $performedUpFar = $performedMoves;
        $performedUpFar[] = $move;
        $attack = $Hero->getAttack($move['attackid']);
        $monsters = array();
        foreach ($move['targets'] as $monsterid) $monsters[] = $originalBattlefield->getMonster($monsterid)->getName();
        if (self::$debug) echo $tabs . "Testing sub move of " . $Hero->Name . ": $moveid of " . count($moves) . " (Think Ahead: $thinkAheadLeft | InnerIterator: $innerIterator)\n";
        $moves[$moveid]['battlefield']['after']->performMove($move);
        if (!$moves[$moveid]['battlefield']['after']->isBattleFinished()) {
            if ($innerIterator == count($this->Heroes)) {
                $moves[$moveid]['battlefield']['after']->performCleanup();
                $nextInnerIterator = 0;
            }
            $moves[$moveid]['quantify'] = $moves[$moveid]['battlefield']['after']->performThinkAheadMoves($nextThinkAhead, $nextInnerIterator, $performedUpFar, $originalBattlefield, $tabs . "\t", $numberOfCombinations);
        } else {
            $moves[$moveid]['quantify'] = $moves[$moveid]['battlefield']['after']->quantify($originalBattlefield);
        }
    }
    usort($moves, function($a, $b) {
        if ($a['quantify'] === $b['quantify']) return 0;
        else return ($a['quantify'] > $b['quantify']) ? -1 : 1;
    });
    return $moves[0]['quantify'];
}
What this does is recursively check future moves until the $thinkAheadLeft value is exhausted, OR until a solution is found (i.e. all monsters are defeated). When it reaches its exit condition, it scores the state of the battlefield against $originalBattlefield (the battlefield state before the first move). The calculation is made in the following way:
/** Quantify the current state of the battlefield
 *
 * @param Battlefield $originalBattlefield the original battlefield
 *
 * @return int the battlefield quantification
 */
public function quantify(Battlefield $originalBattlefield) {
    $points = 0;
    foreach ($originalBattlefield->Monsters as $originalMonsterId => $OriginalMonster) {
        $CurrentMonster = $this->getMonster($originalMonsterId);
        $monsterActivated = $CurrentMonster->getActivations() - $OriginalMonster->getActivations();
        $points += $monsterActivated * ($this->quantifications['activations'] + $this->quantifications['activationsPenalty']);
        if ($CurrentMonster->isDead()) {
            $points += $this->quantifications['monsterKilled'] * $CurrentMonster->Priority;
        } else {
            $enragePenalty = floor($this->quantifications['activations'] * (($CurrentMonster->Enrage['max'] - $CurrentMonster->Enrage['left']) / $CurrentMonster->Enrage['max']));
            $points += ($OriginalMonster->Health['left'] - $CurrentMonster->Health['left']) * $this->quantifications['health'];
            $points += ($CurrentMonster->Enrage['max'] - $CurrentMonster->Enrage['left']) * $enragePenalty;
        }
    }
    return $points;
}
When quantifying, some things net positive points and some net negative points for the state. Instead of using the points calculated after its current move to decide which move to take, the AI uses the points calculated after the think-ahead portion, selecting a move based on the possible moves of the other heroes.
Basically, the AI is saying: attacking Monster 1 isn't the best option at the moment, but IF the other heroes do this-and-this, it will be the best outcome in the long run.
After selecting a move, the AI performs that single move with the hero and then repeats the process for the next hero, calculating with +1 moves.
ISSUE: My presumption was that an AI that thinks ahead 3-4 moves should find a better solution than an AI that only performs the best possible move at the moment. But my test cases show differently: in some cases an AI that does not use the think-ahead option, i.e. only plays the best possible move at the moment, beats an AI that thinks ahead a single move. Sometimes the AI that thinks ahead only 3 moves beats an AI that thinks ahead 4 or 5 moves. Why is this happening? Is my presumption incorrect? If so, why? Am I using wrong numbers for the weights? I investigated this and ran a test to automatically calculate the weights, testing an interval of possible weights and keeping the best outcome (i.e. the ones that yield the fewest turns and/or the fewest activations), yet the problem described above persists with those weights as well.
I am limited to a 5-move think-ahead with the current version of my script, as with any larger number the script gets REALLY slow (with a 5-move think-ahead it finds a solution in roughly 4 minutes, but with 6 it didn't even finish evaluating the first possible move in 6 hours).
HOW THE FIGHT WORKS: A number of heroes (2-4) controlled by the AI, each having a number of different attacks (1-x) that can be used once or multiple times per combat, attack a number of monsters (1-9). Based on the values of the attacks, the monsters lose health until they die. After each attack the attacked monster gets enraged if it didn't die, and after every hero has performed a move, all monsters get enraged. When monsters reach their enrage limit, they activate.
DISCLAIMER: I know that PHP is not the language for this kind of operation, but as this is only an in-house project, I preferred to sacrifice speed in order to code it as fast as possible in my native programming language.
UPDATE: The quantifications that we currently use look something like this:
$Battlefield->setQuantification(array(
    'health'             => 16,
    'monsterKilled'      => 86,
    'activations'        => -46,
    'activationsPenalty' => -10
));
If there is randomness in your game, then anything can happen; I point that out only because it's not clear from the materials you have posted here.
If there is no randomness and the actors can see the full state of the game, then a longer look-ahead absolutely should perform better. When it does not, it is a clear indication that your evaluation function is providing incorrect estimates of the value of a state.
Looking at your code, the values of your quantifications are not listed, and in your simulation it looks like you just have the same player make moves repeatedly, without considering the possible actions of the other actors. You need to run a full simulation, step by step, in order to produce accurate future states, and you need to look at the value estimates of the varying states to see whether you agree with them, adjusting your quantifications accordingly.
An alternative way to frame the problem of estimating value is to explicitly predict your chance of winning the round, as a percentage on a scale of 0.0 to 1.0, and then choose the move that gives you the highest chance of winning. Calculating the damage done and the number of monsters killed so far doesn't tell you much about how much you have left to do in order to win the game; something like the sketch below gets at that directly.
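Purely illustrative: estimateWinChance, $avgDamagePerTurn and the 20-turn horizon below are made-up names and numbers, not from the posted code.
// Sketch: score how much work is LEFT rather than how much was done,
// then squash it to a rough 0.0-1.0 chance of winning.
public function estimateWinChance($avgDamagePerTurn) {
    $healthLeft = 0;
    foreach ($this->Monsters as $Monster) {
        if (!$Monster->isDead()) $healthLeft += $Monster->Health['left'];
    }
    $turnsLeft = $healthLeft / max($avgDamagePerTurn, 1);
    return max(0.0, min(1.0, 1.0 - $turnsLeft / 20.0)); // 20 = arbitrary horizon
}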

CNN with RGB input and BW binary output

I am a beginner in deep learning, and I am working with Keras on top of TensorFlow. I am trying to use RGB images at (540 x 360) resolution to predict bounding boxes.
My labels are binary (black/white) two-dimensional NumPy arrays of dimensions (540, 360), where all pixels are 0 except the box edges, which are 1.
Like this:
[[0 0 0 0 0 0 ... 0]
[0 1 1 1 1 0 ... 0]
[0 1 0 0 1 0 ... 0]
[0 1 0 0 1 0 ... 0]
[0 1 1 1 1 0 ... 0]
[0 0 0 0 0 0 ... 0]]
There can be more than one bounding box in every picture.
So, my input has dimensions (None, 540, 360, 3) and my output has dimensions (None, 540, 360), though by adding an inner axis I can reshape the output to (None, 540, 360, 1).
How would I define a CNN model with these inputs and outputs?
You have to differentiate between object detection and object segmentation. While both can be used for similar problems, the underlying CNN architectures look very different.
Object detection models use a CNN classification/regression architecture, where the output refers to the coordinates of the bounding boxes. It's common practice to use 4 values per box: vertical center, horizontal center, width and height. Search for Faster R-CNN, SSD or YOLO to find popular object detection models for Keras. In your case you would need to define a function that converts the current labels to the 4 coordinates I mentioned (see the sketch below).
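A sketch of that conversion for a mask containing a single box (masks with several boxes would first need connected-component labelling, e.g. scipy.ndimage.label):
import numpy as np

def edge_mask_to_box(mask):
    """Convert a (540, 360) binary edge mask holding one box
    into (center_y, center_x, height, width)."""
    ys, xs = np.nonzero(mask)
    top, bottom = ys.min(), ys.max()
    left, right = xs.min(), xs.max()
    return ((top + bottom) / 2.0,  # vertical center
            (left + right) / 2.0,  # horizontal center
            bottom - top + 1,      # height
            right - left + 1)      # width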
Object segmentation models commonly use an architecture referred to as an encoder-decoder network, where the original image is scaled down and compressed in the first half and then brought back to its original resolution to predict a full image. Search for SegNet, U-Net or Tiramisu to find popular object segmentation models for Keras. My own implementation of U-Net can be found here. In your case you would need to define a custom function that fills all the 0s inside your bounding boxes with 1s (see the sketch below). Understand that this solution will not predict bounding boxes as such, but segmentation maps showing regions of interest.
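That fill function can be very short if the outlines are closed and the boxes don't overlap, for example (a sketch using SciPy):
import numpy as np
from scipy import ndimage

def fill_boxes(edge_mask):
    # Turn hollow box outlines into solid segmentation targets.
    return ndimage.binary_fill_holes(edge_mask).astype(np.uint8)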
What is right for you depends on what precisely you want to achieve. For actual bounding boxes you want to perform object detection. However, if you're interested in highlighting regions of interest that go beyond rectangular windows, segmentation may be a better fit. In theory, you can use your rectangle labels for segmentation, where the network will learn to create better masks than the inaccurate ground truth, provided you have enough data.
This is a simple example of how to write intermediate layers to achieve the output. You can use it as starter code.
from keras.layers import (Input, Conv2D, BatchNormalization, Activation,
                          MaxPooling2D, UpSampling2D, concatenate)
from keras.models import Model
from keras.optimizers import RMSprop

def model_360x540(input_shape=(360, 540, 3), num_classes=1):
    inputs = Input(shape=input_shape)
    # 360x540x3
    downblock0 = Conv2D(32, (3, 3), padding='same')(inputs)
    # 360x540x32
    downblock0 = BatchNormalization()(downblock0)
    downblock0 = Activation('relu')(downblock0)
    downblock0_pool = MaxPooling2D((2, 2), strides=(2, 2))(downblock0)
    # 180x270x32
    centerblock0 = Conv2D(1024, (3, 3), padding='same')(downblock0_pool)
    # 180x270x1024
    centerblock0 = BatchNormalization()(centerblock0)
    centerblock0 = Activation('relu')(centerblock0)
    upblock0 = UpSampling2D((2, 2))(centerblock0)
    # 360x540x1024
    upblock0 = concatenate([downblock0, upblock0], axis=3)
    upblock0 = Activation('relu')(upblock0)
    upblock0 = Conv2D(32, (3, 3), padding='same')(upblock0)
    # 360x540x32
    upblock0 = BatchNormalization()(upblock0)
    upblock0 = Activation('relu')(upblock0)
    classify = Conv2D(num_classes, (1, 1), activation='sigmoid')(upblock0)
    # 360x540x1
    model = Model(inputs=inputs, outputs=classify)
    # bce_dice_loss and dice_coeff are custom and assumed to be defined elsewhere
    model.compile(optimizer=RMSprop(lr=0.001), loss=bce_dice_loss, metrics=[dice_coeff])
    return model
The downblock is the block of layers that performs downsampling (MaxPooling2D).
The centerblock has no resampling layer.
The upblock is the block of layers that performs upsampling (UpSampling2D).
So here you can see how (360, 540, 3) is transformed into (360, 540, 1).
Basically, you can add such blocks of layers to create your model.
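As a quick sanity check of the shapes (assuming the custom bce_dice_loss and dice_coeff used in compile are defined elsewhere):
model = model_360x540()
model.summary()  # the last layer should report an output shape of (None, 360, 540, 1)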
Also check out Holistically-Nested Edge Detection, which may help you further with the edge-detection task.
Hope this helps!
I have not worked with Keras, but I will describe a solution approach in a more generalized way that can be used with any framework.
Here is the full procedure:
Data preparation: I know your labels are the box edges, which will also work, but I recommend that instead of edges you prepare the dataset by marking the complete box, as in the sample (I marked two boxes). Your dataset then has three classes (box, box edge, and background). Create two lists: images and labels.
Get a pre-trained model (ResNet-50 recommended), solver and train prototxt from here; remove the fc1000 layer and add deconvolution/upsampling layers to match your input size. Use padding in the first layer to make it square, and crop in the deconvolution layer to match the input and output dimensions.
Transfer the weights from the previously trained (original) network and train your network.
Test on your dataset and create bounding boxes from the detected blobs.

How to refresh a patch variable after tick?

I have a patch variable X which I want to be recomputed after each tick. Basically, at each tick I want to highlight only those patches whose X value is greater than the threshold I have set.
This is what I have coded:
ask patches with [votes-with-benefit > 0] [
  ifelse (b-c <= threshold)
    [ set votes-with-benefit 0
      set pcolor red ]
    [ set votes-with-benefit votes
      set pcolor scale-color white vote-share 0 max-voteshare ]
]
The problem is that after the first tick, even patches whose value is greater than the threshold still appear red instead of reverting back to white.
Thanks in advance, I appreciate the help.
It sounds like you need to switch the filter and the condition:
ask patches with [b-c <= threshold] [
  ifelse (votes-with-benefit > 0)
    [ set votes-with-benefit 0
      set pcolor red ]
    [ set votes-with-benefit votes
      set pcolor scale-color white vote-share 0 max-voteshare ]
]
In any case, as you have it now, you will never reset the patches whose votes-with-benefit you set to zero, because you filter them out.
I figured it out, sorry for the spam. The patch variable was not updating after each tick because of two conflicting conditions I had coded one after the other, using two different ask patches commands. Once I combined them into one statement it started working. Thanks!

How to make turtles face each other, wait 3 ticks and then keep wandering?

I am new to both NetLogo and Stack Overflow, but your other posts have already helped me a lot.
I am currently trying to program a model where agents randomly wander a space and stop whenever they meet. "Meeting" here means passing each other in-radius 2. They should face each other, wait for 2 ticks and then keep moving until they find the next agent.
I tried to use Nzhelen's question on a timer, but did not really succeed.
So far, I have managed to have them face each other, but I have trouble putting the tick command in the right place in my code. (EDIT: This got solved by taking out the wait command, thanks to Seth. Also, I don't want all turtles to stop moving, only the ones that are meeting each other.)
One other thing I am aiming for is some kind of visual representation of the meeting, for instance having the patch blink while they are meeting, or a circle that shows up around them when they meet. With the wait command, everything stops again, which I want to prevent.
Below is the code so far.
to go
  tick
  ask turtles [
    wander
    find-neighbourhood
  ]
  ask turtles with [found-neighbour = "yes"]
    [ face-each-other ]
  ask turtles with [found-neighbour = "no" or found-neighbour = "unknown"]
    [ wander ]
end

;-------
; Go commands

to wander
  right random 50
  left random 50
  forward 1
end

to find-neighbourhood
  set neighbourhood other turtles in-radius 2
  if neighbourhood != nobody [ wander ]
  find-nearest-neighbour
end

to find-nearest-neighbour
  set nearest-neighbour one-of neighbourhood with-min [distance myself]
  ifelse nearest-neighbour != nobody
    [ set found-neighbour "yes" ]
    [ set found-neighbour "no" ]
end

to face-each-other ;; neighbour procedure
  face nearest-neighbour
  set found-neighbour "no"
  ask patch-here [ ;; patch procedure
    set pcolor red + 2
    ;wait 0.2
    set pcolor grey + 2
  ]
  if nearest-neighbour != nobody [ wander ]
  rt 180
  jump 2
  ask nearest-neighbour [
    face myself
    rt 180
    jump 2
    set found-neighbour "no"
  ]
end
With the help of a colleague I managed to solve my timer issue. As Seth pointed out, wait was not the right command, and too many to ... end procedures confused my turtles as well. The code now looks like the following and works: the turtles get close to each other, face each other, change their shape to stars, wait three ticks and then jump off in opposite directions.
to setup
  clear-all
  ask turtles [
    set count-down 3
  ]
  reset-ticks
end

;---------

to go
  ask turtles [
    if occupied = "yes" [
      ifelse count-down > 0
      [
        set count-down (count-down - 1)
        set shape "star"
      ][
        set shape "default"
        rt 180
        fd 2
        set occupied "no"
        set count-down 3
      ]
    ]
    if occupied = "no" [
      ; Wandering around, ignoring occupied agents
      set neighbourhood other turtles in-radius 2
      ; If someone 'free' is near, communicate!
      set nearest-neighbour one-of neighbourhood with-min [distance myself]
      ifelse nearest-neighbour != nobody [
        if ([occupied] of nearest-neighbour = "no") [
          face nearest-neighbour
          set occupied "yes"
          ask nearest-neighbour [
            face myself
            set occupied "yes"
          ]
        ]
      ][
        ; No one found, keep on wandering
        wander
      ]
    ]
  ]
  tick
end

;-------
; Go commands

to wander
  right random 50
  left random 50
  forward 1
end
You're right to link to Nzhelen's question. Essentially the answer to your question is that you need to do the same thing. When you tried to do that, you were on the right track. I'd suggest taking another stab at it, and if you get stuck, show us exactly where you got stuck.

How to run Batch Photoshop Script to move layers X amount sequentially?

I have 70 layers in a Photoshop file. I need to move each layer X pixels vertically, one after the other, so they'd look like:
>>Layer 1<<
>>Layer 2<<
>>Layer 3<<
instead of just being stacked on top of each other. I'm not sure how to do this; ideally, I would just specify an amount in pixels to transform each layer up.
A layer seems to only be movable by a delta.
To move by a delta, use MyLayer.translate(DeltaX, DeltaY), where MyLayer is a reference to the ArtLayer you want to move. The units of DeltaX and DeltaY are the same as your ruler setting in Photoshop.
I wrote this little function to move a layer to an absolute position. I hope this will be of some use to you...
//******************************************
// MOVE LAYER TO
// Author: Max Kielland
//
// Moves layer fLayer to the absolute
// position fX,fY. The unit of fX and fY is
// the same as the ruler setting.

function MoveLayerTo(fLayer, fX, fY) {
    var Position = fLayer.bounds; // [left, top, right, bottom]
    Position[0] = fX - Position[0]; // delta needed to reach fX
    Position[1] = fY - Position[1]; // delta needed to reach fY
    fLayer.translate(Position[0], Position[1]);
}
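With that helper, the batch part of the original question (offsetting 70 layers sequentially) could be a simple loop like this (a sketch; the 100-unit step and pixel ruler units are assumptions):
var doc = app.activeDocument;
var step = 100; // vertical offset per layer, in ruler units

for (var i = 0; i < doc.artLayers.length; i++) {
    // stack each layer `step` units below the previous one
    MoveLayerTo(doc.artLayers[i], 0, i * step);
}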
Thanks a lot for this! Because of this tip I managed to complete a script that downloads and places (thousands of) map tiles... couldn't have done it without you ;)
I am new to Photoshop scripting, so I'd like to point out something (now obvious) that may take other newbies a while to get: if you've calculated your fX and fY input through some mathematical means, be careful to explicitly add the unit you are using to your input number, otherwise you'll be placing things all over the place (way off the canvas, in my case).
Like this:
MoveLayerTo(myLayerRef, myX + "px", myY + "px")
Thanks a lot again, and cheers!
