What am I doing wrong with this AI? - artificial-intelligence

I am creating a very naive AI (maybe it shouldn't even be called an AI, as it just tests out a lot of possibilities and picks the best one), for a board game I am making. This is to reduce the number of manual tests I will need to do to balance the game.
The AI plays alone, doing the following: each turn, playing with one of the heroes, it attacks one of the (at most 9) monsters on the battlefield. Its goal is to finish the battle as fast as possible (in the fewest turns) and with the fewest monster activations.
To achieve this, I've implemented a think-ahead algorithm: instead of performing the best possible move at the moment, the AI selects a move based on the possible outcomes of the future moves of the other heroes. This is the code snippet where it does this, written in PHP:
/** Perform think-ahead moves
*
* @param int $thinkAheadLeft (the number of think-ahead moves left)
* @param int $innerIterator (the iterator for the move)
* @param array $performedMoves (the moves performed so far)
* @param Battlefield $originalBattlefield (the state of the Battlefield before the first move)
* @param string $tabs (indentation prefix for the debug output)
*/
public function performThinkAheadMoves($thinkAheadLeft, $innerIterator, $performedMoves, $originalBattlefield, $tabs) {
    if ($thinkAheadLeft == 0) return $this->quantify($originalBattlefield);
    $nextThinkAhead = $thinkAheadLeft - 1;
    $moves = $this->getPossibleHeroMoves($innerIterator, $performedMoves);
    $Hero = $this->getHero($innerIterator);
    $innerIterator++;
    $nextInnerIterator = $innerIterator;
    foreach ($moves as $moveid => $move) {
        $performedSoFar = $performedMoves;
        $performedSoFar[] = $move;
        $attack = $Hero->getAttack($move['attackid']);
        $monsters = array();
        foreach ($move['targets'] as $monsterid) $monsters[] = $originalBattlefield->getMonster($monsterid)->getName();
        if (self::$debug) echo $tabs . "Testing sub move of " . $Hero->Name . ": $moveid of " . count($moves) . " (Think Ahead: $thinkAheadLeft | InnerIterator: $innerIterator)\n";
        $moves[$moveid]['battlefield']['after']->performMove($move);
        if (!$moves[$moveid]['battlefield']['after']->isBattleFinished()) {
            if ($innerIterator == count($this->Heroes)) {
                $moves[$moveid]['battlefield']['after']->performCleanup();
                $nextInnerIterator = 0;
            }
            $moves[$moveid]['quantify'] = $moves[$moveid]['battlefield']['after']->performThinkAheadMoves($nextThinkAhead, $nextInnerIterator, $performedSoFar, $originalBattlefield, $tabs . "\t");
        } else $moves[$moveid]['quantify'] = $moves[$moveid]['battlefield']['after']->quantify($originalBattlefield);
    }
    usort($moves, function($a, $b) {
        if ($a['quantify'] === $b['quantify']) return 0;
        else return ($a['quantify'] > $b['quantify']) ? -1 : 1;
    });
    return $moves[0]['quantify'];
}
What this does is recursively check future moves, until the $thinkAheadLeft depth is exhausted, OR until a solution is found (i.e., all monsters are defeated). When it reaches its exit condition, it quantifies the state of the battlefield compared to the $originalBattlefield (the battlefield state before the first move). The calculation is made in the following way:
/** Quantify the current state of the battlefield
*
* @param Battlefield $originalBattlefield (the original battlefield)
*
* @return int (an integer with the battlefield quantification)
*/
public function quantify(Battlefield $originalBattlefield) {
    $points = 0;
    foreach ($originalBattlefield->Monsters as $originalMonsterId => $OriginalMonster) {
        $CurrentMonster = $this->getMonster($originalMonsterId);
        $monsterActivated = $CurrentMonster->getActivations() - $OriginalMonster->getActivations();
        $points += $monsterActivated * ($this->quantifications['activations'] + $this->quantifications['activationsPenalty']);
        if ($CurrentMonster->isDead()) $points += $this->quantifications['monsterKilled'] * $CurrentMonster->Priority;
        else {
            $enragePenalty = floor($this->quantifications['activations'] * (($CurrentMonster->Enrage['max'] - $CurrentMonster->Enrage['left']) / $CurrentMonster->Enrage['max']));
            $points += ($OriginalMonster->Health['left'] - $CurrentMonster->Health['left']) * $this->quantifications['health'];
            $points += ($CurrentMonster->Enrage['max'] - $CurrentMonster->Enrage['left']) * $enragePenalty;
        }
    }
    return $points;
}
When quantifying, some things net positive points and some net negative points to the state. Instead of using the points calculated after its current move to decide which move to take, the AI uses the points calculated after the think-ahead portion, selecting a move based on the possible moves of the other heroes.
Basically, the AI is saying that attacking Monster 1 isn't the best option at the moment, but IF the other heroes perform this-and-this set of actions, in the long run this will be the best outcome.
After selecting a move, the AI performs that single move with the hero, and then repeats the process for the next hero, calculating with one more move.
ISSUE: My issue is that I was presuming that an AI that 'thinks ahead' 3-4 moves should find a better solution than an AI that only performs the best possible move at the moment. But my test cases show differently: in some cases, an AI that does not use the think-ahead option, i.e. only plays the best possible move at the moment, beats an AI that thinks ahead a single move. Sometimes, the AI that thinks ahead only 3 moves beats an AI that thinks ahead 4 or 5 moves. Why is this happening? Is my presumption incorrect? If so, why is that? Am I using wrong numbers for the weights? I investigated this and ran a test to automatically calculate the weights to use, testing an interval of possible weights and keeping the best outcome (i.e., the ones which yield the fewest turns and/or the fewest activations), yet the problem I've described above still persists with those weights as well.
I am limited to a 5-move think-ahead with the current version of my script, as with any larger think-ahead number the script gets REALLY slow (with a 5-move think-ahead it finds a solution in roughly 4 minutes, but with 6 it didn't even find the first possible move in 6 hours).
HOW THE FIGHT WORKS: The fight works in the following way: a number of heroes (2-4) controlled by the AI, each having a number of different attacks (1-x) which can be used once or multiple times during a combat, attack a number of monsters (1-9). Based on the values of the attack, the monsters lose health until they die. After each attack, the attacked monster gets enraged if it didn't die, and after every hero has performed a move, all monsters get enraged. When monsters reach their enrage limit, they activate.
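To make those mechanics concrete, the enrage cycle could be modeled roughly like this (a C++ sketch with invented field names and numbers; the real rules live in my PHP classes):
#include <vector>

// Hypothetical model of the enrage cycle described above.
struct Monster {
    int health = 10;
    int enrageMax = 3;
    int enrageLeft = 3;   // counts down; at 0 the monster activates
    int activations = 0;
    bool dead() const { return health <= 0; }
    void enrage() {
        if (dead()) return;
        if (--enrageLeft <= 0) { activations++; enrageLeft = enrageMax; }
    }
};

// One hero attack: the target loses health and, if it survives, enrages.
void heroAttack(Monster& target, int damage) {
    target.health -= damage;
    if (!target.dead()) target.enrage();
}

// After every hero has moved, all surviving monsters enrage once more.
void endOfHeroPhase(std::vector<Monster>& monsters) {
    for (Monster& m : monsters) m.enrage();
}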
DISCLAIMER: I know that PHP is not the language to use for this kind of operation, but as this is only an in-house project, I preferred to sacrifice speed in order to code this as fast as possible, in my native programming language.
UPDATE: The quantifications that we currently use look something like this:
$Battlefield->setQuantification(array(
    'health' => 16,
    'monsterKilled' => 86,
    'activations' => -46,
    'activationsPenalty' => -10
));

If there is randomness in your game, then anything can happen. Pointing that out since it's just not clear from the materials you have posted here.
If there is no randomness and the actors can see the full state of the game, then a longer look-ahead absolutely should perform better. When it does not, it is a clear indication that your evaluation function is providing incorrect estimates of the value of a state.
Looking at your code, the values of your quantifications are not listed, and in your simulation it looks like you just have the same player make moves repeatedly without considering the possible actions of the other actors. You need to run a full simulation, step by step, in order to produce accurate future states, and you need to look at the value estimates of the varying states to see if you agree with them, adjusting your quantifications accordingly.
An alternative way to frame the problem of estimating value is to explicitly predict your chances of winning the round, on a scale of 0.0 to 1.0, and then choose the move that gives you the highest chance of winning. Calculating the damage done and the number of monsters killed so far doesn't tell you much about how much you have left to do in order to win the game.
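To illustrate that framing, here is a minimal sketch (in C++ for concreteness; the state fields and the numbers inside winChance are invented for the example, not taken from the code above):
#include <algorithm>

// Hypothetical summary of a battlefield state.
struct State {
    int monstersLeft;
    int totalMonsterHealth;
    int heroTurnsBudget;  // how many turns we can still afford to spend
};

// Toy estimate of the chance of winning, on a 0.0 - 1.0 scale. A real
// version would be tuned against simulated playouts, not guessed.
double winChance(const State& s) {
    if (s.monstersLeft == 0) return 1.0;
    double workLeft = s.totalMonsterHealth + 5.0 * s.monstersLeft;
    double budget = 10.0 * s.heroTurnsBudget;
    return std::clamp(budget / (budget + workLeft), 0.0, 1.0);
}
Move selection then becomes: simulate each candidate move, score the resulting state with winChance, and pick the maximum.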

Related

Counting number of times a movieclip loops

I have a 20 frame bouncing ball movieclip “ballAnim” on frame 1 of the stage, and I simply want to count each time “ballAnim” loops playback.
I am inexperienced with actionscript, and I’ve searched and tried a few things to no avail.
Here’s the code from my latest attempt:
import flash.events.Event;
var loopCounter:int;
ballAnim.addEventListener(Event.ENTER_FRAME, addOneLoop);
function addOneLoop(e:Event){
    if(ballAnim.currentFrame == 20)
    {loopCounter+1}
}
trace(loopCounter);
All I get is a single instance of 0. I’ve searched around and I think the problem is that loopCounter is resetting to 0 on every frame because of ENTER_FRAME? I tried addressing this by adding a 2nd keyframe on my actions timeline layer with stop(); (with the movieclip layer spanning both frames underneath) but it doesn’t help.
I’ve read that I might need to use a class, but I’m not sure what that is, and I thought this would be a fairly straightforward thing.
Ok, let's have a class.
Your problem is not understanding the flow of things. To put it simply, Flash Player executes the movie/application in an infinite loop of recurring phases:
1. Playheads of playing MovieClips (and the main timeline) move to the next frame.
2. Frame scripts are executed.
3. The whole movie is rendered and the picture is updated on the screen.
4. Pause till the next frame.
5. Events are handled.
Ok, the exact order just MIGHT be different (it is possible to figure it out, but that's not important now). The important part is to understand that:
Flash Player is (normally) not a multi-threaded environment: the phases follow each other, they never overlap, only one thing happens at a time, and we are pretty much able to follow which one.
The script you provided is executed in the "frame scripts" phase, and the ENTER_FRAME event handler doesn't execute until the "event handling" phase kicks in.
So, let's check it:
import flash.events.Event;
// This one returns time (in milliseconds) passed from the start.
import flash.utils.getTimer;
trace("A", getTimer());
ballAnim.addEventListener(Event.ENTER_FRAME, addOneLoop);
// Let's not be lazy and initialize our variables.
var loopCounter:int = 0;
function addOneLoop(e:Event):void
{
trace("BB", getTimer());
// Check the last frame like that because you can
// change the animation and forget to fix the numbers.
if (ballAnim.currentFrame >= ballAnim.totalFrames)
{
trace("CCC", getTimer());
// Increment operation is +=, not just +.
loopCounter += 1;
}
}
trace("DDDD", getTimer());
trace(loopCounter);
Now once you run it, you will get something like this:
A (small number)
DDDD (small number)
0
BB ...
BB ...
(20 times total of BB)
BB ...
CCC ...
BB ...
BB ...
Thus, in order to trace the number of loops that happened, you need to output it inside the handler rather than in the frame script:
import flash.events.Event;
import flash.utils.getTimer;
ballAnim.addEventListener(Event.ENTER_FRAME, addOneLoop);
var loopCounter:int = 0;
function addOneLoop(e:Event):void
{
    if (ballAnim.currentFrame >= ballAnim.totalFrames)
    {
        loopCounter += 1;
        // This is the very place to track this counter.
        trace("The ball did", loopCounter, "loops in", getTimer(), "milliseconds!");
    }
}

Appending values to DataSet in Apache Flink

I am currently writing a (simple) analysis code to sum time-connected power readings. Since the data is assumed to be raw (e.g. disturbances from the measuring device have not been calculated out), I have to account for disturbances by calculating the mean of the first one thousand samples. The calculation of the mean itself is not a problem. I am only unsure of how to generate the appropriate DataSet.
For now it looks about like this:
DataSet<Tupel2<long,double>>Gyrotron_1=ECRH.includeFields('11000000000'); // obviously the line to declare the first gyrotron, continues for the next ten lines, assuming separattion of not occupied space
DataSet<Tupel2<long,double>>Gyrotron_2=ECRH.includeFields('10100000000');
DataSet<Tupel2<long,double>>Gyrotron_3=ECRH.includeFields('10010000000');
DataSet<Tupel2<long,double>>Gyrotron_4=ECRH.includeFields('10001000000');
DataSet<Tupel2<long,double>>Gyrotron_5=ECRH.includeFields('10000100000');
DataSet<Tupel2<long,double>>Gyrotron_6=ECRH.includeFields('10000010000');
DataSet<Tupel2<long,double>>Gyrotron_7=ECRH.includeFields('10000001000');
DataSet<Tupel2<long,double>>Gyrotron_8=ECRH.includeFields('10000000100');
DataSet<Tupel2<long,double>>Gyrotron_9=ECRH.includeFields('10000000010');
DataSet<Tupel2<long,double>>Gyrotron_10=ECRH.includeFields('10000000001');
for (int=1,i<=10;i++) {
DataSet<double> offset=Gyroton_'+i+'.groupBy(1).first(1000).sum()/1000;
}
It's the part in the for-loop I'm unsure of. Does anybody know if it is possible to append values to DataSets, and if so, how?
In case of doubt, I could always put the values into an array, but I do not know if that is the wise thing to do.
This code will not work, for many reasons. I'd recommend looking into the fundamentals of Java and its basic data structures, and also into Flink.
It's really hard to understand what you're actually trying to achieve, but this is the closest I came up with:
String[] codes = { "11000000000", ..., "10000000001" };
DataSet<Tuple2<Long, Double>> result = null;
for (final String code : codes) {
    DataSet<Tuple2<Long, Double>> codeResult = ECRH.includeFields(code)
        .groupBy(1)
        .first(1000)
        .sum(1)
        .map(sum -> new Tuple2<>(sum.f0, sum.f1 / 1000d));
    result = (result == null) ? codeResult : codeResult.union(result);
}
result.print();
But please take the time to understand the basics before delving deeper. I also recommend using an IDE like IntelliJ, which would point out at least 6 issues in your code.

How to update weights when using mini batches?

I am trying to implement mini-batch training for my neural network, instead of the "online" stochastic method of updating the weights after every training sample.
I have developed a somewhat novice neural network in C whereby I can adjust the number of neurons in each layer, activation functions, etc. This is to help me understand neural networks. I have trained the network on the MNIST data set, but it takes around 200 epochs to get down to an error rate of 20% on the training set, which seems very poor to me. I am currently using online stochastic gradient descent to train the network. What I would like to try is to use mini-batches instead. I understand the concept that I must accumulate and average the error from each training sample before I propagate the error back. My problem comes in when I want to calculate the changes I must make to the weights. To explain this better, consider a very simple perceptron model: one input, one hidden layer, one output. To calculate the change I need to make to the weight between the input and the hidden unit, I will use the following equation:
∂C/∂w1 = ∂C/∂O * ∂O/∂h * ∂h/∂w1
If you do the partial derivatives you get:
∂C/∂w1 = (Output - Expected Answer) * (w2) * (input)
Now this formula says that you need to multiply the backpropagated error by the input. For online stochastic training that makes sense, because you use 1 input per weight update. For mini-batch training you use many inputs, so which input does the error get multiplied by?
I hope you can assist me with this.
void propogateBack(void){
    //calculate ∂C/∂G
    for (count=0;count<network.outputs;count++){
        network.g_error[count] = derive_cost((training.answer[training_current])-(network.g[count]));
    }
    //calculate ∂G/∂O
    for (count=0;count<network.outputs;count++){
        network.o_error[count] = derive_activation(network.g[count])*(network.g_error[count]);
    }
    //calculate ∂O/∂S3
    for (count=0;count<network.h3_neurons;count++){
        network.s3_error[count] = 0;
        for (count2=0;count2<network.outputs;count2++){
            network.s3_error[count] += (network.w4[count2][count])*(network.o_error[count2]);
        }
    }
    //calculate ∂S3/∂H3
    for (count=0;count<network.h3_neurons;count++){
        network.h3_error[count] = (derive_activation(network.s3[count]))*(network.s3_error[count]);
    }
    //calculate ∂H3/∂S2
    for (count=0;count<network.h2_neurons;count++){
        network.s2_error[count] = 0;
        for (count2=0;count2<network.h3_neurons;count2++){
            network.s2_error[count] += (network.w3[count2][count])*(network.h3_error[count2]);
        }
    }
    //calculate ∂S2/∂H2
    for (count=0;count<network.h2_neurons;count++){
        network.h2_error[count] = (derive_activation(network.s2[count]))*(network.s2_error[count]);
    }
    //calculate ∂H2/∂S1
    for (count=0;count<network.h1_neurons;count++){
        network.s1_error[count] = 0;
        for (count2=0;count2<network.h2_neurons;count2++){
            network.s1_error[count] += (network.w2[count2][count])*network.h2_error[count2];
        }
    }
    //calculate ∂S1/∂H1
    for (count=0;count<network.h1_neurons;count++){
        network.h1_error[count] = (derive_activation(network.s1[count]))*(network.s1_error[count]);
    }
}
void updateWeights(void){
    //////////////////w1
    for(count=0;count<network.h1_neurons;count++){
        for(count2=0;count2<network.inputs;count2++){
            network.w1[count][count2] -= learning_rate*(network.h1_error[count]*network.input[count2]);
        }
    }
    //////////////////w2
    for(count=0;count<network.h2_neurons;count++){
        for(count2=0;count2<network.h1_neurons;count2++){
            network.w2[count][count2] -= learning_rate*(network.h2_error[count]*network.s1[count2]);
        }
    }
    //////////////////w3
    for(count=0;count<network.h3_neurons;count++){
        for(count2=0;count2<network.h2_neurons;count2++){
            network.w3[count][count2] -= learning_rate*(network.h3_error[count]*network.s2[count2]);
        }
    }
    //////////////////w4
    for(count=0;count<network.outputs;count++){
        for(count2=0;count2<network.h3_neurons;count2++){
            network.w4[count][count2] -= learning_rate*(network.o_error[count]*network.s3[count2]);
        }
    }
}
The code I have attached is how I do the online stochastic updates. As you can see in the updateWeights() function, the weight updates are based on the input values (dependent on the sample fed in) and the hidden unit values (also dependent on the input sample fed in). So when I have the mini-batch average gradient that I am propagating back, how will I update the weights? Which input values do I use?
Ok, so I figured it out. When using mini-batches you should not accumulate and average out the error at the output of the network. Each training example's error gets propagated back as it normally would, except instead of updating the weights you accumulate the changes you would have made to each weight. When you have looped through the mini-batch, you then average the accumulations and change the weights accordingly.
I was under the impression that when using mini-batches you do not have to propagate any error back until you have looped through the mini-batch. I was wrong; you still need to do that, and the only difference is that you only update the weights once you have looped through your mini-batch.
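To make that concrete, here is a minimal sketch of the scheme (in C++ for brevity; Matrix and gradFn are placeholders standing in for the real weight arrays and the backpropagation pass, not my actual network code):
#include <functional>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// gradFn is assumed to run the forward pass plus backpropagation for one
// sample and return dC/dw for every weight (the role propogateBack plays).
void trainMiniBatch(Matrix& weights,
                    const std::vector<std::vector<double>>& batch,
                    double learningRate,
                    const std::function<Matrix(const Matrix&, const std::vector<double>&)>& gradFn) {
    // Accumulator with the same shape as the weights, initialized to zero.
    Matrix acc(weights.size(), std::vector<double>(weights[0].size(), 0.0));
    for (const auto& sample : batch) {
        Matrix grad = gradFn(weights, sample);  // backpropagate as usual...
        for (size_t i = 0; i < acc.size(); i++)
            for (size_t j = 0; j < acc[i].size(); j++)
                acc[i][j] += grad[i][j];        // ...but only accumulate the deltas
    }
    // A single weight update per mini-batch, using the averaged gradient.
    for (size_t i = 0; i < acc.size(); i++)
        for (size_t j = 0; j < acc[i].size(); j++)
            weights[i][j] -= learningRate * acc[i][j] / batch.size();
}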
For mini-batch training you use many inputs, so which input does the error get multiplied by?
"Many inputs": this is a proportion of the dataset size $N$, typically segmenting your data into sizes that are not too large to fit into memory. Deep learning needs big data, and the full batch cannot fit into most computer systems to be processed in one go, so the mini-batch is necessary.
The error which gets backpropagated is the sum or average error calculated for the data samples in your current mini-batch $X^{(t)}$, which is of size $M$ where $M < N$: $J^{(t)} = \frac{1}{M} \sum_{m=1}^{M} \left( f(x_m^{(t)}) - y_m^{(t)} \right)^2$. This is the mean of the squared distances to the target across the samples in batch $t$. This is the forward step, and then the backward propagation of this error is made using the chain rule through the 'neurons' of the network, using this single value of the error for the whole batch. The update of the parameters is based upon this value for this mini-batch.
There are variations in how this scheme is implemented, but if you consider your idea of using "many inputs" in the calculation of the parameter update, using multiple input samples from the batch means we are averaging over multiple gradients, smoothing the gradient in comparison to stochastic gradient descent.

How to make a function wait X amount of time in LUA (Love2d)?

I am very new to programming, coming from a "custom map" background in games like SC2. I am currently trying to make a platformer game in Love2d, but I wonder how I can make something wait X amount of seconds before doing the next thing.
Say I want to make the protagonist immortal for 5 seconds; how should that code look?
Immortal = true
????????????????
Immortal = false
As far as I have understood, there is no built-in wait in Lua or Love2d.
It sounds like you're interested in a temporary state for one of your game entities. This is pretty common - a powerup lasts for six seconds, an enemy is stunned for two seconds, your character looks different while jumping, etc. A temporary state is different from waiting. Waiting suggests that absolutely nothing else happens during your five seconds of immortality. It sounds like you want the game to continue as normal, but with an immortal protagonist for five seconds.
Consider using a "time remaining" variable versus a boolean to represent temporary states. For example:
local protagonist = {
    -- This is the amount of immortality remaining, in seconds
    immortalityRemaining = 0,
    health = 100
}
-- Then, imagine grabbing an immortality powerup somewhere in the game.
-- Simply set immortalityRemaining to the desired length of immortality.
function protagonistGrabbedImmortalityPowerup()
    protagonist.immortalityRemaining = 5
end
-- You then shave off a little bit of the remaining time during each love.update
-- Remember, dt is the time passed since the last update.
function love.update(dt)
    protagonist.immortalityRemaining = protagonist.immortalityRemaining - dt
end
-- When resolving damage to your protagonist, consider immortalityRemaining
function applyDamageToProtagonist(damage)
    if protagonist.immortalityRemaining <= 0 then
        protagonist.health = protagonist.health - damage
    end
end
Be careful with concepts like wait and timer. They typically refer to managing threads. In a game with many moving parts, it's often easier and more predictable to manage things without threads. When possible, treat your game like a giant state machine versus synchronizing work between threads. If threads are absolutely necessary, Löve does offer them in its love.thread module.
I normally use cron.lua for what you're talking about: https://github.com/kikito/cron.lua
Immortal = true
immortalTimer = cron.after(5, function()
    Immortal = false
end)
and then just stick immortalTimer:update(dt) in your love.update.
You could do this:
function delay_s(delay)
    delay = delay or 1
    local time_to = os.time() + delay
    while os.time() < time_to do end
end
Then you can just do:
Immortal = true
delay_s(5)
Immortal = false
Of course, it'll prevent you from doing anything else unless you run it in its own thread. But this is strictly Lua, as I know nothing of Love2d, unfortunately.
I recommend that you use hump.timer in your game, like this:
function love.load()
    timer = require 'hump.timer'
    Immortal = true
    timer.after(1, function()
        Immortal = false
    end)
end
Instead of using timer.after, you can also use timer.script, like this:
function love.load()
    timer = require 'hump.timer'
    timer.script(function(wait)
        Immortal = true
        wait(5)
        Immortal = false
    end)
end
Don't forget to add timer.update to function love.update!
function love.update(dt)
    timer.update(dt)
end
Hope this helped ;)
Download link: https://github.com/vrld/hump

C++ path finding in a 2d array

I have been struggling badly with this challenge my lecturer has provided. I have programmed the files that set up the class needed for this solution, but I have no idea how to implement the algorithm itself. Here is the class in question, where I need to add it:
#include "Solver.h"
int* Solver::findNumPaths(const MazeCollection& mazeCollection)
{
    int *numPaths = new int[mazeCollection.NUM_MAZES];
    return numPaths;
}
And here is the problem description we have been provided. Does anybody know how to implement this, or can you set me on the right track? Thank you!
00C, we need your help again.
Angry with being thwarted, the diabolically evil mastermind Dr Russello Kane has unleashed a scurry of heavy-armed squirrels to attack the BCB and eliminate all the delightfully beautiful and intellectual superior computing students.
We need to respond to this threat at short notice and have plans to partially barricade the foyer of the BCB. The gun-toting squirrels will enter the BCB at square [1,1] and rush towards the exit shown at [10,10].
A square that is barricaded is impassable to the furry rodents. Importantly, the squirrel bloodlust is such that they will only ever move towards the exit – either moving one square to the right, or one square down. The squirrels will never move up or to the left, even if a barricade is blocking their approach.
Our boffins need to run a large number of tests to determine how barricade placement will impede the movement of the squirrels. In each test, a number of squares will be barricaded and you must determine the total number of different paths from the start to the exit (adhering to the squirrel movement patterns noted above).
A number of our boffins have been heard to mumble something incoherent about a recursive counting algorithm, others about the linkage between recursion and iteration, but I’m sure, OOC, you know better than to be distracted by misleading advice.
Start w/ the obvious:
// barricaded[x][y] is true if square [x][y] is blocked; valid squares are [1..10].
bool barricaded[12][12];
int count = 0;
void countPaths( int x, int y ) {
    if ( x==10 && y==10 ) {
        count++;
        return;
    }
    if ( x+1 <= 10 && !barricaded[x+1][y] )   // can move right
        countPaths( x+1, y );
    if ( y+1 <= 10 && !barricaded[x][y+1] )   // can move down
        countPaths( x, y+1 );
}
Start by calling countPaths(1, 1), since the squirrels enter at [1,1].
Not the most efficient by a long shot, but it'll work. Then look for ways to optimize (for example, you end up re-computing paths from the squares close to the goal a LOT; reducing that work could make a big difference).
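For instance, a memoized version along the same lines (still a sketch, reusing the assumed barricaded array above) computes the count for each square only once, which removes that repeated work:
// Cached number of paths from [x][y] to [10][10]; -1 means "not computed yet".
long long pathsFrom[12][12];

long long countPathsMemo( int x, int y ) {
    if ( x > 10 || y > 10 || barricaded[x][y] ) return 0;  // off the grid or blocked
    if ( x==10 && y==10 ) return 1;                        // reached the exit
    if ( pathsFrom[x][y] != -1 ) return pathsFrom[x][y];   // already computed
    pathsFrom[x][y] = countPathsMemo( x+1, y ) + countPathsMemo( x, y+1 );
    return pathsFrom[x][y];
}
For each test, fill pathsFrom with -1, mark the barricaded squares, and call countPathsMemo(1, 1).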
