How to update weights when using mini-batches? (C)

I am trying to implement mini-batch training in my neural network instead of the "online" stochastic method of updating the weights after every training sample.
I have developed a somewhat novice neural network in C in which I can adjust the number of neurons in each layer, activation functions, etc. This is to help me understand neural networks. I have trained the network on the MNIST data set, but it takes around 200 epochs to get down to an error rate of 20% on the training set, which seems very poor to me. I am currently using online stochastic gradient descent to train the network. What I would like to try instead is mini-batches. I understand the concept that I must accumulate and average the error from each training sample before I propagate the error back. My problem comes in when I want to calculate the changes I must make to the weights. To explain this better, consider a very simple perceptron model: one input, one hidden unit, one output. To calculate the change I need to make to the weight between the input and the hidden unit, I will use the following equation:
∂C/∂w1= ∂C/∂O*∂O/∂h*∂h/∂w1
If you do the partial derivatives you get:
∂C/∂w1= (Output-Expected Answer)(w2)(input)
Now this formula says that you need to multiply the backpropagated error by the input. For online stochastic training that makes sense, because you use one input per weight update. For mini-batch training you use many inputs, so which input does the error get multiplied by?
I hope you can assist me with this.
void propogateBack(void){
    //calculate ∂C/∂G
    for (count = 0; count < network.outputs; count++) {
        network.g_error[count] = derive_cost((training.answer[training_current]) - (network.g[count]));
    }
    //calculate ∂G/∂O
    for (count = 0; count < network.outputs; count++) {
        network.o_error[count] = derive_activation(network.g[count]) * (network.g_error[count]);
    }
    //calculate ∂O/∂S3
    for (count = 0; count < network.h3_neurons; count++) {
        network.s3_error[count] = 0;
        for (count2 = 0; count2 < network.outputs; count2++) {
            network.s3_error[count] += (network.w4[count2][count]) * (network.o_error[count2]);
        }
    }
    //calculate ∂S3/∂H3
    for (count = 0; count < network.h3_neurons; count++) {
        network.h3_error[count] = (derive_activation(network.s3[count])) * (network.s3_error[count]);
    }
    //calculate ∂H3/∂S2
    for (count = 0; count < network.h2_neurons; count++) {
        network.s2_error[count] = 0;
        for (count2 = 0; count2 < network.h3_neurons; count2++) {
            network.s2_error[count] += (network.w3[count2][count]) * (network.h3_error[count2]);
        }
    }
    //calculate ∂S2/∂H2
    for (count = 0; count < network.h2_neurons; count++) {
        network.h2_error[count] = (derive_activation(network.s2[count])) * (network.s2_error[count]);
    }
    //calculate ∂H2/∂S1
    for (count = 0; count < network.h1_neurons; count++) {
        network.s1_error[count] = 0;
        for (count2 = 0; count2 < network.h2_neurons; count2++) {
            network.s1_error[count] += (network.w2[count2][count]) * (network.h2_error[count2]);
        }
    }
    //calculate ∂S1/∂H1
    for (count = 0; count < network.h1_neurons; count++) {
        network.h1_error[count] = (derive_activation(network.s1[count])) * (network.s1_error[count]);
    }
}
void updateWeights(void){
    //////////////////w1
    for (count = 0; count < network.h1_neurons; count++) {
        for (count2 = 0; count2 < network.inputs; count2++) {
            network.w1[count][count2] -= learning_rate * (network.h1_error[count] * network.input[count2]);
        }
    }
    //////////////////w2
    for (count = 0; count < network.h2_neurons; count++) {
        for (count2 = 0; count2 < network.h1_neurons; count2++) {
            network.w2[count][count2] -= learning_rate * (network.h2_error[count] * network.s1[count2]);
        }
    }
    //////////////////w3
    for (count = 0; count < network.h3_neurons; count++) {
        for (count2 = 0; count2 < network.h2_neurons; count2++) {
            network.w3[count][count2] -= learning_rate * (network.h3_error[count] * network.s2[count2]);
        }
    }
    //////////////////w4
    for (count = 0; count < network.outputs; count++) {
        for (count2 = 0; count2 < network.h3_neurons; count2++) {
            network.w4[count][count2] -= learning_rate * (network.o_error[count] * network.s3[count2]);
        }
    }
}
The code I have attached is how I do the online stochastic updates. As you can see in the updateWeights() function, the weight updates are based on the input values (dependent on the sample fed in) and the hidden unit values (also dependent on the input sample fed in). So when I have the mini-batch average gradient that I am propagating back, how will I update the weights? Which input values do I use?

OK, so I figured it out. When using mini-batches you should not accumulate and average out the error at the output of the network. Each training example's error gets propagated back as you would normally, except instead of updating the weights you accumulate the changes you would have made to each weight. When you have looped through the mini-batch, you then average the accumulations and change the weights accordingly.
I was under the impression that when using mini-batches you do not have to propagate any error back until you have looped through the mini-batch. I was wrong: you still need to do that; the only difference is that you only update the weights once you have looped through your mini-batch.
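As a minimal sketch of what that looks like against the updateWeights() code above, shown for the w1 layer only — the w1_delta accumulator, the H1_MAX/IN_MAX bounds, and the two helper functions are my own names for illustration, not part of the original program:
double w1_delta[H1_MAX][IN_MAX]; /* assumed accumulator, zeroed before each mini-batch */

void accumulateDeltasW1(void){
    /* same per-sample gradient as in updateWeights(), but stored instead of applied */
    for (count = 0; count < network.h1_neurons; count++) {
        for (count2 = 0; count2 < network.inputs; count2++) {
            w1_delta[count][count2] += network.h1_error[count] * network.input[count2];
        }
    }
}

void applyDeltasW1(int batch_size){
    /* apply the averaged change once per mini-batch, then reset the accumulator */
    for (count = 0; count < network.h1_neurons; count++) {
        for (count2 = 0; count2 < network.inputs; count2++) {
            network.w1[count][count2] -= learning_rate * (w1_delta[count][count2] / batch_size);
            w1_delta[count][count2] = 0;
        }
    }
}
The training loop then runs the forward pass and propogateBack() per sample exactly as before, calls accumulateDeltasW1() (and its w2/w3/w4 counterparts) in place of updateWeights(), and calls applyDeltasW1() once after every batch_size samples.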

For mini-batch training you use many inputs, so which input does the error get multiplied by?
"Many inputs" this is a proportion of the dataset size N, which typically segments your data into sizes which are not too large to fit into memory. DL needs Big Data and the full batch cannot fit into most computer systems to process in one go and therefore the mini-batch is necessary.
The error which gets backpropagated is the sum or average error calculated for the data samples in your current mini-batch $X^{{t}}$ which is of size M where $M<N$, $J^{{t}} = 1/m \sum_1^M ( f(x_m^{t})-y_m^{t} )^2$. This is the sum of the squared distances to the target across samples in the batch 't'. This is the forward step and then the backwards propagation of this error is made using the chain rule through the 'neurons' of the network; using this single value of the error for the whole batch. The update of the parameters is based upon this value for this mini-batch.
There are variations to how this scheme is implemented but if you consider your idea of using "many inputs" in the calculation of the parameter update using multiple input samples from the batch, we are averaging over multiple gradients to smooth over the gradient in comparison to stochastic gradient descent.
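Written as an update rule (my notation, not the answer's: $\eta$ is the learning rate and $C_m$ the loss on sample $m$ of the batch), the mini-batch step is

$$w \leftarrow w - \frac{\eta}{M} \sum_{m=1}^{M} \frac{\partial C_m}{\partial w},$$

i.e. the single-sample SGD step with the gradient replaced by its average over the batch, which is exactly what smooths the estimate.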

Related

Appending values to DataSet in Apache Flink

I am currently writing a (simple) analysis code to sum time-connected power readings. Since the data is presumably raw (e.g. disturbances from the measuring device have not been filtered out), I have to account for disturbances by calculating the mean of the first one thousand samples. The calculation of the mean itself is not a problem. I am only unsure of how to generate the appropriate DataSet.
For now it looks about like this:
DataSet<Tupel2<long,double>>Gyrotron_1=ECRH.includeFields('11000000000'); // obviously the line to declare the first gyrotron, continues for the next ten lines, assuming separattion of not occupied space
DataSet<Tupel2<long,double>>Gyrotron_2=ECRH.includeFields('10100000000');
DataSet<Tupel2<long,double>>Gyrotron_3=ECRH.includeFields('10010000000');
DataSet<Tupel2<long,double>>Gyrotron_4=ECRH.includeFields('10001000000');
DataSet<Tupel2<long,double>>Gyrotron_5=ECRH.includeFields('10000100000');
DataSet<Tupel2<long,double>>Gyrotron_6=ECRH.includeFields('10000010000');
DataSet<Tupel2<long,double>>Gyrotron_7=ECRH.includeFields('10000001000');
DataSet<Tupel2<long,double>>Gyrotron_8=ECRH.includeFields('10000000100');
DataSet<Tupel2<long,double>>Gyrotron_9=ECRH.includeFields('10000000010');
DataSet<Tupel2<long,double>>Gyrotron_10=ECRH.includeFields('10000000001');
for (int=1,i<=10;i++) {
DataSet<double> offset=Gyroton_'+i+'.groupBy(1).first(1000).sum()/1000;
}
It's the part in the for-loop I'm unsure of. Does anybody know if it is possible to append values to DataSets and if so how?
In case of doubt, I could always put the values into an array but I do not know if that is the wise thing to do.
This code will not work, for many reasons. I'd recommend looking into the fundamentals of Java and its basic data structures, and also into Flink.
It's really hard to understand what you are actually trying to achieve, but this is the closest I came up with:
String[] codes = { "11000000000", ..., "10000000001" };
DataSet<Tuple2<Long, Double>> result = env.fromElements();
for (final String code : codes) {
    DataSet<Tuple2<Long, Double>> codeResult = ECRH.includeFields(code)
        .groupBy(1)
        .first(1000)
        .sum(0)
        .map(sum -> new Tuple2<>(sum.f0, sum.f1 / 1000d));
    result = codeResult.union(result);
}
result.print();
But please take the time to understand the basics before delving deeper. I also recommend using an IDE like IntelliJ, which would point out at least six issues in your code.

AzureMap: Real-time alert popup

I am new to AzureMap, with limited knowledge of JavaScript, and I am looking for help getting a real-time alert, based on some random flag, during fleet movement based on the coordinates.
I tried to design it following sample code from multiple sources.
My requirement is:
A popup should appear only on arrival of the fleet (truck).
Thanks
To verify: when the truck gets to the end of the line, you want to open a popup. I'm assuming you have a constant flow of data updating the truck position and that you can easily grab the truck's coordinate. As such, you only need a function to determine whether the truck's coordinate is at the end of the route line. You will likely need to allow a margin of error (i.e. within 15 meters of the end of the line), since a single coordinate can represent a single molecule with enough decimal places, and GPS devices typically have an accuracy of +/- 15 meters. With this in mind, all you need to do is calculate the distance from the truck's coordinate to the last coordinate of the route line. For example:
var lastRouteCoord = [-110, 45];
var truckCoord = [-110.0001, 45.0001];
var minDistance = 15;
//Get the distance between the coordinates (by default this function returns a distance in meters).
var distance = atlas.math.getDistanceTo(lastRouteCoord, truckCoord);
if (distance <= minDistance) {
    //Open popup.
    //Examples: https://azuremapscodesamples.azurewebsites.net/index.html#Popups
}

What am I doing wrong with this AI?

I am creating a very naive AI (maybe it shouldn't even be called an AI, as it just tests out a lot of possibilities and picks the best one) for a board game I am making. This is to reduce the amount of manual testing I will need to do to balance the game.
The AI plays alone, doing the following: in each turn the AI, playing one of the heroes, attacks one of the (at most 9) monsters on the battlefield. Its goal is to finish the battle as fast as possible (in the least number of turns) and with the fewest monster activations.
To achieve this, I've implemented a think-ahead algorithm for the AI: instead of performing the best possible move at the moment, it selects a move based on the possible outcomes of future moves of the other heroes. This is the code snippet where it does this, written in PHP:
/** Perform think-ahead moves
 *
 * @param int $thinkAheadLeft (the number of think-ahead moves left)
 * @param int $innerIterator (the iterator for the move)
 * @param array $performedMoves (the moves performed so far)
 * @param Battlefield $originalBattlefield (the previous state of the battlefield)
 */
public function performThinkAheadMoves($thinkAheadLeft, $innerIterator, $performedMoves, $originalBattlefield, $tabs) {
    if ($thinkAheadLeft == 0) return $this->quantify($originalBattlefield);
    $nextThinkAhead = $thinkAheadLeft - 1;
    $moves = $this->getPossibleHeroMoves($innerIterator, $performedMoves);
    $Hero = $this->getHero($innerIterator);
    $innerIterator++;
    $nextInnerIterator = $innerIterator;
    foreach ($moves as $moveid => $move) {
        $performedUpFar = $performedMoves;
        $performedUpFar[] = $move;
        $attack = $Hero->getAttack($move['attackid']);
        $monsters = array();
        foreach ($move['targets'] as $monsterid) $monsters[] = $originalBattlefield->getMonster($monsterid)->getName();
        if (self::$debug) echo $tabs . "Testing sub move of " . $Hero->Name . ": $moveid of " . count($moves) . " (Think Ahead: $thinkAheadLeft | InnerIterator: $innerIterator)\n";
        $moves[$moveid]['battlefield']['after']->performMove($move);
        if (!$moves[$moveid]['battlefield']['after']->isBattleFinished()) {
            if ($innerIterator == count($this->Heroes)) {
                $moves[$moveid]['battlefield']['after']->performCleanup();
                $nextInnerIterator = 0;
            }
            $moves[$moveid]['quantify'] = $moves[$moveid]['battlefield']['after']->performThinkAheadMoves($nextThinkAhead, $nextInnerIterator, $performedUpFar, $originalBattlefield, $tabs . "\t");
        } else $moves[$moveid]['quantify'] = $moves[$moveid]['battlefield']['after']->quantify($originalBattlefield);
    }
    usort($moves, function($a, $b) {
        if ($a['quantify'] === $b['quantify']) return 0;
        else return ($a['quantify'] > $b['quantify']) ? -1 : 1;
    });
    return $moves[0]['quantify'];
}
What this does is recursively check future moves, until the $thinkAheadLeft value is reached OR until a solution is found (i.e., all monsters are defeated). When it reaches its exit condition, it calculates the state of the battlefield compared to $originalBattlefield (the battlefield state before the first move). The calculation is made in the following way:
/** Quantify the current state of the battlefield
 *
 * @param Battlefield $originalBattlefield (the original battlefield)
 *
 * @return int (an integer with the battlefield quantification)
 */
public function quantify(Battlefield $originalBattlefield) {
    $points = 0;
    foreach ($originalBattlefield->Monsters as $originalMonsterId => $OriginalMonster) {
        $CurrentMonster = $this->getMonster($originalMonsterId);
        $monsterActivated = $CurrentMonster->getActivations() - $OriginalMonster->getActivations();
        $points += $monsterActivated * ($this->quantifications['activations'] + $this->quantifications['activationsPenalty']);
        if ($CurrentMonster->isDead()) $points += $this->quantifications['monsterKilled'] * $CurrentMonster->Priority;
        else {
            $enragePenalty = floor($this->quantifications['activations'] * (($CurrentMonster->Enrage['max'] - $CurrentMonster->Enrage['left']) / $CurrentMonster->Enrage['max']));
            $points += ($OriginalMonster->Health['left'] - $CurrentMonster->Health['left']) * $this->quantifications['health'];
            $points += ($CurrentMonster->Enrage['max'] - $CurrentMonster->Enrage['left']) * $enragePenalty;
        }
    }
    return $points;
}
When quantifying, some things net positive points and some net negative points for the state. Instead of using the points calculated after its current move to decide which move to take, the AI uses the points calculated after the think-ahead portion, selecting a move based on the possible moves of the other heroes.
Basically, the AI is saying: attacking Monster 1 isn't the best option at the moment, but IF the other heroes do this-and-this, then in the long run this will be the best outcome.
After selecting a move, the AI performs that single move with the hero and then repeats the process for the next hero, calculating with +1 moves.
ISSUE: My issue is that I was presuming that an AI that 'thinks ahead' 3-4 moves should find a better solution than an AI that only performs the best possible move at the moment. But my test cases show differently: in some cases an AI that does not use the think-ahead option, i.e. only plays the best possible move at the moment, beats an AI that thinks ahead a single move. Sometimes the AI that thinks ahead only 3 moves beats an AI that thinks ahead 4 or 5 moves. Why is this happening? Is my presumption incorrect? If so, why? Am I using wrong numbers for the weights? I investigated this and ran a test to automatically calculate the weights, trying an interval of possible weights and keeping the best outcome (i.e. the ones which yield the least number of turns and/or the least number of activations), yet the problem described above still persists with those weights as well.
I am limited to a 5-move think-ahead with the current version of my script, as with any larger think-ahead number the script gets REALLY slow (with a 5-move think-ahead it finds a solution in roughly 4 minutes, but with 6 it didn't even find the first possible move in 6 hours).
HOW THE FIGHT WORKS: The fight works in the following way: a number of heroes (2-4) controlled by the AI, each having a number of different attacks (1-x) which can be used once or multiple times in a combat, attack a number of monsters (1-9). Based on the values of the attack, the monsters lose health until they die. After each attack the attacked monster gets enraged if it didn't die, and after each hero has performed a move, all monsters get enraged. When the monsters reach their enrage limit, they activate.
DISCLAIMER: I know that PHP is not the language to use for this kind of operation, but as this is only an in-house project, I preferred to sacrifice speed to be able to code this as fast as possible in my native programming language.
UPDATE: The quantifications that we currently use look something like this:
$Battlefield->setQuantification(array(
    'health' => 16,
    'monsterKilled' => 86,
    'activations' => -46,
    'activationsPenalty' => -10
));
If there is randomness in your game, then anything can happen. I point that out since it's just not clear from the materials you have posted here.
If there is no randomness and the actors can see the full state of the game, then a longer look-ahead absolutely should perform better. When it does not, it is a clear indication that your evaluation function is providing incorrect estimates of the value of a state.
Looking at your code, the values of your quantifications are not listed, and in your simulation it looks like you just have the same player make moves repeatedly without considering the possible actions of the other actors. You need to run a full simulation, step by step, in order to produce accurate future states, and you need to look at the value estimates of the various states to see if you agree with them, making adjustments to your quantifications accordingly.
An alternative way to frame the problem of estimating value is to explicitly predict your chances of winning the round as a percentage on a scale of 0.0 to 1.0 and then choose the move that gives you the highest chance of winning. Calculating the damage done and number of monsters killed so far doesn't tell you much about how much you have left to do in order to win the game.

Filtering "Smoothing" an array of numbers in C

I am writing an application in Xcode. It gathers sensor data (gyroscope) and then transforms it through FFTW. At the end I get the result in an array. In the app I am plotting the graph, but there are so many peaks (see the graph in red) that I would like to smooth it.
My array:
double magnitude[S];
...
magnitude[i] = sqrt(fft_result[i][0] * fft_result[i][0] + fft_result[i][1] * fft_result[i][1]);
An example array (for 30 samples; normally I am working with 256 samples):
"0.9261901713034604",
"2.436272348237486",
"1.618854900218465",
"1.849221286218342",
"0.8495016887742839",
"0.5716796354304043",
"0.4229791869017677",
"0.3731843430827401",
"0.3254446111798023",
"0.2542702545675339",
"0.25237940627189",
"0.2273716541964159",
"0.2012780334451323",
"0.2116151847259499",
"0.1921943719520009",
"0.1982429400169304",
"0.18001770452247",
"0.1982429400169304",
"0.1921943719520009",
"0.2116151847259499",
"0.2012780334451323",
"0.2273716541964159",
"0.25237940627189",
"0.2542702545675339",
"0.3254446111798023",
"0.3731843430827401",
"0.4229791869017677",
"0.5716796354304043",
"0.8495016887742839",
"1.849221286218342"
How can I filter/smooth it? What about a Gaussian filter? Any idea how to begin, or even a sample snippet?
Thank you for your help!
best regards
josef
The simplest way to smooth would be to replace each sample with the average of it and its 2 neighbors.
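A minimal sketch of that three-point average, assuming the magnitude array and size S from the question and copying the two endpoints unchanged:
double smoothed[S];
int i;
smoothed[0] = magnitude[0];        /* endpoints have only one neighbor */
smoothed[S-1] = magnitude[S-1];
for (i = 1; i < S - 1; i++)
    smoothed[i] = (magnitude[i-1] + magnitude[i] + magnitude[i+1]) / 3.0;
For stronger smoothing you can widen the window or run this pass more than once.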
The simplest idea would be taking the average of adjacent points and putting them into a new array, something like:
double smooth_array[S];
int i;
for (i = 0; i < S - 1; i++)
    smooth_array[i] = (magnitude[i] + magnitude[i+1]) / 2;
smooth_array[S-1] = magnitude[S-1];
It is not the best one, but I think it should be OK.
If you need a scientific approach, use some kind of approximation algorithm, such as least-squares function approximation, or even full SE13/SE35 etc. algorithms.

Programmatically detect and extract an audio envelope

All suggestions and links to relevant info welcome here. This is the scenario:
Let us say I have a .wav file of someone speaking (and therefore all the samples associated with it).
I would like to run an algorithm on the series of samples to detect when an event happens i.e. the beginning and the end of an envelope. I would then use this starting and end point to extract that data to be used elsewhere.
What would be the best way to tackle this? Any pseudocode? Example code? Source code?
I will eventually be writing this in C.
Thanks!
EDIT 1
Parsing the wav file is not a problem. But some pseudo-code for the envelope detection would be nice! :)
The usual method is (see the C sketch after this list):
take absolute value of waveform, abs(x[t])
low pass filter (say 10 Hz cut-off)
apply threshold
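A minimal C sketch of those three steps, using a one-pole low-pass in place of a designed filter; the alpha and threshold parameters are assumptions you would tune (roughly alpha ≈ 2*pi*fc/fs for a cutoff fc far below the sample rate fs), not values from this thread.
#include <math.h>

/* Marks each sample as inside (1) or outside (0) the envelope:
   rectify, one-pole low-pass, then threshold. */
void detect_envelope(const double *x, int *active, int n,
                     double alpha, double threshold)
{
    double env = 0.0;
    int i;
    for (i = 0; i < n; i++) {
        env += alpha * (fabs(x[i]) - env); /* steps 1 and 2 */
        active[i] = (env > threshold);     /* step 3 */
    }
}
The first 0-to-1 transition in active marks the start of the event and the last 1-to-0 transition its end; as the next answer notes, subtract the filter's group delay for a more accurate event time.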
You could use the same method as an old-fashioned analog meter: rectify the sample vector, pass the absolute value result through a low-pass filter (FIR, IIR, moving average, etc.), then compare against some threshold. For a more accurate event time, you will have to subtract the group delay of the low-pass filter.
Added: You might also need to remove DC beforehand (say with a high-pass filter or other DC blocker equivalent to capacitive coupling).
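For the DC removal, a standard first-order DC blocker is only a few lines; the pole value suggested in the comment below is a common default I am assuming, not something prescribed in this thread.
/* y[n] = x[n] - x[n-1] + R * y[n-1]; R close to 1.0 sets how low the
   blocked band reaches (0.995 is a typical choice at audio rates). */
double dc_block(double x, double *prev_x, double *prev_y, double R)
{
    double y = x - *prev_x + R * (*prev_y);
    *prev_x = x;
    *prev_y = y;
    return y;
}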
Source code of simple envelope detectors can be found in the Music-DSP Source Code Archive.
I have written an activity detector class in Java. It's part of my open-source Java DSP collection.
A first-order low-pass filter in C#:
double old_y = 0;

double R1Filter(double x, double rct)
{
    if (rct == 0.0)
        return 0;
    if (x > old_y)
        old_y = old_y - (old_y - x) * rct / 256;
    else
        old_y = old_y + (x - old_y) * rct / 256;
    return old_y;
}
When rct = 2, it works like this for the test signal
signal = (ucm + ucm * ma * cos(big_omega * x)) * (cos(small_omega1 * x) + cos(small_omega2 * x))
where ucm = 3, big_omega = 200, small_omega1 = 4, small_omega2 = 12, and ma = 0.8.
Note that the filter may change the phase of the baseband signal.
