I am currently learning pattern recognition. I have a 7 year background in programming, so, I think like a programmer.
The documentation on ANN's tell me nothing about what order everything is processed, or at least does not make it very clear. This is annoying as I don't know how to code the formulas.
I found a nice gif which I hope is correct. Can someone please give me a step by step process of a artificial neural network back propagation with for example 2 inputs, 1 hidden layer with 3 nodes, 2 outputs using the sigmoid.
Here is the gif.
As Emile said you go layer by layer from input to output and then you propagate error backwards again layer by layer.
From what you have said I expect that you are trying to make "object oriented" implementation where every neuron is object. But that is not exactly the fastest nor easiest way. The most usual implementation is done by Matrix operations where
every layer is described by single Matrix (every row contains weights of one neuron plus threshold)
this is matlab code should do the trick:
output_hidden = logsig( hidden_layer * [inputs ; 1] );
inputs is column vector of inputs to layer
hidden_layer is matrix of weights plus one row which describes thresholds in hidden layer
output_hidden is again column vector of outputs of all neurons in layer which can be used as input into next layer
logsig is function which do sigmoid transform on all members of vector one by one
[inputs ; 1] creates new vector with 1 at the end of column vector inputs it is here because you need "virtual input" for thresholds to be multiplied with.
if you will think about it you will see that matrix multiplication will do exactly summation over all inputs multiplied by weight to output, you will also see that it doesn't matter in what order you do all the things. in order to implement it in any other language just find yourself good linear-algebra library. Implementing back-propagation is a bit trickier and you will need to tho some matrix transpositions (e.g. flipping matrix by diagonal)
As you can see in the gif, processing is per layer. As there are no connections within a layer, the processing order within a layer does not matter. Using the ANN (classifying) is done from input layer through hidden layers to the output layer. Training (using backpropagation) is done from output layer back to input layer.
I am building a back-propagation neural network (with Encog library) that makes decisions depending on about 50 factor, I want help with its best design : We need 50 input neuron for sure, and 4 neurons as output to give the answer ( 4 digits) but I am not sure about the number of neurons in the hidden layer, how many is the best?
also I want to ask if back-propagation with segmoid activation function is the best for this kind of situations.
Thanks for advance.
Use a single hidden layer. A neural network with only one hidden layer has been shown to be a universal approximation. See: http://en.wikipedia.org/wiki/Universal_approximation_theorem
As to the number of hidden neurons. You will need to experiment. I would start with 50, and try 25-75.
Sigmoid is fine, if your expected output is between 0 and 1. You can also try tanh if you normalize to -1 to 1.
I have a simple task to classify people by their height and hair length to either MAN or WOMAN category using a neural network. Also teach it the pattern with some examples and then use it to classify on its own.
I have a basic understanding of neural networks but would really need some help here.
I know that each neuron divides the area to two subareas, basically that is why P = w0 + w1*x1 + w2*x2 + ... + wn*xn is being used here (weights are just moving the line if we consider geometric representation).
I do understand that each epoche should modify the weights to get closer to correct result, yet I have never program it and I am hopeless about how to start.
How should I proceed, meaning: How can I determine the threshold and how should I deal with the inputs?
It is not a homework rather than task for the ones who were interested. I am and I would like to understand it.
Looks like you are dealing with a simple Perceptron with a threshold activation function. Have a look at this question. Since you ARE using a bias neuron (w0), you would set the threshold to 0.
You then simply take the output of your network and compare it to 0, so you would e.g. output class 1 if x < 0 and class 2 if x > 0. You could model the case x=0 as "indistinct".
For learning the weights you need to apply the Delta Learning Rule which can be implemented very easily. But be careful: a perceptron with a simple threshold activation function can only be correct if your data are linearly separable. If you have more complex data you will need a Multilayer Perceptron and a nonlinear activation function like the Logistic Sigmoid Function.
Have a look at Geoffrey Hintons Coursera Course, Lecture 2 for details.
I've been working with machine learning lately (but I'm not an expert) but you should look at the Accord.NET framework. It contains all the common machine learning algorithme out of the box. So it's easy to take an existing samples and modify it instead of starting from scratch. Also, the developper of the framework is very helpful in the forum available on the same page.
With the available samples, you may also discover something better than neural network like the Kernel Support Vector Machine. If you stick to the neural network, have fun modifying all the different variables and by tryout and error you will understand how it work.
Have fun!
Since you said:
I know that each neuron divides the area to two subareas
&
weights are just moving the line if we consider geometric representation
I think you want to use perseptron or ADALINE neural networks. These neural networks can just classify linear separable patterns. since your input data is complicated, It's better to use a Multi layer Non-Linear Neural network. (my suggestion is a two layer neural network with tanh activation function) . For training these network you should use back propagation algorithm.
For answering to
how should I deal with the inputs?
I need to know more details about the inputs( Like: are they just height and hair length or there is more, what is their range and your resolution and etc.)
If you're dealing with just height and hair length I suggest that divide heights and length in some classes (for example 160cm-165cm, 165cm-170cm & etc.) and for each one of these classes set an On/Off input neuron. then put a hidden layer after all classes related to heights and another hidden layer after all classes related to hair length (tanh activation function). Number of neurons in these two hidden layer is determined based on number of training cases.
then take these two hidden layer output and send them to an aggregation layer with 1 output neuron.
Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 10 years ago.
Improve this question
I'm studying programming on my own and I would like to have an idea how to solve this problem.
I have been given the set of resistors with given resistances and a given value restot. I can pick a given number of those resistors. How can I make a circuit which resistance is as near as possible to restot? A programmer told me that that one can use genetic algorithms but I'm not limited to use such.
I guess I have to make a linear system of equations using Kirchoff's laws to make equations but as I don't have very much experience on electricity problems nor numerical algorithms to linear systems so I would like to have some guidance about how can I make those equations automatically to computers memory as the system changes all the time. And how can I make sure that the algorithm converges to a better solutions?
The problem is from a Finnish discussion forum.
Resistors can either exist in series or in parallel, and their resistances add up differently (add values for series, add reciprocals for parallel).
You can also have networks for resistors in series and parallel.
This sounds to me like a classic case of a recursive data structure, and you could probably represent it as a tree, in a similar way to a binary expression tree: http://en.wikipedia.org/wiki/Binary_expression_tree
Combine that some exploratory tree building (you should look into the way Prolog does this) and you can find the best combination of resistors that gets close to your total.
No genetic algorithms in this approach, although you could take a genetic approach to building and refining the tree.
To apply a genetic algorithm you would need to find a way to represent, mutate and combine the "DNA" of a resistor network.
One way would be to:
Add some number of 0 ohm resistors to your resister set (representing wires).
Number the resistors from 1 to N
For some M, imagine a set of M junctions including the source (1) and sink (M).
You could define which junctions the two endpoints of each resistor are connected to as the unique identifier of a network. This is just an N-tuple of integer pairs in the range 1..M. This tuple can be the "DNA".
Then:
Generate a bunch of random networks from random tuples.
Calculate each networks resistence
Discard some amount of the population farthest from the target resistence.
Combine random pairs of them to form new networks. (perhaps by randomly selecting each resistor endpoint from either parent A or parent B with 50% probability)
Randomly change a few endpoints (mutation).
Goto 2
Not sure if it would actually work exactly like this, but you get the general idea.
There is undoubtably a better non-genetic algorithm, but you specifically asked for a genetic one so there you go.
If you are not limited to genetic algorithm, then I think you can also solve this problem with help of linear programming. You can encode the problem as below and ask a solver to give the answer for you.
Required Resistance Of Circuit = x ohms
// We want to have total 33 resistors.
selected_in_series_1 + selected_in_series_2 +... + selected_in_series_211 + selected_in_parallel_1 + selected_in_parallel_2 + ... + selected_in_parallel_211 = 33
// Resistor in Series
(selected_in_series_1 * Resistor_1) + (selected_in_series_2 * Resistor_2) + ..(selected_in_series_211 * Resistor_211) = total_resistence_in series
// Similarly write formula for parallel
(selected_in_parallel_1 * 1/Resistor_1) + (selected_in_parallel_2 * 1/Resistor_2) + ..(selected_in_parallel_211 * 1/Resistor_211) = 1/total_resistence_in parallel
total_resistence_in series + total_resistence_in parallel = Required Resistance Of Circuit
I want to program a chess engine which learns to make good moves and win against other players. I've already coded a representation of the chess board and a function which outputs all possible moves. So I only need an evaluation function which says how good a given situation of the board is. Therefore, I would like to use an artificial neural network which should then evaluate a given position. The output should be a numerical value. The higher the value is, the better is the position for the white player.
My approach is to build a network of 385 neurons: There are six unique chess pieces and 64 fields on the board. So for every field we take 6 neurons (1 for every piece). If there is a white piece, the input value is 1. If there is a black piece, the value is -1. And if there is no piece of that sort on that field, the value is 0. In addition to that there should be 1 neuron for the player to move. If it is White's turn, the input value is 1 and if it's Black's turn, the value is -1.
I think that configuration of the neural network is quite good. But the main part is missing: How can I implement this neural network into a coding language (e.g. Delphi)? I think the weights for each neuron should be the same in the beginning. Depending on the result of a match, the weights should then be adjusted. But how? I think I should let 2 computer players (both using my engine) play against each other. If White wins, Black gets the feedback that its weights aren't good.
So it would be great if you could help me implementing the neural network into a coding language (best would be Delphi, otherwise pseudo-code). Thanks in advance!
In case somebody randomly finds this page. Given what we know now, what the OP proposes is almost certainly possible. In fact we managed to do it for a game with much larger state space - Go ( https://deepmind.com/research/case-studies/alphago-the-story-so-far ).
I don't see why you can't have a neural net for a static evaluator if you also do some classic mini-max lookahead with alpha-beta pruning. Lots of Chess engines use minimax with a braindead static evaluator that just adds up the pieces or something; it doesn't matter so much if you have enough levels of minimax. I don't know how much of an improvement the net would make but there's little to lose. Training it would be tricky though. I'd suggest using an engine that looks ahead many moves (and takes loads of CPU etc) to train the evaluator for an engine that looks ahead fewer moves. That way you end up with an engine that doesn't take as much CPU (hopefully).
Edit: I wrote the above in 2010, and now in 2020 Stockfish NNUE has done it. "The network is optimized and trained on the [classical Stockfish] evaluations of millions of positions at moderate search depth" and then used as a static evaluator, and in their initial tests they got an 80-elo improvement when using this static evaluator instead of their previous one (or, equivalently, the same elo with a little less CPU time). So yes it does work, and you don't even have to train the network at high search depth as I originally suggested: moderate search depth is enough, but the key is to use many millions of positions.
Been there, done that. Since there is no continuity in your problem (the value of a position is not closely related to an other position with only 1 change in the value of one input), there is very little chance a NN would work. And it never did in my experiments.
I would rather see a simulated annealing system with an ad-hoc heuristic (of which there are plenty out there) to evaluate the value of the position...
However, if you are set on using a NN, is is relatively easy to represent. A general NN is simply a graph, with each node being a neuron. Each neuron has a current activation value, and a transition formula to compute the next activation value, based on input values, i.e. activation values of all the nodes that have a link to it.
A more classical NN, that is with an input layer, an output layer, identical neurons for each layer, and no time-dependency, can thus be represented by an array of input nodes, an array of output nodes, and a linked graph of nodes connecting those. Each node possesses a current activation value, and a list of nodes it forwards to. Computing the output value is simply setting the activations of the input neurons to the input values, and iterating through each subsequent layer in turn, computing the activation values from the previous layer using the transition formula. When you have reached the last (output) layer, you have your result.
It is possible, but not trivial by any means.
https://erikbern.com/2014/11/29/deep-learning-for-chess/
To train his evaluation function, he utilized a lot of computing power to do so.
To summarize generally, you could go about it as follows. Your evaluation function is a feedforward NN. Let the matrix computations lead to a scalar output valuing how good the move is. The input vector for the network is the board state represented by all the pieces on the board so say white pawn is 1, white knight is 2... and empty space is 0. An example board state input vector is simply a sequence of 0-12's. This evaluation can be trained using grandmaster games (available at a fics database for example) for many games, minimizing loss between what the current parameters say is the highest valuation and what move the grandmasters made (which should have the highest valuation). This of course assumes that the grandmaster moves are correct and optimal.
What you need to train a ANN is either something like backpropagation learning or some form of a genetic algorithm. But chess is such an complex game that it is unlikly that a simple ANN will learn to play it - even more if the learning process is unsupervised.
Further, your question does not say anything about the number of layers. You want to use 385 input neurons to encode the current situation. But how do you want to decide what to do? On neuron per field? Highest excitation wins? But there is often more than one possible move.
Further you will need several hidden layers - the functions that can be represented with an input and an output layer without hidden layer are really limited.
So I do not want to prevent you from trying it, but chances for a successful implemenation and training within say one year or so a practically zero.
I tried to build and train an ANN to play Tic-tac-toe when I was 16 years or so ... and I failed. I would suggest to try such an simple game first.
The main problem I see here is one of training. You say you want your ANN to take the current board position and evaluate how good it is for a player. (I assume you will take every possible move for a player, apply it to the current board state, evaluate via the ANN and then take the one with the highest output - ie: hill climbing)
Your options as I see them are:
Develop some heuristic function to evaluate the board state and train the network off that. But that begs the question of why use an ANN at all, when you could just use your heuristic.
Use some statistical measure such as "How many games were won by white or black from this board configuration?", which would give you a fitness value between white or black. The difficulty with that is the amount of training data required for the size of your problem space.
With the second option you could always feed it board sequences from grandmaster games and hope there is enough coverage for the ANN to develop a solution.
Due to the complexity of the problem I'd want to throw the largest network (ie: lots of internal nodes) at it as I could without slowing down the training too much.
Your input algorithm is sound - all positions, all pieces, and both players are accounted for. You may need an input layer for every past state of the gameboard, so that past events are used as input again.
The output layer should (in some form) give the piece to move, and the location to move to.
Write a genetic algorithm using a connectome which contains all neuron weights and synapse strengths, and begin multiple separated gene pools with a large number of connectomes in each.
Make them play one another, keep the best handful, crossover and mutate the best connectomes to repopulate the pool.
Read blondie24 : http://www.amazon.co.uk/Blondie24-Playing-Kaufmann-Artificial-Intelligence/dp/1558607838.
It deals with checkers instead of chess but the principles are the same.
Came here to say what Silas said. Using a minimax algorithm, you can expect to be able to look ahead N moves. Using Alpha-beta pruning, you can expand that to theoretically 2*N moves, but more realistically 3*N/4 moves. Neural networks are really appropriate here.
Perhaps though a genetic algorithm could be used.