I would like to build a Connect 4 engine which works using an artificial neural network - just because I'm fascinated by ANNs.
I've created the following draft of the ANN structure. Would it work? And are these connections right (even the cross ones)?
Could you help me draft a UML class diagram for this ANN?
I want to give the board representation to the ANN as its input, and the output should be the move to choose.
The learning should later be done using reinforcement learning, and the sigmoid function should be applied. The engine will play against human players, and depending on the result of the game, the weights should then be adjusted.
What I'm looking for ...
... is mainly help with coding issues. The further the answer moves away from abstract thinking toward concrete code, the better.
Below is how I organized my design and code when I was messing with neural networks. The code here is (obviously) pseudocode and roughly follows object-oriented conventions.
Starting from the bottom up, you'll have your neuron. Each neuron needs to be able to hold the weights it puts on the incoming connections, a buffer to hold the incoming connection data, and a list of its outgoing edges. Each neuron needs to be able to do three things:
A way to accept data from an incoming edge
A method of processing the input data and weights to formulate the value this neuron will be sending out
A way of sending out this neuron's value on the outgoing edges
Code-wise this translates to:
// Each neuron needs to keep track of this data
float in_data[]; // Values sent to this neuron
float weights[]; // The weights on each edge
float value; // The value this neuron will be sending out
Neuron out_edges[]; // Each Neuron that this neuron should send data to
// Each neuron should expose this functionality
void accept_data( float data ) {
in_data.append(data); // Add the data to the incoming data buffer
}
void process() {
value = /* result of combining weights and incoming data here */;
}
void send_value() {
foreach ( neuron in out_edges ) {
neuron.accept_data( value );
}
}
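For example, a minimal sketch of process() under the assumptions from the question (a weighted sum squashed by a sigmoid; the exp() call and the buffer clearing are illustrative details, not part of the original draft):

void process() {
    float sum = 0.0;
    for ( int i = 0; i < in_data.length; i++ ) {
        sum += weights[i] * in_data[i];   // weighted sum of the buffered inputs
    }
    value = 1.0 / (1.0 + exp(-sum));      // sigmoid activation, as mentioned in the question
    in_data.clear();                      // empty the buffer for the next pass
}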
Next, I found it easiest if you make a Layer class which holds a list of neurons. (It's quite possible to skip this class and just have your NeuralNetwork hold a list of lists of neurons. I found it easier organizationally and debugging-wise to have a Layer class.) Each layer should expose the ability to:
Cause each neuron to 'fire'
Return the raw array of neurons that this Layer wraps around. (This is useful when you need to do things like manually filling in input data in the first layer of a neural network.)
Code-wise this translates to:
//Each layer needs to keep track of this data.
Neuron[] neurons;
//Each layer should expose this functionality.
void fire() {
    foreach ( neuron in neurons ) {
        neuron.process();      // compute this neuron's value from its buffered inputs and weights
        neuron.send_value();   // push that value out to the neurons in out_edges
    }
}
Neuron[] get_neurons() {
return neurons;
}
Finally, you have a NeuralNetwork class that holds a list of layers, a way of setting up the first layer with initial data, a learning algorithm, and a way to run the whole neural network. In my implementation, I collected the final output data by adding a fourth layer consisting of a single fake neuron that simply buffered all of its incoming data and returned that.
// Each neural network needs to keep track of this data.
Layer[] layers;
// Each neural network should expose this functionality
void initialize( float[] input_data ) {
foreach ( neuron in layers[0].get_neurons() ) {
// do setup work here
}
}
void learn() {
foreach ( layer in layers ) {
foreach ( neuron in layer ) {
/* compare the neuron's computed value to the value it
* should have generated and adjust the weights accordingly
*/
}
}
}
void run() {
foreach (layer in layers) {
layer.fire();
}
}
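As a hedged sketch of how this could be wired up for the Connect 4 case from the question (one input neuron per board cell, one output neuron per column; encode_board, get_output, and the argmax helper are hypothetical, not part of the pseudocode above):

// 6x7 board -> 42 input values, 7 output values (one per column)
float[] inputs = encode_board( board );          // e.g. +1 own piece, -1 opponent, 0 empty
network.initialize( inputs );
network.run();
float[] scores = network.get_output();           // hypothetical accessor for the last layer's values
int move = argmax_of_legal_columns( scores );    // play the column with the highest score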
I recommend starting with backpropagation as your learning algorithm, as it's supposedly the easiest to implement. When I was working on this, I had great difficulty trying to find a very simple explanation of the algorithm, but my notes list this site as being a good reference.
I hope that's enough to get you started!
There are a lot of different ways to implement neural networks that range from simple/easy-to-understand to highly-optimized. The Wikipedia article on backpropagation that you linked to has links to implementations in C++, C#, Java, etc. which could serve as good references, if you're interested in seeing how other people have done it.
One simple architecture would model both nodes and connections as separate entities; nodes would have possible incoming and outgoing connections to other nodes as well as activation levels and error values, whereas connections would have weight values.
Alternatively, there are more efficient ways to represent those nodes and connections -- as arrays of floating point values organized by layer, for example. This makes things a bit trickier to code, but avoids creating so many objects and pointers to objects.
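As a minimal sketch of that array-based alternative (a single layer's forward pass in plain C++; the sigmoid and the weight layout are assumptions, not a specific library's API):

#include <cmath>

// weights[j * n_in + i] is the weight from input i to output neuron j of this layer
void forward(const float* in, int n_in, float* out, int n_out,
             const float* weights, const float* bias) {
    for (int j = 0; j < n_out; ++j) {
        float sum = bias[j];
        for (int i = 0; i < n_in; ++i)
            sum += weights[j * n_in + i] * in[i];
        out[j] = 1.0f / (1.0f + std::exp(-sum));  // sigmoid activation
    }
}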
One note: often people will include a bias node -- in addition to the normal input nodes -- that provides a constant value to every hidden and output node.
I've implemented neural networks before, and see a few problems with your proposed architecture:
A typical multi-layer network has connections from every input node to every hidden node, and from every hidden node to every output node. This allows information from all of the inputs to be combined and contribute to each output. If you dedicate 4 hidden nodes to each input, then you will lose some of the network's power to identify relationships between the inputs and outputs.
How will you come up with values to train the network? Your network creates a mapping between board positions and the optimal next move, so you need a set of training examples that provide this. End game moves are easy to identify, but how do you tell that a mid-game move is "optimal"? (Reinforcement learning can help out here)
One last suggestion is to use bipolar inputs (-1 for false, +1 for true) since this can speed up learning. And Nate Kohl makes a good point: every hidden & output node will benefit from having a bias connection (think of it as another input node with a fixed value of "1").
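A tiny illustrative snippet combining both suggestions (bipolar encoding plus a constant bias input; feature_is_true is a hypothetical predicate standing in for whatever your board encoding is):

// n feature inputs encoded as -1/+1, plus one constant bias input
for (int i = 0; i < n; ++i)
    inputs[i] = feature_is_true(i) ? +1.0f : -1.0f;  // bipolar instead of 0/1
inputs[n] = 1.0f;                                    // bias input, always 1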
Your design will be highly dependent on the specific type of reinforcement learning that you plan to use.
The simplest solution would be to use backpropagation. This is done by feeding the error back into the network (in reverse fashion) and using the derivative of the (sigmoid) activation function to determine the adjustment to each weight. After a number of iterations, the weights will automatically get adjusted to fit the input.
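A minimal sketch of that weight update for the output layer (the standard delta rule with the sigmoid's derivative; variable names are illustrative):

// error for one output neuron, scaled by the sigmoid derivative output*(1-output)
float delta = (target - output) * output * (1.0f - output);
for (int i = 0; i < n_inputs; ++i)
    weights[i] += learning_rate * delta * input[i];   // adjust each incoming weight
bias += learning_rate * delta;                        // bias treated as a weight on a constant 1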
Genetic algorithms are an alternative to backpropagation which can yield better results (although they are a bit slower). This is done by treating the weights as a schema that can easily be inserted and removed. The schema is replaced with a mutated version (using principles of natural selection) several times until a fit is found.
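As a rough sketch of that idea (the network's flattened weight vector treated as the genome; crossover, rand01, and the scoring function are hypothetical helpers):

// one candidate = the network's full weight vector
float[] child = crossover( parent_a.weights, parent_b.weights );  // splice two fit candidates
for (int i = 0; i < child.length; ++i)
    if ( rand01() < mutation_rate )
        child[i] += small_random_noise();        // mutate a few weights
float fitness = play_games_and_score( child );   // evaluate by letting the network play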
As you can see, the implementation for each of these would be drastically different. You could try to make the network generic enough to adapt to each type of implementation, but that may overcomplicate it. Once you are in production, you will usually only have one form of training (or ideally your network would already be trained).
Related
I have fairly large files with 3D scan points (around 200,000) and I am trying to make a TopoDS_Shape with BRepOffsetAPI_Sewing.
gp_Pnt p1(0,0,100);
...
TopoDS_Edge e1 = BRepBuilderAPI_MakeEdge( p4, p1);
...
TopoDS_Wire w1 = BRepBuilderAPI_MakeWire(e1, e2, e3, e4);
...
TopoDS_Face f1 = BRepBuilderAPI_MakeFace(w1);
...
BRepOffsetAPI_Sewing sew(0.1);
sew.Add(f1);
sew.Perform();
TopoDS_Shape sewedShape = sew.SewedShape();
Of course, this is all done with the points in loops etc.; the above code is just a sample of how I try to create things.
With 200,000 points it takes 20-30 seconds to produce the face.
My next approach was to save the produced shape after it was generated and load it later as a workaround.
BRepTools::Write(sewedShape, sFile);
But even that is slow.
I did similar things in Java3D and it was way faster, so I must be doing something wrong.
Only showing the points with
Handle_Graphic3d_ArrayOfPoints points3d = new Graphic3d_ArrayOfPoints(totPoints, true, false);
gp_Pnt pnt(x, y, z);
points3d->AddVertex(pnt, aColor); // adding 200,000 points
Handle(AIS_PointCloud) m_points = new AIS_PointCloud();
m_points->SetPoints(points3d);
m_occView->getContext()->Display(m_points, true);
is almost instant (less than a second).
My goal is to build 2 of those faces and find the intersection with OCBRepAlgoAPI_Section.
Thanks for help in advance!
As far as I understand, your current approach is:
Create a TopoDS_Face per quad in Point Cloud.
To avoid unnecessary overhead in the sewing operation, you would need to reconsider your workflow and create shared shapes from scratch. E.g., instead of creating a TopoDS_Vertex for the same point in the point cloud multiple times, you should create a single one and reuse it in the construction of connected edges / quads; the same applies to TopoDS_Edge construction. What the sewing operation does for you is find and repair shared information between geometrically connected faces, which is plenty of work that could be entirely avoided.
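A rough sketch of what "shared from the start" could look like (the quad indexing and error handling are omitted; points is assumed to be your array of gp_Pnt):

// build each vertex exactly once, indexed the same way as the point cloud
std::vector<TopoDS_Vertex> verts(points.size());
for (size_t i = 0; i < points.size(); ++i)
    verts[i] = BRepBuilderAPI_MakeVertex(points[i]);

// neighbouring quads now reference the same TopoDS_Vertex objects, so an edge
// shared by two quads can also be built once and reused in both wires
TopoDS_Edge e01 = BRepBuilderAPI_MakeEdge(verts[i0], verts[i1]);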
But as has already been pointed out to you (by trying to dump the produced shape into a file), mapping a tessellation to B-Rep is an inefficient approach in general. Just take a look at all these TopoDS_Vertex, TopoDS_Edge, TopoDS_Wire, TopoDS_Face to see how many more data structures are needed in B-Rep to map a single triangle or quad. This structure is heavy not only from a memory utilization point of view, but also for the algorithms you might want to run on it, like Boolean operations.
Possible alternatives:
Create a Poly_Triangulation from your point cloud and a single TopoDS_Face from it (see the sketch after this list). You would be able to efficiently visualize it in the 3D Viewer and perform some operations like computing surface area. Unfortunately, such a geometry definition is not yet supported by all OCCT algorithms, so you wouldn't be able to perform Boolean operations.
Create an approximated B-Spline surface from your point cloud. This could be done with the help of GeomPlate or SSP (Surface from Scattered Points) algorithms. An approximated surface would be a better fit to the B-Rep geometry definition, but it might lose some details of the original surface and might be tricky to apply (you might need to split a complex surface into several pieces).
Use the OMF product (Mesh Framework) to perform Boolean operations on meshes. If Boolean operations on meshes are all you need, OMF could be helpful.
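For the first alternative, a hedged sketch of a mesh-only face (the calls follow classic OCCT; exact constructor signatures may differ slightly between OCCT versions):

TColgp_Array1OfPnt nodes(1, nbPoints);
Poly_Array1OfTriangle triangles(1, nbTriangles);
// ... fill nodes(i) with gp_Pnt and triangles(j) with Poly_Triangle(n1, n2, n3) ...
Handle(Poly_Triangulation) mesh = new Poly_Triangulation(nodes, triangles);

TopoDS_Face face;
BRep_Builder builder;
builder.MakeFace(face, mesh);   // the face carries only the triangulation, no analytic surface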
Using the Windows API, I want to implement something like the following:
i.e., getting the current microphone input level.
I am not allowed to use external audio libraries, but I can use Windows libraries. So I tried using waveIn functions, but I do not know how to process audio input data in real time.
This is the method I am currently using:
Record for 100 milliseconds
Select highest value from the recorded data buffer
Repeat forever
But I think this is way too hacky, and not a recommended way. How can I do this properly?
Having built a tuning wizard for a very dated, but well known, A/V conferencing application, I can say that what you describe is nearly identical to what I did.
A few considerations:
Enqueue 5 to 10 of those 100 ms buffers into the audio device via waveInAddBuffer. IIRC, when the waveIn queue goes empty, weird things happen. Then, as the waveInProc callbacks occur, search for the sample with the highest absolute value in the completed buffer, as you describe. Then plot that onto your visualization. Requeue the completed buffers. (A condensed sketch of this setup is at the end of this answer.)
It might seem obvious to map the sample value onto your visualization linearly, as follows.
For example, to plot a 16-bit sample
// convert sample magnitude from 0..32768 to 0..N
length = (sample * N) / 32768;
DrawLine(length);
But then, when you speak into the microphone, that visualization won't seem as "active" or "vibrant".
A better approach is to give more weight to the lower-energy samples. An easy way to do this is to replot along the μ-law curve (or use a table lookup):
length = (sample * N) / 32768;
length = (N * log(1 + length)) / log(1 + N); // compress along a log curve
length = min(length, N); // clamp to the meter length
DrawLine(length);
You can tweak the above approach to whatever looks good.
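A condensed sketch of the buffer-queue part described above (standard waveIn calls; the format, buffer count, and the drawing code are assumptions, and error checking is omitted):

// 16-bit mono PCM at 44.1 kHz: 4410 samples per buffer is roughly 100 ms
WAVEFORMATEX fmt = { WAVE_FORMAT_PCM, 1, 44100, 88200, 2, 16, 0 };
static short   buffers[8][4410];
static WAVEHDR hdr[8] = {};

HWAVEIN hIn;
waveInOpen(&hIn, WAVE_MAPPER, &fmt, (DWORD_PTR)waveInProc, 0, CALLBACK_FUNCTION);

// keep several buffers queued so the device never runs dry
for (int i = 0; i < 8; ++i) {
    hdr[i].lpData = (LPSTR)buffers[i];
    hdr[i].dwBufferLength = sizeof(buffers[i]);
    waveInPrepareHeader(hIn, &hdr[i], sizeof(WAVEHDR));
    waveInAddBuffer(hIn, &hdr[i], sizeof(WAVEHDR));
}
waveInStart(hIn);

// in waveInProc, on WIM_DATA: scan the finished buffer for the largest absolute
// sample, plot it, then call waveInAddBuffer on the same header to requeue it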
Instead of computing the values yourself, you can rely on values from Windows. These are actually the values displayed in your screenshot of the Windows Settings.
See the following sample for the IAudioMeterInformation interface:
https://learn.microsoft.com/en-us/windows/win32/coreaudio/peak-meters.
It is written for playback, but you can use it for capture as well.
One remark: if you open the IAudioMeterInformation for a microphone but no application has opened a stream from this microphone, then the level will be 0.
This means that if you want to display your microphone peak meter, you will need to open a microphone stream, as you already do.
Also read the documentation about IAudioMeterInformation; it may not be what you need, as it reports the peak value. It depends on what you want to do with it.
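A hedged sketch of reading the capture endpoint's peak with IAudioMeterInformation (COM initialization and error handling omitted; the calls mirror the linked peak-meter sample):

// get the default capture endpoint and its meter interface
IMMDeviceEnumerator* pEnum = nullptr;
CoCreateInstance(__uuidof(MMDeviceEnumerator), nullptr, CLSCTX_ALL,
                 __uuidof(IMMDeviceEnumerator), (void**)&pEnum);

IMMDevice* pDevice = nullptr;
pEnum->GetDefaultAudioEndpoint(eCapture, eConsole, &pDevice);

IAudioMeterInformation* pMeter = nullptr;
pDevice->Activate(__uuidof(IAudioMeterInformation), CLSCTX_ALL, nullptr, (void**)&pMeter);

float peak = 0.0f;
pMeter->GetPeakValue(&peak);   // 0.0 .. 1.0, stays at 0 unless some stream is open on the device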
Hello neural enthusiasts out there,
I am a little bit confused about the SOM learning algorithm in AForge.
I figured out that the implementation assumes the most common case, a 2-dimensional SOM.
When I look at other SOM graphics on the web, it appears that the positions of the neurons change over time, with similar neurons being placed together.
I took a look at the source code and found that the position of the neurons in the map seems to be fixed. It is:
int wx = neuronIndex % width;
int wy = neuronIndex / width;
Is this just another type of SOM with fixed positions, or am I misinterpreting something?
I also thought that the main thing you want to get out of a SOM is an informational graphic,
but there are no publicly available methods to retrieve the position of a neuron.
Not familiar with AForge, but....
EDIT: At first I was thinking that the weights are 2D and taught to resemble a grid, but this is a more educated guess: the moving grid of neurons you've seen is still not the SOM's nodes. The position of a SOM node is constant. The SOM is taught to abstract some dataset, and a Sammon's mapping is likely used as a visualization method for the nodes' weights. The result is something like this and can probably be confused with the original SOM's lattice, in which the nodes or "neurons" never move.
Note that this is still just an educated guess.
I have some code for a single layer neural network:
class network {
var outputs;
var weights;
var biases;
feedforward(inputs) {
}
outputFunction(number) {
}
}
The output function is a sigmoid (so returns a number between 0 and 1). The inputs are an array of 1s and 0s.
I added a hidden layer by adding outputs2, weights2, biases2, and then doing:
feedforward2(inputs) {
use weights2, biases2, etc.
}
feedforward(inputs) {
inputs = feedforward2(inputs)
....
}
I figured that the inputs of the output nodes are now the outputs of my hidden layer, so it should have at least similar performance. Yet performance has drastically decreased after training the network again. Any ideas? Training does not backpropagate to the hidden layer yet; it just updates the weights of the output layer, and the hidden-layer weights always stay the same.
If the hidden layer weights are random and fixed, then all they do is distort the signal.
Training multilayer networks is difficult. The vast majority of them have only a single hidden layer, with the exceptions of convolutional networks and some recent work on deep belief networks.
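If you do want to train the hidden layer as well, a minimal sketch of the extra backpropagation step (pseudocode in the style of the question, where weights2/biases2 belong to the hidden layer; names are illustrative):

// output layer delta (sigmoid derivative), as you already compute for the output weights
delta_out[k] = (target[k] - output[k]) * output[k] * (1 - output[k]);

// hidden layer delta: each hidden neuron's error is the weighted sum of the
// output deltas it feeds, times its own sigmoid derivative
delta_hidden[j] = hidden[j] * (1 - hidden[j]) * sum_over_k( delta_out[k] * weights[k][j] );

// then update the hidden layer using its deltas and the raw inputs
weights2[j][i] += learning_rate * delta_hidden[j] * inputs[i];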
I recently asked this question:
I am looking for an algorithm to detect pitch. One of the answers suggested that I use an initial FFT to get the basic frequency response, figure out which frequencies are getting voiced, and follow it up with a band-pass filter in each area of interest:
A slightly advanced algorithm could do something like this:
Roughly detect pitch frequency (could be done with DFT).
Bandpass the signal to isolate the pitch frequency.
Count the number of samples between two peaks in the filtered signals.
Now I can do the first step okay (I am coding for iOS, and Apple has a framework, the Accelerate framework, for doing FFTs, etc.).
I have made a start here, but I can see the problem: an FFT that would differentiate all of the possible notes one could sing would require a lot of samples, and I don't want to perform too much unnecessary computation as I'm targeting a mobile device.
So I'm trying to get my head around the answer above, but I don't understand how I could apply the concept of a band-pass filter in code.
Can anyone help?
Filter design is pretty complex. There are many techniques. First you have to decide what kind of filter you want to create. Finite impulse response (FIR)? Infinite impulse response (IIR)? Then you select an algorithm for designing a filter of that type. The Remez algorithm is often used for FIR filter design. Go here to see the complexity that I was referring to: http://en.wikipedia.org/wiki/Remez_algorithm
Your best bet for creating a filter is to use an existing signal processing library. A quick Google search led me here: http://spuc.sourceforge.net/
Given what your application is, you may want to read about matched filters. I am not sure if they are relevant here, but they might be. http://en.wikipedia.org/wiki/Matched_filter
On Wikipedia, check the low-pass filter and high-pass filter articles, then join them to make a band-pass filter. Wikipedia has code implementations for those two filters.
http://en.wikipedia.org/wiki/Low-pass_filter
http://en.wikipedia.org/wiki/High-pass_filter
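For illustration, the one-pole filters those articles describe can be chained like this (a very rough band-pass, not a replacement for proper filter design; the alpha values depend on your sample rate and cutoff frequencies):

// simple exponential (RC-style) low-pass followed by a high-pass, applied per sample
static float lp = 0.0f, lp_prev = 0.0f, hp = 0.0f;

float bandpass_step(float x, float a_lp, float a_hp) {
    lp += a_lp * (x - lp);              // low-pass: smooth out frequencies above the upper cutoff
    hp = a_hp * (hp + lp - lp_prev);    // high-pass: remove frequencies below the lower cutoff
    lp_prev = lp;
    return hp;
}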
Since you only want to detect a single frequency, it would be overkill to perform a DFT and then only use one of the values.
You could implement the Goertzel algorithm instead. Here is a C implementation used to detect DTMF tones over a phone line, from the FreePBX source code:
float goertzel(short x[], int nmax, float coeff) {
float s, power;
float sprev, sprev2;
int n;
sprev = 0;
sprev2 = 0;
for(n=0; n<nmax; n++) {
s = x[n] + coeff * sprev - sprev2;
sprev2 = sprev;
sprev = s;
}
power = sprev2*sprev2 + sprev*sprev - coeff*sprev*sprev2;
return power;
}
As you can see, the implementation is fairly trivial and quite effective for single frequencies. Check the link for different versions with and without floating point, and how to use it.
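For reference, the coefficient is usually precomputed from the target frequency and the sample rate, roughly like this (a hedged usage sketch; the threshold is arbitrary):

// how much energy is present around target_hz in a block of nmax samples
float target_hz   = 440.0f;   /* e.g. A4 */
float sample_rate = 8000.0f;
float coeff = 2.0f * cosf(2.0f * 3.14159265f * target_hz / sample_rate);

float power = goertzel(samples, nmax, coeff);
if (power > threshold) { /* tone present */ }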