I have a good grounding in Evolutionary Algorithms, so I've now started to read about Artificial Neural Networks. I came across this tutorial at
http://www.ai-junkie.com/ann/evolved/nnt2.html,
which shows how to use an ANN to evolve tanks that collect mines. It uses a GA to evolve the input weights on each neuron.
I know I could use a GA (without the ANN) to solve the same problem. I already created a Tetris bot using only a GA to optimize the weights in the grid evaluation function (see my blog: http://www.bitsrandomicos.blogspot.com.br/).
My question is: what's the conceptual/practical difference between using an ANN + GA in a situation where I could use a GA alone? I mean, is my Tetris bot an ANN? (I don't think so.)
There are several related questions about this, but I couldn't find an answer:
Are evolutionary algorithms and neural networks used in the same domains?
When to use Genetic Algorithms vs. when to use Neural Networks?
Thanks!
A genetic algorithm is an optimization algorithm.
An artificial neural network is a function approximator. In order to approximate a function you need an optimization algorithm to adjust the weights. An ANN can be used for supervised learning (classification, regression) or reinforcement learning and some can even be used for unsupervised learning.
In supervised learning, a derivative-free optimization algorithm like a genetic algorithm is slower than most of the optimization algorithms that use gradient information. Thus, it only makes sense to evolve neural networks with genetic algorithms in reinforcement learning. This is known as "neuroevolution". The advantage of neural networks like multilayer perceptrons in this setup is that they can approximate any function with arbitrary precision when they have a sufficient number of hidden nodes.
When you create a Tetris bot you do not necessarily have to use an ANN as the function approximator, but you need some kind of function approximator to represent your bot's policy. I guess yours was just simpler than an ANN. But when you want to represent a complex nonlinear policy, you could do that e.g. with an ANN.
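For concreteness, here is a minimal neuroevolution sketch in Python/NumPy: a GA evolves the 17 weights of a tiny 2-4-1 tanh network to fit XOR. The network size, population size, and mutation scale are arbitrary choices for illustration; a tank/mine controller like the one in the tutorial would simply plug in a different fitness function.

    import numpy as np

    # GA evolving the weights of a tiny 2-4-1 tanh network to fit XOR.
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([0, 1, 1, 0], dtype=float)

    N_WEIGHTS = 2 * 4 + 4 + 4 * 1 + 1   # weights + biases of the 2-4-1 net

    def forward(w, x):
        W1, b1 = w[:8].reshape(2, 4), w[8:12]
        W2, b2 = w[12:16].reshape(4, 1), w[16]
        h = np.tanh(x @ W1 + b1)
        return np.tanh(h @ W2 + b2).ravel()

    def fitness(w):
        return -np.mean((forward(w, X) - y) ** 2)   # higher is better

    rng = np.random.default_rng(0)
    pop = rng.normal(size=(50, N_WEIGHTS))
    for generation in range(300):
        scores = np.array([fitness(ind) for ind in pop])
        parents = pop[np.argsort(scores)[-10:]]                  # keep the best 10
        children = parents[rng.integers(0, 10, size=40)]         # clone parents
        children += rng.normal(scale=0.1, size=children.shape)   # mutate
        pop = np.vstack([parents, children])

    best = pop[np.argmax([fitness(ind) for ind in pop])]
    print(forward(best, X))   # should approach [0, 1, 1, 0]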
alfa's answer is perfect. Here is just an illustration of what he said:

    Meta-Optimizer = None (but could be)
    Optimizer      = Genetic Algorithm
    Problem        = Tetris Bot (e.g. ANN)
You use an evolutionary algorithm if you don't yet know the answer but are able to somehow rate candidates and provide meaningful mutations.
A neural network is great if you already have answers (and inputs) and you want to "train the computer" so it can "guess" the answers for unknown inputs. Also, you don't have to think a lot about the problem; the network will figure it out by itself.
Check this "game AI" example: https://synaptic.juancazala.com/#/
(note how simple it is: all you have to do is give them enough training; you don't have to know a thing about game AI, and once it is good enough all you have to do is "download" the memory and run it when needed)
I'm not an expert, but based on what I know from the field...
An artificial neural network is ultimately based on neuroscience. It attempts to simulate/model the brain's behavior by building neuron-like structures in the algorithm. There is a stronger emphasis on the academic nature of the problem than on the results. From what I understand, it's for this reason that ANNs are not very popular from an engineering standpoint; statistically based machine learning methods (HMMs and Bayesian networks) often produce better results.
In short, as long as it has a nod towards some underlying neuroscience subject, it can be an ANN, even if it uses some form of GA.
And if you use a GA, it is not necessarily an ANN.
Compared to mechanical engineering, computer engineering, or software engineering, how does the mathematics compare? What mathematics should I start focusing on learning now, or expect to learn, if I want to become a researcher in the field or an industry expert? I am currently a senior in high school who is considering AI. Math doesn't scare me.
One of the most important goals in AI is to make computers act (and think!) like humans. For this purpose computers must learn models from observations (data) and act based on those models. This learning and prediction needs a deep understanding of probability theory, statistics, and stochastic processes as fundamental tools.
Today, probability and statistics are considered general mathematics like calculus, and all undergraduate students are familiar with them, but you need to master them if your research field is AI.
I would look into the following:
Probability - Bayesian Theory
Statistics - Data Interpretation, Graph Plotting, Graph Error Handling
Stochastic Theory
Entropy Theory (for finding degree of errant data)
Matrices and their computational formulae; stochastic matrices in particular (see the small sketch after this list)
Since AI uses a lot of trees and graphs, a look into state space search and heuristic calculation would be quite useful..
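As promised above, here is a tiny NumPy sketch of a (row-)stochastic matrix in action; the numbers are made up, but it shows why these matrices come up constantly in probabilistic AI.

    import numpy as np

    # Each row of a (row-)stochastic matrix sums to 1 and is a probability
    # distribution; repeated multiplication evolves a Markov chain.
    P = np.array([[0.9, 0.1],    # P[i, j] = probability of moving i -> j
                  [0.5, 0.5]])
    state = np.array([1.0, 0.0]) # start in state 0 with certainty

    for _ in range(10):
        state = state @ P        # one step of the chain
    print(state)                 # approaches the stationary distribution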
I am new to machine learning. I am familiar with SVMs, neural networks and GAs. I'd like to know the best technique to learn for classifying pictures and audio. SVM does a decent job but takes a lot of time. Does anyone know a faster and better one? I'd also like to know the fastest library for SVM.
Your question is a good one, and has to do with the state of the art of classification algorithms. As you say, the choice of classifier depends on your data. In the case of images, I can tell you that there is a method called AdaBoost; read this and this to know more about it. On the other hand, lots of people are doing research on it; for example, in Gender Classification of Faces Using Adaboost [Rodrigo Verschae, Javier Ruiz-del-Solar and Mauricio Correa] they say:
"Adaboost-mLBP outperforms all other Adaboost-based methods, as well as baseline methods (SVM, PCA and PCA+SVM)"
Take a look at it.
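If you want to try AdaBoost quickly, here is a minimal sketch using scikit-learn; the synthetic data just stands in for whatever features you would extract from your images.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.model_selection import train_test_split

    # Synthetic data standing in for features extracted from face images.
    X, y = make_classification(n_samples=1000, n_features=50, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

    clf = AdaBoostClassifier(n_estimators=200)
    clf.fit(X_tr, y_tr)
    print("test accuracy:", clf.score(X_te, y_te))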
If your main concern is speed, you should probably take a look at VW and generally at stochastic gradient descent based algorithms for training SVMs.
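For example, here is a rough sketch of training a linear SVM by stochastic gradient descent with scikit-learn's SGDClassifier (hinge loss is exactly the linear-SVM objective); the data is synthetic and the hyperparameters are purely illustrative.

    from sklearn.datasets import make_classification
    from sklearn.linear_model import SGDClassifier

    X, y = make_classification(n_samples=10_000, n_features=50, random_state=0)
    # loss="hinge" makes SGDClassifier optimize a linear SVM objective.
    clf = SGDClassifier(loss="hinge", alpha=1e-4, max_iter=20)
    clf.fit(X, y)
    print("training accuracy:", clf.score(X, y))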
If the number of features is large in comparison to the number of training examples, then you should go for logistic regression or an SVM without a kernel.
If the number of features is small and the number of training examples is intermediate, then you should use an SVM with a Gaussian kernel.
If the number of features is small and the number of training examples is large, then use logistic regression or an SVM without a kernel.
That's according to the Stanford ML class.
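Here is that heuristic as a small scikit-learn sketch; the thresholds are invented for illustration, since the ML class gives no exact cutoffs.

    from sklearn.linear_model import LogisticRegression
    from sklearn.svm import SVC

    def pick_classifier(n_samples, n_features):
        if n_features >= n_samples:      # many features, few examples
            return LogisticRegression()  # or a linear SVM (LinearSVC)
        if n_samples <= 10_000:          # intermediate number of examples
            return SVC(kernel="rbf")     # SVM with a Gaussian kernel
        return LogisticRegression()      # many examples: stay linear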
For such a task you may need to extract features first; only after that is classification feasible.
I think feature extraction and selection are important.
For image classification there are a lot of features, such as raw pixels, SIFT features, color, texture, etc. It would be better to choose the ones suitable for your task.
I'm not familiar with audio classification, but there may be some spectrum features, like the Fourier transform of the signal, or MFCCs.
The method used to classify is also important. Besides the methods in the question, KNN is a reasonable choice too.
Actually, which features and method to use is closely related to the task.
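As a concrete (hypothetical) example of the audio side, here is a sketch that turns each clip into a fixed-size MFCC vector with librosa and feeds it to a KNN classifier; the file paths and labels are placeholders you would collect yourself.

    import librosa
    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    def mfcc_features(path, n_mfcc=13):
        y, sr = librosa.load(path, sr=None)
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
        return mfcc.mean(axis=1)   # average over time -> fixed-size vector

    # paths and labels are placeholders you would build yourself:
    # X = np.array([mfcc_features(p) for p in paths])
    # clf = KNeighborsClassifier(n_neighbors=5).fit(X, labels)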
The method mostly depends on the problem at hand. There is no method that is always the fastest for any problem. Having said that, you should also keep in mind that once you choose an algorithm for speed, you will start compromising on accuracy.
For example, since you're trying to classify images, there might be a lot of features compared to the number of training samples at hand. In such cases, if you go for an SVM with a kernel, you could end up overfitting, with the variance being too high.
So you would want to choose a method that has high bias and low variance. Using logistic regression or a linear SVM are some ways to do it.
You could also use different types of regularization, or techniques such as SVD, to remove the features that do not contribute much to your output prediction and keep only the most important ones. In other words, choose the features that have little or no correlation between them. Once you do this, you would be able to speed up your SVM algorithms without sacrificing accuracy.
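A small sketch of that idea, assuming scikit-learn: truncated SVD reduces the feature dimension before a linear SVM, with all sizes chosen purely for illustration.

    from sklearn.datasets import make_classification
    from sklearn.decomposition import TruncatedSVD
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    X, y = make_classification(n_samples=5000, n_features=2000, random_state=0)
    # Project 2000 features down to 100 components, then fit a linear SVM.
    clf = make_pipeline(TruncatedSVD(n_components=100), LinearSVC())
    clf.fit(X, y)
    print("training accuracy:", clf.score(X, y))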
Hope it helps.
There are some good techniques in machine learning, such as boosting and AdaBoost.
Boosting iteratively manipulates the data, which is then classified by a particular base classifier on each iteration, building up a classification model as it goes. Boosting weights each data point on each iteration, and the weight value changes according to how difficult that point is to classify.
AdaBoost in particular is an ensemble technique that uses an exponential loss function to improve the accuracy of the predictions made.
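To make the reweighting loop concrete, here is a minimal hand-rolled AdaBoost sketch with decision stumps as base classifiers (scikit-learn assumed; the round count and data are illustrative).

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.tree import DecisionTreeClassifier

    def adaboost_fit(X, y, n_rounds=20):
        """y must be in {-1, +1}."""
        n = len(y)
        w = np.full(n, 1.0 / n)                     # start with uniform weights
        stumps, alphas = [], []
        for _ in range(n_rounds):
            stump = DecisionTreeClassifier(max_depth=1)
            stump.fit(X, y, sample_weight=w)
            pred = stump.predict(X)
            err = np.clip(np.sum(w * (pred != y)), 1e-10, 1 - 1e-10)
            alpha = 0.5 * np.log((1 - err) / err)   # from the exponential loss
            w *= np.exp(-alpha * y * pred)          # up-weight the hard examples
            w /= w.sum()
            stumps.append(stump)
            alphas.append(alpha)
        return stumps, alphas

    def adaboost_predict(stumps, alphas, X):
        return np.sign(sum(a * s.predict(X) for s, a in zip(stumps, alphas)))

    X, y = make_classification(n_samples=500, random_state=0)
    y = np.where(y == 0, -1, 1)
    stumps, alphas = adaboost_fit(X, y)
    print("training accuracy:", np.mean(adaboost_predict(stumps, alphas, X) == y))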
I think your question is very open ended, and the "best classifier for images" will largely depend on the type of images you want to classify. But in general, I suggest you study convolutional neural networks (CNNs) and transfer learning; currently these are the state-of-the-art techniques for the problem.
Check out pre-trained CNN models from PyTorch or TensorFlow.
Related to images, I suggest you also study pre-processing of images; pre-processing techniques are very important to highlight features of the image and improve the generalization of the classifier.
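As a hedged sketch of transfer learning, assuming a recent torchvision: load a pretrained ResNet-18, freeze the backbone, replace the head, and use standard ImageNet-style pre-processing (the 10-class head and learning rate are placeholders for your own task).

    import torch
    import torch.nn as nn
    from torchvision import models, transforms

    # Load a pretrained ResNet-18 and freeze its backbone.
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    for p in model.parameters():
        p.requires_grad = False
    model.fc = nn.Linear(model.fc.in_features, 10)  # new head, 10 classes

    # Standard ImageNet-style pre-processing.
    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    # Only the new head is trained.
    optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)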
I am looking for recommended books (or other materials, like web pages) which demonstrate such examples: the structure of an (artificial) neural network for a given function.
I.e., what is the best (in the sense of being minimalistic, yet correct) network structure for the function min with N arguments? Or for the function abs? And so on.
The reason for my question (what books do you recommend?) is that I would like to get a proper "feeling" for how to shape the network to get the desired effect, without the overkill of a dense network which computes correctly but very inefficiently.
There is no such thing as "the best NN structure". If you are lucky, you will find a structure that does the job, but that doesn't mean that that structure is the "best".
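That said, some simple functions do have known minimal exact structures. For example, abs(x) = ReLU(x) + ReLU(-x), so a 1-2-1 network with ReLU hidden units computes abs exactly, and min(a, b) = b - ReLU(b - a) needs only a single hidden unit. A tiny NumPy check (my own example, not from a book):

    import numpy as np

    relu = lambda z: np.maximum(z, 0.0)

    def abs_net(x):
        # hidden weights [1, -1], output weights [1, 1]
        return relu(x) + relu(-x)

    def min_net(a, b):
        # a single hidden ReLU unit on the difference
        return b - relu(b - a)

    assert abs_net(-3.5) == 3.5
    assert min_net(2.0, 7.0) == 2.0 and min_net(7.0, 2.0) == 2.0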
I highly recommend reading Programming Collective Intelligence by Toby Segaran. This book has a chapter on neural networks, and it explains many other artificial intelligence algorithms in a clear and concise way.
You may also find broader reviews of neural networks here and here.
There is a lecture course on iTunes U called "Informatics for Nursing" that contains several lectures dedicated to ANNs.
UPDATE June 2019: the iTunes-U course is no longer available and I couldn't find it elsewhere
Good luck
Just wondering: we've reached 1 teraflop per PC, yet we are still not able to model an insect's brain.
Has anyone seen a decent implementation of a self-learning, self-developing neural network?
I saw an interesting experiment mapping the physical neural layout of a rat's brain to a digital neural network, with weighting modelled on the neuron chemistry of each component, captured using MRI and other techniques. Quite interesting. (New Scientist or Focus, 2 issues ago?)
IBM Blue Brain comes to mind
http://news.bbc.co.uk/1/hi/sci/tech/8012496.stm
The problem is computational power, as you rightly point out. But for a sequence of stimuli to a neural network, the range of calculations tends to grow exponentially as the stimuli encounter deeper nested nodes. Any complex weighting algorithm means that the time spent at each node can get expensive. Domain-specific neural maps tend to be quicker because they are specialized. Brains in mammals have many general paths, making it harder to teach them, and harder for a computer to model a real mammal brain in a given space/time.
Real brains also have tons of cross-talk, like static (some people think this is where creativity or original thought stems from). Brains also don't learn using 'direct' stimulus/reward; they use past experience of non-related matters to create their own learning. Recreating the neurons is one thing in a computational space; creating accurate learning is another. Never mind the dopamine (octopamine in insects) and other neurological chemicals.
Imagine giving a digital brain LSD or anti-depressants, as a real simulation. Awesome. That would be a complex simulation, I suspect.
I think you're kind of making the assumption that our idea of how neural networks work is a good model for the brain at a large-scale level; I'm not sure that is a good assumption. Hell, not too many years ago we didn't think glial cells were important to mental functions, and for a long time the prevailing idea was that there is no neurogenesis after the brain matures.
On the other hand, neural networks do seem to handle some apparently complex functions pretty well.
So, here's a little puzzle question for you: how many teraflops or petaflops do you think a human brain's computation represents?
Jeff Hawkins would say that a neural net is a poor approximation of a brain. His "On Intelligence" is a terrific read.
Yup: OpenCog is working on it.
It's the structure. Even if we had computers today with the same or higher performance than a human brain (there are different predictions on when we'll get there, but there are still a few years to go), we would still need to program it. And while we know a lot about the brain today, there are still many, many more things we do not know. And these aren't just details, but large areas that are not understood at all.
Focusing only on the tera-/peta-FLOPS is like looking only at megapixels with digital cameras: it focuses on a single value when there are many factors involved (and there are a few more of those in a brain than in a camera). I also believe that many of the estimates of just how many FLOPS would be needed to simulate a brain are way off, but that's a different discussion altogether.
We can already model brains. The question these days, is how fast, and how accurate.
In the beginning, there was effort expended on trying to find the most abstract representation of neurons with the least amount of physical properties needed.
This led to the invention of the perceptron at Cornell, which is a very simple model indeed. In fact, it may have been too simple: the famous MIT AI professor Marvin Minsky (together with Seymour Papert) wrote a book showing that this type of model cannot learn XOR (a basic logic gate that could be emulated by every computer we have today). The proof applied only to single-layer perceptrons, but it was widely read as damning neural networks in general, and it plunged neural network research into the dark ages for at least 10 years.
While probably not as impressive as many would like, there are learning networks that are already in existence that can do visual and speech learning and recognition.
And even though we have faster CPUs, it is still not the same as a neuron. Neurons in our brain are, at the very least, parallel adder units. So imagine 100 billion simulated human neurons, adding each second, sending their outputs to 100 trillion connections with a "clock" of about 20 Hz. That works out to roughly 10^14 connections × 20 Hz ≈ 2×10^15 synaptic events per second, so the amount of computation going on here far exceeds the petaflops of processing power we have, especially when our CPUs are mostly serial instead of parallel.
In 2007, they simulated the equivalent of half a mouse brain for 10 seconds at half the actual speed: http://news.bbc.co.uk/1/hi/technology/6600965.stm
There is a worm named C. elegans whose anatomy is completely known to us: every cell is mapped out and every neuron has been well studied. This worm has an interesting property from birth: it follows or grows towards only those temperature regions in which it was born. Here is a link to the paper, which implements this property with a neuronal model. Some students have also built a robot that follows only dark regions in an arena with different shades of light, using this neuronal model. The work could have been done with other methods as well, but this method is more noise-resilient, as shown in the paper linked above.
I have asked other AI folk this question, but I haven't really been given an answer that satisfied me.
For anyone else that has programmed an artificial neural network before, how do you test for its correctness?
I guess, another way to put it is, how does one debug the code behind a neural network?
With neural networks, generally what is happening is you are taking an untrained neural network and training it up using a given set of data, so that it responds in the way you expect. Here's the deal: usually, you're training it up to a certain confidence level for your inputs. Generally (and again, this is just generally; your mileage may vary), you cannot get neural networks to always provide the right answer; rather, you are getting an estimation of the right answer, to within a confidence range. You know that confidence range by how you have trained the network.
The question arises as to why you would want to use neural networks if you cannot be certain that the conclusions they come to are verifiably correct; the answer is that neural networks can arrive at high-confidence answers for certain classes of problems (specifically, NP-complete problems) in something like linear time, whereas verifiably correct solutions of NP-complete problems are only known to be obtainable in exponential time. In layman's terms, neural networks can "solve" problems that normal computation can't, but you can only be a certain percentage confident that you have the right answer. You can determine that confidence by the training regimen, and can usually make sure that you will have at least 99.9% confidence.
Correctness is a funny concept in most of "soft computing." The best I can tell you is: "a neural network is correct when it consistently satisfies the parameters of its design." You do this by training it with data, then verifying it with other data, with a feedback loop in the middle which lets you know if the neural network is functioning appropriately.
This is of course the case only for neural networks that are large enough that a direct proof of correctness is not possible. It is possible to prove that a neural network is correct through analysis if you are attempting to build one that learns XOR or something similar, but for that class of problem an ANN is seldom necessary.
You're opening up a bigger can of worms here than you might expect.
NN's are perhaps best thought of as universal function approximators, by the way, which may help you in thinking about this stuff.
Anyway, there is nothing special about NN's in terms of your question, the problem applies to any sort of learning algorithm.
The confidence you have in the results it is giving is going to rely on both the quantity and the quality (often harder to determine) of the training data that you have.
If you're really interested in this stuff, you may want to read up a bit on the problems of overtraining, and ensemble methods (bagging, boosting, etc.).
The real problem is that you usually aren't actually interested in the "correctness" (cf. quality) of an answer on a given input that you've already seen; rather, you care about predicting the quality of the answer on an input you haven't seen yet. This is a much more difficult problem. Typical approaches involve "holding back" some of your training data (i.e. the stuff you know the "correct" answer for) and testing your trained system against that. It gets subtle, though, when you start considering that you may not have enough data, or that it may be biased, etc. So there are many researchers who basically spend all of their time thinking about these sorts of issues!
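A minimal sketch of that hold-out procedure with scikit-learn (the dataset and network size are arbitrary):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    X, y = make_classification(n_samples=1000, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

    net = MLPClassifier(hidden_layer_sizes=(20,), max_iter=500)
    net.fit(X_tr, y_tr)
    print("held-out accuracy:", net.score(X_te, y_te))  # quality on unseen data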
I've worked on projects where there is test data as well as training data, so you know the expected outputs for a set of inputs the NN hasn't seen.
One common way of analysing the result of any classifier is the use of an ROC curve; an introduction to the statistics of classifiers and ROC curves can be found at Interpreting Diagnostic Tests.
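A short sketch of computing an ROC curve and its AUC with scikit-learn; the labels and scores here are toy values.

    from sklearn.metrics import roc_auc_score, roc_curve

    y_true = [0, 0, 1, 1, 0, 1]                   # ground-truth labels
    y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]     # classifier scores
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    print("AUC:", roc_auc_score(y_true, y_score))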
I'm a complete amateur in this field, but don't you use a pre-determined set of data you know is correct?
I don't believe there is a single correct answer but there are well-proven probabilistic or statistical methods that can provide reassurance. The statistical methods are usually referred to as Resampling.
One method that I can recommend is the Jackknife.
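Here is a minimal jackknife sketch in NumPy, estimating the bias-corrected value and standard error of a statistic (the mean, in this toy case) by leaving one observation out at a time.

    import numpy as np

    def jackknife(data, stat=np.mean):
        n = len(data)
        loo = np.array([stat(np.delete(data, i)) for i in range(n)])
        estimate = stat(data)
        bias = (n - 1) * (loo.mean() - estimate)
        se = np.sqrt((n - 1) * np.mean((loo - loo.mean()) ** 2))
        return estimate - bias, se   # bias-corrected estimate, standard error

    data = np.random.default_rng(0).normal(size=30)
    print(jackknife(data))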
My teacher always said his rule of thumb was to train the NN with 80% of your data and validate it with the other 20%. And, of course, make sure that data set is as comprehensive as you need.
If you want to find out whether the backpropagation of the network is correct, there is an easy way.
Since you calculate the derivative of the error landscape, you can check whether your implementation is correct numerically. You will calculate the derivative of the error with respect to a specific weight, ∂E/∂w. You can show that
∂E/∂w = (E(w + ε) - E(w - ε)) / (2ε) + O(ε²).
(Bishop, Pattern Recognition and Machine Learning, p. 246)
Essentially, you evaluate the error slightly to the left of the weight, evaluate it slightly to the right, and check whether the numerical gradient matches your analytical gradient.
(Here's an implementation: http://github.com/bayerj/arac/raw/9f5b225d6293974f8adfc5f20dfc6439cc1bed35/src/cpp/utilities/utilities.cpp)
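The same check is only a few lines in NumPy; this sketch compares the central-difference estimate against a known analytical gradient for a toy error function.

    import numpy as np

    def numerical_gradient(E, w, eps=1e-6):
        # Central-difference estimate of dE/dw, one weight at a time.
        grad = np.zeros_like(w)
        for i in range(w.size):
            w_plus, w_minus = w.copy(), w.copy()
            w_plus[i] += eps
            w_minus[i] -= eps
            grad[i] = (E(w_plus) - E(w_minus)) / (2 * eps)
        return grad

    E = lambda w: np.sum(w ** 2)   # toy error function
    analytical = lambda w: 2 * w   # its known gradient

    w = np.random.default_rng(0).standard_normal(5)
    diff = np.max(np.abs(numerical_gradient(E, w) - analytical(w)))
    assert diff < 1e-6, f"gradient check failed: {diff}"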
To me there is probably only one value that takes extra effort to verify: the gradient of the backpropagation. I think Bayer's answer is what is actually commonly used and suggested. You need to write extra code for this, but it is all forward-propagation matrix multiplications, which are easy to write and verify.
There are some other issues which will prevent you from getting the best answer, for example:
The cost function of an NN is not convex, so your gradient descent is not guaranteed to find the global optimum.
Over/under fitting
Not choosing the "right" features/model
etc
However, I think these are beyond the scope of programming bugs.