How does the mathematics of AI compare to that of mechanical engineering, computer engineering, or software engineering? What mathematics should I start focusing on now, or expect to learn, if I want to become a researcher in the field or an industry expert? I am currently a senior in high school who is considering AI. Math doesn't scare me.
In AI, one of the most important goals is to make computers act (and think!) like humans. For that purpose, computers must learn models from observations (data) and act based on those models. This learning and prediction requires a deep understanding of probability theory, statistics, and stochastic processes as fundamental tools.
Today, probability and statistics are considered general mathematics, like calculus, and all undergraduate students become familiar with them; but you will need to master them if your research field is AI.
I would look into the following:
Probability - Bayesian theory
Statistics - data interpretation, plotting, and error analysis
Stochastic processes
Entropy (for quantifying how noisy or uncertain data is)
Matrices and their computational rules, especially stochastic matrices
Since AI uses a lot of trees and graphs, a look into state-space search and heuristic calculation would also be quite useful.
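To give a tiny taste of the Bayesian item on that list, here is the classic diagnostic-test calculation as a Python sketch (all the numbers are made up for illustration):

```python
# Bayes' theorem, P(A|B) = P(B|A) * P(A) / P(B), on made-up numbers:
# how likely is a disease given a positive test?
p_disease = 0.01             # prior: 1% of people have the disease
p_pos_given_disease = 0.99   # test sensitivity
p_pos_given_healthy = 0.05   # false-positive rate

# total probability of a positive test
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

posterior = p_pos_given_disease * p_disease / p_pos
print(posterior)  # ~0.167: even a positive test leaves the disease unlikely
```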
I'm reading a book, "AI for Game Developers" by Glenn Seemann and David M. Bourg, where they use video game AI as an example of a rule-based system that learns.
Essentially, the player has 3 possible moves and hits in combos of three strikes. The AI aims to predict the player's third strike. The rules of the system are all the possible 3-move combinations. Each rule has a "weight" associated with it. Every time the system guesses incorrectly, the weight of the rule it used is decreased. When the system has to pick a rule, it picks the rule with the highest weight.
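In code, I understand the scheme to be roughly this (the move names and the size of the weight penalty are my own invention):

```python
import itertools
import random

MOVES = ["punch", "kick", "throw"]

# one rule per possible 3-strike combo, all starting with equal weight
weights = {combo: 1.0 for combo in itertools.product(MOVES, repeat=3)}

def predict_third(first_two):
    """Pick the highest-weighted rule consistent with the first two strikes."""
    matching = {c: w for c, w in weights.items() if c[:2] == first_two}
    return max(matching, key=matching.get)[2]

def learn(first_two, guess, actual):
    """Decrease the used rule's weight whenever the guess was wrong."""
    if guess != actual:
        weights[first_two + (guess,)] *= 0.9

# toy round: the player plays a random combo, the AI guesses the third strike
combo = tuple(random.choice(MOVES) for _ in range(3))
guess = predict_third(combo[:2])
learn(combo[:2], guess, combo[2])
```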
How is this any different from a reinforcement-learning based system? Thanks!
Yes, this is reinforcement learning in the established sense of the term. You may run into some opposition from those doing active research today, since the "hot" areas deal with deep learning applications.
Your application has a well-defined game tree to search; you can direct the reinforcements with a mathematical structure that corresponds directly to the game. This is a machine learning application along the lines of well-established learning algorithms.
Current "hot" research is working with more complex game situations in which the correspondence between an action and its result is not well-defined. These video games use DL networks rather than game trees in an effort to eventually discover the action rules that will lead to higher success. They're solidly in the DL part of AI, which is why you see a partitioning in the things you read.
I want to develop the RISK board game, which will include an AI for computer players. I read two articles, this and this, about it, and I realised that I must learn about Monte Carlo simulation and Markov chain techniques. I thought that I had to use these techniques together, but I gather they are actually different techniques, each relevant to calculating probabilities over state transitions.
So, could anyone explain the important differences between them, along with their advantages and disadvantages?
Finally, which approach would you prefer if you were to implement an AI for the RISK game?
Here you can find simple, exactly determined probabilities for the outcomes of a battle in the RISK board game, along with the brute-force algorithm used. There is a tree diagram that specifies all possible states. Should I use Monte Carlo or a Markov chain on this tree?
Okay, so I skimmed the articles to get a sense of what they were doing. Here is my overview of the terms you asked about:
A Markov chain is simply a model of how your system moves from state to state. Developing a Markov model from scratch can sometimes be difficult, but once you have one in hand, it's relatively easy to use and relatively easy to understand. The basic idea is that your game will have certain states associated with it; that as part of the game, you will move from state to state; and, critically, that this movement from state to state happens based on probability and that you know these probabilities.
Once you have that information, you can represent it all as a graph, where nodes are states and arcs between states (labelled with probabilities) are transitions. You can also represent it as a matrix that satisfies certain constraints, or as one of several more exotic data structures.
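As a minimal sketch of the matrix representation (transition probabilities invented for illustration), the key constraint is that every row sums to 1:

```python
import numpy as np

# P[i, j] = probability of moving from state i to state j in one step
P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.4, 0.3],
              [0.0, 0.0, 1.0]])   # state 2 is absorbing: once there, you stay

assert np.allclose(P.sum(axis=1), 1.0)  # rows are probability distributions

start = np.array([1.0, 0.0, 0.0])       # begin in state 0 with certainty
print(start @ np.linalg.matrix_power(P, 5))  # distribution after 5 steps
```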
The short article is actually all about the Markov chain approach, but -- and this is important -- it uses that approach only as a quick means of estimating what will happen if army A attacks a territory that army B is defending. It's a nice use of the technique, but it is not an AI for Risk; it is merely a module in the AI that helps figure out the likely results of an attack.
Monte Carlo techniques, by contrast, are estimators. Once you have a model of something, whether a Markov model or any other, you often find yourself in the position of wanting to estimate something about it. (Often it happens to be something that you can, if you squint hard enough, put into the form of an integral of something that happens to be very hard to work with in that format.) Monte Carlo techniques just sample randomly and aggregate the results into an estimate of what's going to happen.
Monte Carlo techniques are not, in my opinion, AI techniques. They are a very general-purpose technique that happens to be useful for AI, but they are not AI per se. (You can say the same thing about Markov models, but the statement is weaker because Markov models are so extremely useful for planning in AI that entire philosophies of AI are built around the technique. Markov models are used elsewhere as well, though.)
So that is what they are. You also asked which one I would use if I had to implement a Risk AI. Well, neither of them is going to be sufficient. Monte Carlo, as I said, is not an AI technique; it's a general mathematical tool. And Markov models, while they could in theory represent the entirety of a game of Risk, are going to end up being very unwieldy: you would need to represent every state of the game, meaning every possible configuration of armies in territories, every possible configuration of cards in hands, etc. (I am glossing over many details here: there are a lot of other difficulties with this approach.)
The core of Wolf's thesis is neither the Markov approach nor the Monte Carlo approach; it is actually what he describes as the evaluation function. This is the heart of the AI problem: how to figure out which action is best. The Monte Carlo method in Blatt's paper describes a way to figure out the expected result of an action, but that is not the same as figuring out which action is best. Moreover, there's a low-key statement in Wolf's thesis about look-ahead being hard to perform in Risk because the game trees become so very large, which is what led him (I think) to focus so much on the evaluation function.
So my real advice would be this: read up on search-tree approaches like minimax, alpha-beta pruning, and especially expectiminimax. You can find good treatments of these early in Russell and Norvig, or even on Wikipedia. Try to understand why these techniques work in general but are cumbersome for Risk. That will lead you to some discussion of board-evaluation techniques. Then go back and look at Wolf's thesis, focusing on his action evaluation function. And finally, focus on the way he tries to automatically learn a good evaluation function.
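For orientation, expectiminimax is ordinary minimax with an extra node type for chance events such as dice rolls. Here is a bare sketch; the node interface (is_terminal, value, kind, children) is hypothetical, just to show the shape of the recursion:

```python
def expectiminimax(node, depth):
    """Value of a game node; `node` is a hypothetical interface with
    is_terminal(), value(), kind in {'max', 'min', 'chance'}, and children()."""
    if depth == 0 or node.is_terminal():
        return node.value()  # exactly where a board-evaluation function plugs in
    if node.kind == "max":   # our turn: pick the best move
        return max(expectiminimax(c, depth - 1) for c in node.children())
    if node.kind == "min":   # opponent's turn: assume the worst for us
        return min(expectiminimax(c, depth - 1) for c in node.children())
    # chance node (a dice roll): average the children, weighted by probability
    return sum(p * expectiminimax(c, depth - 1) for p, c in node.children())
```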
That is a lot of work. But Risk is not an easy game to develop an AI for.
(If you just want to figure out the expected results of a given attack, though, I'd say go for Monte Carlo. It's clean, very intuitive to understand, and very easy to implement. The only difficulty -- and it's not a big one -- is making sure you run enough trials to get a good result.)
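Here is a minimal sketch of that Monte Carlo attack estimator, assuming the standard Risk dice rules (up to 3 attacker dice, up to 2 defender dice, highest dice compared pairwise, ties to the defender) and ignoring cards and retreats:

```python
import random

def battle_round(attackers, defenders):
    """One exchange of dice; one attacking army must always stay behind."""
    a = sorted((random.randint(1, 6) for _ in range(min(3, attackers - 1))), reverse=True)
    d = sorted((random.randint(1, 6) for _ in range(min(2, defenders))), reverse=True)
    for ad, dd in zip(a, d):            # compare highest vs highest, etc.
        if ad > dd:
            defenders -= 1
        else:                           # ties go to the defender
            attackers -= 1
    return attackers, defenders

def attacker_win_probability(attackers, defenders, trials=100_000):
    """Estimate by fighting many full battles and counting attacker victories."""
    wins = 0
    for _ in range(trials):
        a, d = attackers, defenders
        while a > 1 and d > 0:
            a, d = battle_round(a, d)
        wins += (d == 0)
    return wins / trials

print(attacker_win_probability(10, 8))
```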
Markov chains are simply a set of transitions and their probabilities, assuming no memory of past events.
Monte Carlo simulations are repeated samplings of random walks over a set of probabilities.
You can use both together by using a Markov chain to model your probabilities and then a Monte Carlo simulation to examine the expected outcomes.
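A minimal sketch of that combination, with made-up states and probabilities: the dictionary below is the Markov chain, and the repeated random walks over it are the Monte Carlo part.

```python
import random

# Markov chain: each state maps to (next_state, probability) pairs.
CHAIN = {"A": [("A", 0.5), ("B", 0.5)],
         "B": [("B", 0.4), ("WIN", 0.3), ("LOSE", 0.3)]}

def walk(state):
    """One Monte Carlo sample: follow the chain until a terminal state."""
    while state in CHAIN:
        r, cum = random.random(), 0.0
        for nxt, p in CHAIN[state]:
            cum += p
            if r < cum:
                state = nxt
                break
    return state

outcomes = [walk("A") for _ in range(10_000)]
print(outcomes.count("WIN") / len(outcomes))  # estimated win chance, ~0.5 here
```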
For Risk I don't think I would use Markov chains because I don't see an advantage. A Monte Carlo analysis of your possible moves could help you come up with a pretty good AI in conjunction with a suitable fitness function.
I know this glossed over a lot but I hope it helped.
I have a good grounding in evolutionary algorithms, and now I've started to read about artificial neural networks. I came across this tutorial at
http://www.ai-junkie.com/ann/evolved/nnt2.html,
showing how to use an ANN to evolve tanks that collect mines. It uses a GA to evolve the input weights on each neuron.
I know I could use a GA (without the ANN) to solve the same problem. I already created a Tetris bot using only a GA to optimize the weights in the grid evaluation function (check my blog: http://www.bitsrandomicos.blogspot.com.br/).
My question is: what's the conceptual/practical difference between using an ANN + GA in a situation where I could use a GA alone? I mean, is my Tetris bot an ANN? (I don't think so.)
There are several related questions about this, but I couldn't find an answer:
Are evolutionary algorithms and neural networks used in the same domains?
When to use Genetic Algorithms vs. when to use Neural Networks?
Thanks!
A genetic algorithm is an optimization algorithm.
An artificial neural network is a function approximator. In order to approximate a function you need an optimization algorithm to adjust the weights. An ANN can be used for supervised learning (classification, regression) or reinforcement learning and some can even be used for unsupervised learning.
In supervised learning, a derivative-free optimization algorithm like a genetic algorithm is slower than most of the optimization algorithms that use gradient information. Thus, it only makes sense to evolve neural networks with genetic algorithms in reinforcement learning. This is known as "neuroevolution". The advantage of neural networks like multilayer perceptrons in this setup is that they can approximate any function with arbitrary precision when they have a sufficient number of hidden nodes.
When you create a Tetris bot, you do not necessarily have to use an ANN as a function approximator, but you do need some kind of function approximator to represent your bot's policy. I guess yours was just simpler than an ANN. But when you want to represent a complex nonlinear policy, you could do that with, e.g., an ANN.
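To make the split concrete (the GA is the optimizer, the ANN is the function approximator), here is a minimal neuroevolution sketch. The network shape, the fitness function, and the GA settings are all invented; in the tank example, the fitness would instead be the number of mines collected during a simulation run:

```python
import numpy as np

rng = np.random.default_rng(0)
N_IN, N_HID, N_OUT = 4, 6, 1
N_W = N_IN * N_HID + N_HID * N_OUT       # total number of evolvable weights

def forward(w, x):
    """Fixed-topology two-layer ANN; the GA only ever touches the flat vector w."""
    w1 = w[:N_IN * N_HID].reshape(N_IN, N_HID)
    w2 = w[N_IN * N_HID:].reshape(N_HID, N_OUT)
    return np.tanh(np.tanh(x @ w1) @ w2)

def fitness(w):
    """Stand-in reward: how close the net's output is to an arbitrary target."""
    return -abs(forward(w, np.ones(N_IN)).item() - 0.5)

population = [rng.normal(size=N_W) for _ in range(50)]
for generation in range(200):
    population.sort(key=fitness, reverse=True)
    parents = population[:10]                                  # truncation selection
    children = [parents[rng.integers(10)] + rng.normal(scale=0.1, size=N_W)
                for _ in range(40)]                            # Gaussian mutation
    population = parents + children

print(fitness(max(population, key=fitness)))  # should be close to 0
```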
alfa's answer is perfect. Here is just an image to illustrate what he said:
Meta-Optimizer = None (but could be)
Optimizer = Genetic Algorithm
Problem = Tetris Bot (e.g. ANN)
You use an evolutionary algorithm if you don't yet know the answer but you are able to somehow rate candidates and provide meaningful mutations.
A neural network is great if you already have answers (and inputs) and you want to "train the computer" so it can "guess" the answers for unknown inputs. Also, you don't have to think a lot about the problem; the network will figure it out by itself.
Check this "game AI" example: https://synaptic.juancazala.com/#/
(Note how simple it is: all you have to do is give them enough training; you don't have to know a thing about game AI. And once it is good enough, all you have to do is "download" the memory and run it when needed.)
I'm not an expert, but based on what I know from the field...
An artificial neural network ultimately has its basis in neuroscience. It attempts to simulate/model neural behavior by building neuron-like structures into the algorithm. There is a stronger emphasis on the academic nature of the problem than on the result. From what I understand, it's for this reason that ANNs are not very popular from an engineering standpoint: statistical approaches to machine learning (HMMs and Bayesian networks) produce better results.
In short, as long as it has a nod towards some underlying neuroscience, it can be an ANN, even if it uses some form of GA.
If you use a GA, it is not necessarily an ANN.
This question is related to 873448.
From Wikipedia:
The Blue Brain Project is an attempt to create a synthetic brain by reverse-engineering the mammalian brain down to the molecular level. [...] Using a Blue Gene supercomputer running Michael Hines's NEURON software, the simulation does not consist simply of an artificial neural network, but involves a biologically realistic model of neurons.
"If we build it correctly it should speak and have an intelligence and behave very much as a human does."
My question is how the software works internally. If it "involves a biologically realistic model of neurons", how is that different from a neural network, and why can't neural networks simulate a biological brain well while this project would be able to? And, how is NEURON software used in the simulation?
Lastly, I apologize if this question doesn't belong here (maybe the BioStar StackExchange would be a better place to ask).
The NEURON software models neuronal cells by modeling the fluxes of ions inside and outside the cell through different ion channels. These movements generate a difference in electrical potential between the interior and the exterior of the neuronal membrane, and modulations of this potential allow different neurons to communicate with each other. Several biophysical models for neurons exist, such as the integrate-and-fire model or the Hodgkin-Huxley model.
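For a flavour of the biophysical side, here is a minimal leaky integrate-and-fire simulation. All parameter values are illustrative and not taken from NEURON:

```python
# Leaky integrate-and-fire neuron: the membrane potential v leaks toward its
# resting value, integrates an input current, and emits a spike (then resets)
# whenever it crosses the firing threshold.
dt, tau = 0.1, 10.0                               # time step, membrane time constant (ms)
v_rest, v_thresh, v_reset = -70.0, -55.0, -75.0   # potentials (mV)
current = 20.0                                    # constant input drive (made-up units)

v, spike_times = v_rest, []
for step in range(int(200 / dt)):                 # simulate 200 ms
    v += dt / tau * (-(v - v_rest) + current)     # leak + integration
    if v >= v_thresh:                             # threshold crossing: spike and reset
        spike_times.append(step * dt)
        v = v_reset

print(len(spike_times), "spikes; first at", spike_times[0], "ms")
```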
Artificial neural networks have pretty much nothing to do with biological neural networks, apart from sharing the same name. They're mathematical constructs connected to each other in a weighted manner, allowing them to take one or more inputs and produce one or more outputs.
EDIT: I have to add that, as much as the Blue Brain Project is an incredible and very admirable step towards modeling an entire brain, we are far, far away from that goal. All of these are models, so they approximate the behaviour of biological cells, but they are in no way complete. Furthermore, there is a strong bias in the "choice" of which neurons these models analyze. Most models represent certain areas of the brain (such as the cortex or the hippocampus) of which 1) we have quite a bit of knowledge and 2) which are constituted by very organized structures of neuronal cells working together. Other parts of the brain may not be as trivial to model (note that I use "trivial" jokingly; I'm not in any way saying that modeling the cortex is easy!), but I guess the details of this would be a bit outside the scope of SO. Maybe when the cognitive science proposal becomes an active site, you could pose the question there!
Finally, to correct the quoted statement: so far the project has modeled a single column of the somatosensory cortex of the rat, which is only a very tiny part of an entire rat brain.
Just wondering: we've reached 1 teraflop per PC, yet we are still not able to model an insect's brain.
Has anyone seen a decent implementation of a self-learning, self-developing neural network?
I saw an interesting experiment mapping the physical neural layout of a rat's brain onto a digital neural network, with weightings modelled on the neuron chemistry of each component, measured using MRI and other techniques. Quite interesting. (New Scientist or Focus, two issues ago?)
IBM Blue Brain comes to mind
http://news.bbc.co.uk/1/hi/sci/tech/8012496.stm
The problem is computational power, as you rightly point out. For a sequence of stimuli fed to a neural network, the number of calculations tends to grow exponentially as the stimuli reach deeper nested nodes, and any complex weighting algorithm means the time spent at each node gets expensive. Domain-specific neural maps tend to be quicker because they are specialized. Mammalian brains have many general-purpose paths, which makes it harder to teach them, and harder for a computer to model a real mammalian brain in a given amount of space and time.
Real brains also have tons of cross-talk, like static (some people think this is where creativity or original thought stems from). Brains also don't learn using 'direct' stimulus/reward alone; they use past experience of unrelated matters to create their own learning. Recreating the neurons in a computational space is one thing; recreating accurate learning is another. Never mind the dopamine (octopamine in insects) and the other neurological chemicals.
Imagine giving a digital brain LSD or anti-depressants, as a real simulation. Awesome. That would be a complex simulation, I suspect.
I think you're kind of making the assumption that our idea of how neural networks work is a good model for the brain at a large scale; I'm not sure that is a good assumption. Hell, not too many years ago we didn't think glial cells were important to mental function, and for a long time the idea was that there is no neurogenesis after the brain matures.
On the other hand, neural networks do seem to handle some apparently complex functions pretty well.
So, here's a little puzzle question for you: how many teraflops or petaflops do you think a human brain's computation represents?
Jeff Hawkins would say that a neural net is a poor approximation of a brain. His "On Intelligence" is a terrific read.
Yup: OpenCog is working on it.
It's the structure. Even if we had computers today with the same or higher performance than a human brain (there are different predictions of when we'll get there, but there are still a few years to go), we would still need to program it. And while we know a lot about the brain today, there are still many, many more things we do not know. And these aren't just details, but large areas that are not understood at all.
Focusing only on the tera-/peta-FLOPS is like looking only at megapixels with digital cameras: it focuses on one value when there are many factors involved (and there are a few more of those in a brain than in a camera). I also believe that many of the estimates of just how many FLOPS would be needed to simulate a brain are way off, but that's a different discussion altogether.
We can already model brains. The question these days is how fast and how accurately.
In the beginning, effort was expended on trying to find the most abstract representation of neurons, with the fewest physical properties needed.
This led to the invention of the perceptron at Cornell University, which is a very simple model indeed. In fact, it may have been too simple: the famous MIT AI researcher Marvin Minsky (with Seymour Papert) wrote the book Perceptrons, which showed that a single-layer perceptron cannot learn XOR (a basic logic gate that can be emulated by every computer we have today). Unfortunately, the book helped plunge neural network research into the dark ages for at least 10 years.
While probably not as impressive as many would like, there are learning networks that are already in existence that can do visual and speech learning and recognition.
And even though we have faster CPUs, a CPU is still not the same as a neuron. Neurons in our brain are, at the very least, parallel adder units. So imagine 100 billion simulated human neurons, adding every second, sending their outputs across 100 trillion connections with a "clock" of about 20 Hz. The amount of computation going on here far exceeds the petaflops of processing power we have, especially since our CPUs are mostly serial rather than parallel.
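Taking those made-up-but-plausible numbers at face value gives a quick back-of-envelope bound, treating each synaptic update as a single operation (which is generous to the computer, since a realistic model needs many operations per update):

```latex
\underbrace{10^{14}}_{\text{connections}} \times \underbrace{20\,\mathrm{Hz}}_{\text{updates per second}}
  = 2 \times 10^{15}\ \text{ops/s} \approx 2\ \text{petaFLOPS}
```

And that is a floor for the raw updates alone, before any realistic neuron model is computed at each connection.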
In 2007, they simulated the equivalent of half a mouse brain for 10 seconds at half the actual speed: http://news.bbc.co.uk/1/hi/technology/6600965.stm
There is a worm named C. elegans whose anatomy is completely known to us: every cell is mapped out and every neuron has been well studied. This worm has an interesting inborn property: it seeks out only those temperature regions in which it was born. Here is a link to the paper, which implements this property with a neuronal model. Some students have also used this neuronal model to build a robot that follows only the dark regions in an area with different shades of light. This work could have been done with other methods as well, but this method is more noise-resilient, as shown by the paper linked above.