classical AI, ontology, machine learning, bayesian - artificial-intelligence

I'm starting to study machine learning and bayesian inference applied to computer vision and affective computing.
If I understand right, there is a big discussion between
classical AI, ontology, semantic web researchers
and machine learning and bayesian guys
I think it is usually referred to as strong AI vs weak AI, and is also related to philosophical issues like functional psychology (the brain as a black box) and cognitive psychology (theory of mind, mirror neurons), but this is not the point in a programming forum like this.
I'd like to understand the differences between the two points of view. Ideally, answers will reference examples and academic papers where one approach gets good results and the other fails. I am also interested in the historical trends: why some approaches fell out of favour and newer approaches began to rise. For example, I know that Bayesian inference is computationally intractable (a problem in NP), and that's why for a long time probabilistic models were not favoured in the information technology world. However, they've begun to rise in econometrics.

I think you have got several ideas mixed up together. It's true that there is a distinction that gets drawn between rule-based and probabilistic approaches to 'AI' tasks, however it has nothing to do with strong or weak AI, very little to do with psychology and it's not nearly as clear cut as being a battle between two opposing sides. Also, I think saying Bayesian inference was not used in computer science because inference is NP complete in general is a bit misleading. That result often doesn't matter that much in practice and most machine learning algorithms don't do real Bayesian inference anyway.
Having said all that, the history of Natural Language Processing went from rule-based systems in the 80s and early 90s to machine learning systems up to the present day. Look at the history of the MUC conferences to see the early approaches to information extraction tasks. Compare that with the current state-of-the-art in named entity recognition and parsing (the ACL wiki is a good source for this), which are all based on machine learning methods.
As far as specific references, I doubt you'll find anyone writing an academic paper that says 'statistical systems are better than rule-based systems' because it's often very hard to make a definite statement like that. A quick Google for 'statistical vs. rule based' yields papers like this which looks at machine translation and recommends using both approaches, according to their strengths and weaknesses. I think you'll find that this is pretty typical of academic papers. The only thing I've read that really makes a stand on the issue is 'The Unreasonable Effectiveness of Data' which is a good read.

As for the "rule-based" vs. "probabilistic" thing, you can go for the classic book by Judea Pearl, "Probabilistic Reasoning in Intelligent Systems". Pearl writes with a strong bias towards what he calls "intensional systems", which are basically the counterpart to rule-based stuff. I think this book is what set off the whole probabilistic thing in AI (you can also argue the time was ripe, but it was THE book of that time).
I think machine-learning is a different story (though it's nearer to probabilistic AI than to logics).

Related

Reference Request: Books containing examples of Applied Theory/Math

I recently came across a blog post that introduced the term "Bayesian Spam Filtering" and talked about how this was the approach behind spam filtering for emails.
I also remember a paper (perhaps it was this?) discussing how Game Theory is involved in packet routing, or how it is used for Resource Allocation in Cloud Computing. Also, I recall a university course on Formal Methods, and how they're used in Software Engineering.
I am looking for books which talk about how concepts from Mathematics or CS Theory are actually applied in every day technology.
Any good textbook will at least mention and ideally discuss everyday applications of the subject matter whenever possible. For example, the excellent Artificial Intelligence: A Modern Approach by Russell and Norvig does this throughout.
More specialized books will of course tend to have deeper discussions of the applications. What is one 30-page chapter in Russell and Norvig is a whole book by Sutton and Barto: Reinforcement Learning: An Introduction (free pdf at the link, courtesy of the authors). Have a look at chapter 16 for some fascinating applications.
There are also books that start with an application, i.e. a practical problem, and then develop all the theory necessary to solve it. One of my favorites in this category is In Pursuit of the Traveling Salesman: Mathematics at the Limits of Computation by William Cook.
You might be interested in Math for Programmers, which has some interesting approaches to programming by pure math. Conversely, A Programmer's Introduction to Mathematics teaches math using programming, which might be helpful if you want to learn it the other way round.

How to design the artificial intelligence of a fighting game (Street Fighter or Soul Calibur)?

There are many papers about ranged combat artificial intelligences, like Killzone's (see this paper), or Halo's. But I've not been able to find much about fighting AI except for this work, which uses neural networks to learn how to fight, which is not exactly what I'm looking for.
Occidental AI in games is heavily focused on FPS, it seems! Does anyone know which techniques are used to implement a decent fighting AI? Hierarchical Finite State Machines? Decision Trees? They could end up being pretty predictable.
In our research labs, we are using AI planning technology for games. AI Planning is used by NASA to build semi-autonomous robots. Planning can produce less predictable behavior than state machines, but planning is a highly complex problem, that is, solving planning problems has a huge computational complexity.
AI Planning is an old but interesting field. In gaming in particular, people have only recently started using planning to run their engines. The expressiveness is still limited in the current implementations, but in theory the expressiveness is limited "only by our imagination".
Russell and Norvig devote 4 chapters to AI Planning in their book on Artificial Intelligence. Other related terms you might be interested in are Markov Decision Processes and Bayesian Networks. These topics are also covered in sufficient depth in that book.
If you are looking for some ready-made engine to easily start using, I guess using AI Planning would be a gross overkill. I don't know of any AI Planning engine for games but we are developing one. If you are interested in the long term, we can talk separately about it.
You seem to know already the techniques for planning and executing. Another thing that you need to do is predict the opponent's next move and maximize the expected reward of your response. I wrote a blog article about this: http://www.masterbaboon.com/2009/05/my-ai-reads-your-mind-and-kicks-your-ass-part-2/ and http://www.masterbaboon.com/2009/09/my-ai-reads-your-mind-extensions-part-3/ . The game I consider is very simple, but I think the main ideas from Bayesian decision theory might be useful for your project.
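To make the "predict the opponent's next move and maximize the expected reward" idea concrete, here is a minimal sketch in Python. It is not the code from the blog posts: the payoff table, the action names and the add-one prior are all made up for illustration. The belief about the opponent is just smoothed move frequencies, and the response is the action with the highest expected payoff under that belief.

```python
from collections import Counter

# Hypothetical payoff table: PAYOFF[my_action][opponent_action] -> reward for us.
PAYOFF = {
    "block": {"punch": 1.0, "kick": 1.0, "throw": -2.0},
    "punch": {"punch": 0.0, "kick": -1.0, "throw": 2.0},
    "jump": {"punch": -1.0, "kick": 2.0, "throw": 1.0},
}
OPPONENT_ACTIONS = ["punch", "kick", "throw"]


def predict(history, prior=1.0):
    """Belief over the opponent's next move: observed counts plus a uniform prior."""
    counts = Counter(history)
    total = len(history) + prior * len(OPPONENT_ACTIONS)
    return {a: (counts[a] + prior) / total for a in OPPONENT_ACTIONS}


def best_response(history):
    """Return the action with the highest expected payoff, plus all expectations."""
    belief = predict(history)
    expected = {
        mine: sum(belief[theirs] * reward for theirs, reward in row.items())
        for mine, row in PAYOFF.items()
    }
    return max(expected, key=expected.get), expected


if __name__ == "__main__":
    # An opponent who has thrown seven times in a row.
    action, expected = best_response(["throw"] * 7)
    print(action, expected)
```

A real fighting game would condition the prediction on the situation (distance, health, the last few moves) rather than on raw frequencies, but the decision step stays the same.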
I have reverse engineered the routines related to the AI subsystem within the Street Fighter II series of games. It does not incorporate any of the techniques mentioned above. It is entirely reactive and involves no planning, learning or goals. Interestingly, there is no "technique weight" system that you mention, either. They don't use global weights for decisions to decide the frequency of attack versus block, for example. When taking apart the routines related to how "difficulty" is made to seem to increase, I did expect to find something like that. Alas, it relates to a number of smaller decisions that could potentially affect those ratios in an emergent way.
Another route to consider is the so called Ghost AI as described here & here. As the name suggests you basically extract rules from actual game play, first paper does it offline and the second extends the methodology for online real time learning.
Check out also the guy's webpage, there are a number of other papers on fighting games that are interesting.
http://www.ice.ci.ritsumei.ac.jp/~ftgaic/index-R.html
It's old, but here are some examples.

Applications for the Church Programming Language

Has anyone worked with the programming language Church? Can anyone recommend practical applications? I just discovered it, and while it sounds like it addresses some long-standing problems in AI and machine-learning, I'm skeptical. I had never heard of it, and was surprised to find it's actually been around for a few years, having been announced in the paper Church: a language for generative models.
I'm not sure what to say about the matter of practical applications. Does modeling cognitive abilities with generative models constitute a "practical application" in your mind?
The key importance of Church (at least right now) is that it allows those of us working with probabilistic inference solutions to AI problems a simpler way to model. It's essentially a subset of Lisp.
I disagree with Chris S that it is at all a toy language. While some of these inference problems can be replicated in other languages (I've built several in Matlab) they generally aren't very reusable and you really have to love working in 4 and 5 for loops deep (I hate it).
Instead of tackling the problem that way, Church uses the recursive advantages of lambda calculus and also allows for something called memoization, which is really useful for generative models since your generative model is often not the same one trial after trial, though for testing you really need this.
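For readers who have not seen memoization of a random function before, here is a rough Python sketch of the idea. It only imitates what Church's `mem` does conceptually; the function names and the eye-colour example are hypothetical. The first call with a given argument draws a random value, and every later call with the same argument returns that same value.

```python
import random


def mem(fn):
    """Wrap a stochastic function so each distinct argument tuple is sampled once."""
    cache = {}

    def wrapped(*args):
        if args not in cache:
            cache[args] = fn(*args)  # sample on first use only
        return cache[args]

    return wrapped


rng = random.Random(0)  # fixed seed so runs are repeatable


def sample_eye_color(person):
    """Unmemoized: every call draws a fresh colour, even for the same person."""
    return rng.choice(["blue", "brown", "green"])


# Memoized: a person's eye colour is random, but fixed once it has been drawn.
eye_color = mem(sample_eye_color)

if __name__ == "__main__":
    print([sample_eye_color("alice") for _ in range(5)])  # may vary between calls
    print([eye_color("alice") for _ in range(5)])         # always the same value
```

This is what lets a generative model refer to "the same" random property of an object in several places without threading the sampled value around by hand.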
I would say that if what you're doing has anything to do with Bayesian Networks, Hierarchical Bayesian Models, probabilistic solutions to POMDPs or Dynamic Bayesian Networks then I think Church is a great help. For what it's worth, I've worked with both Noah and Josh (two of Church's authors) and no one has a better handle on probabilistic inference right now (IMHO).
Church is part of the family of probabilistic programming languages that allows the separation of the estimation of a model from its definition. This makes probabilistic modeling and inference a lot more accessible to people that want to apply machine learning but who are not themselves hardcore machine learning researchers.
For a long time, probabilistic programming meant you'd have to come up with a model for your data and derive the estimation of the model yourself: you have some observed values, and you want to learn the parameters. The structure of the model is closely related to how you estimate the parameters, and you'd need pretty advanced knowledge of machine learning to do the computations correctly. The recent probabilistic programming languages are an attempt to address that and make things more accessible for data scientists or people doing work that applies machine learning.
As an analogy, consider the following:
You are a programmer and you want to run some code on a computer. Back in the 1970s, you had to write assembly language on punch cards and feed them into a mainframe (on which you had to book time) in order to run your program. It is now 2014, and there are high-level, simple to learn languages that you can write code in even with no knowledge of how computer architecture works. It's still helpful to understand how computers work to write in those languages, but you don't have to, and many more people write code than if you had to program with punch cards.
Probabilistic programming languages do the same for machine learning with statistical models. Also, Church isn't the only choice for this. If you aren't a functional programming devotee, you can also check out the following frameworks for Bayesian inference in graphical models:
Infer.NET, written in C# by the Microsoft Research lab in Cambridge, UK
Stan, written in C++ by the Statistics department at Columbia
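The separation of model definition from inference mentioned above can be illustrated with a tiny Python sketch. This is not the API of Church, Infer.NET or Stan; `rejection_query` and `two_coins` are hypothetical names, and rejection sampling is just the simplest inference method to write down. The point is that the model is an ordinary function that makes random choices, and the inference routine works for any such function without knowing its structure.

```python
import random


def rejection_query(model, condition, n_samples=20000, seed=0):
    """Generic inference: run the model many times, keep runs that satisfy the condition."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_samples):
        trace = model(rng)
        if condition(trace):
            accepted.append(trace)
    return accepted


def two_coins(rng):
    """Model definition only: two independent fair coin flips."""
    a = rng.random() < 0.5
    b = rng.random() < 0.5
    return {"a": a, "b": b}


if __name__ == "__main__":
    # Question: given that at least one coin came up heads, how likely is it that `a` did?
    samples = rejection_query(two_coins, lambda t: t["a"] or t["b"])
    p_a = sum(t["a"] for t in samples) / len(samples)
    print(p_a)  # should be close to the exact answer 2/3
```

Swapping in a different model means writing a different function; the inference code does not change. Real systems replace the rejection sampler with far more efficient algorithms, which is where most of their complexity lives.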
You know what does a better job of describing Church than what I said? This MIT article: http://web.mit.edu/newsoffice/2010/ai-unification.html
It's slightly more hyperbolic, but then, I'm not immune to the optimism present in this article.
Likely, the article was intended to be published on April Fool's Day.
Here's another article dated late March of last year. http://dspace.mit.edu/handle/1721.1/44963

Less Mathematical Approaches to Machine Learning?

Out of curiosity, I've been reading up a bit on the field of Machine Learning, and I'm surprised at the amount of computation and mathematics involved. One book I'm reading through uses advanced concepts such as Ring Theory and PDEs (note: the only thing I know about PDEs is that they use that funny looking character). This strikes me as odd considering that mathematics itself is a hard thing to "learn."
Are there any branches of Machine Learning that use different approaches?
I would think that approaches relying more on logic, memory, construction of unfounded assumptions, and over-generalizations would be a better way to go, since that seems more like the way animals think. Animals don't (explicitly) calculate probabilities and statistics; at least as far as I know.
The behaviour of the neurons in our brains is very complex, and requires some heavy duty math to model. So, yes we do calculate extremely complex math, but it's done in a way that we don't perceive.
I don't know whether the math you typically find in A.I. research is entirely due to the complexity of the natural neural systems being modelled. It may also be due, in part, to heuristic techniques that don't even attempt to model the mind (e.g., using convolution filters to recognise shapes).
If you want to avoid the math but do AI-like stuff, you can always stick to simpler models. 90% of the time, the simpler models will be good enough for real world problems.
I don't know of a track of AI that is completely decoupled from math though. Probability theory is the tool for handling uncertainty, which plays a major role in AI. So even if there were a not-so-mathematical subfield, math techniques would most likely be a way to improve on those methods, and thus the mathematics would be back in the game. Simple techniques like naive Bayes and decision trees can be used without a lot of math, but real understanding comes only through it.
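To show how little code a "simple technique" like naive Bayes needs, here is a small word-count classifier in Python. The toy training messages and function names are made up for illustration, and add-one (Laplace) smoothing is assumed so that unseen words do not produce a zero probability. Using it takes no math; understanding why the log-probabilities are summed does.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """examples: list of (list_of_words, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, label in examples:
        label_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return label_counts, word_counts, vocab


def classify(model, words):
    """Pick the label maximizing log P(label) + sum of log P(word | label)."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in words:
            # Add-one smoothing: a word never seen with this label still gets count 1.
            score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


TOY_DATA = [
    ("win money now".split(), "spam"),
    ("cheap money offer".split(), "spam"),
    ("meeting schedule today".split(), "ham"),
    ("project meeting notes".split(), "ham"),
]

if __name__ == "__main__":
    model = train(TOY_DATA)
    print(classify(model, "cheap money".split()))
    print(classify(model, "meeting notes".split()))
```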
Machine learning is very math heavy. It is sometimes said to be close to "computational statistics", with a little more focus on the computational side. You might want to check out "Programming Collective Intelligence" (published by O'Reilly), though. I hear it has a good collection of ML techniques without overly hard math.
You might find evolutionary computing approaches to machine learning a little less front-loaded with heavy-duty maths, approaches such as ant-colony optimisation or swarm intelligence.
I think you should put to one side, if you hold it as your question kind of suggests you do, the view that machine learning is trying to simulate what goes on in the brains of animals (including Homo Sapiens). A lot of the maths in modern machine learning arises from its basis in pattern recognition and matching; some of it comes from attempts to represent what is learnt as quasi-mathematical statements; some of it comes from the need to use statistical methods to compare different algorithms and approaches. And some of it comes because some of the leading practitioners come from scientific and mathematical backgrounds rather than computer science backgrounds, and they bring their toolset with them when they come.
And I'm very surprised that you are surprised that machine learning involves a lot of computation, since the long history of AI has proven that it is extremely difficult to build machines which (seem to) think.
I've been thinking about this kind of stuff a lot lately.
Unfortunately, most engineer/mathematician types are so tied to their own familiar mathematical/computational worlds, they often forget to consider other paradigms.
Artists, for example, often think of the world in a very fluid way, usually untethered by mathematical models. Much of what happens in art is archetypal or symbolic, and often doesn't follow any seemingly conventional logical arrangement. There are, of course, very strong exceptions to this. Music, for instance, especially in music theory, often requires strong left brained processes and so forth. In truth, I would argue that even the most right brained activities are not devoid of left logic, but rather are more complex mathematical paradigms, like chaos theory is to the beauty of fractals. So the cross-over from left to right and back again is not a schism, but a symbiotic coupling. Humans utilize both sides of the brain.
Lately I've been thinking about a more artful representational approach to math and machine language, even in a banal world of ones and zeroes. The world has been thinking about machine language in terms of familiar mathematical, numeric, and alphabetic conventions for a fairly long time now, and it's not exactly easy to realign the cosmos. Yet in a way, it happens naturally. Wikis, WYSIWYG editors, drafting tools, photo and sound editors, blogging tools, and so forth: all these tools do the heavy mathematical and machine code lifting behind the scenes to make for a more artful end experience for the user.
But we rarely think of doing the same lifting for coders themselves. To be sure, code is symbolic, by its very nature, lingual. But I think it is possible to turn the whole thing on its head, and adopt a visual approach. What this would look like is anyone's guess, but in a way we see it everywhere as the whole world of machine learning is abstracted more and more over time. As machines become more and more complex and can do more and more sophisticated things, there is a basic necessity to abstract and simplify those very processes, for ease of use, design, architecture, development, and...you name it.
That all said, I do not believe machines will ever learn anything on their own without human input. But that is another debate, as to the character of religion, God, science, and the universe.
I attended a course in machine-learning last semester. The cognitive science chair at our university is very interested in symbolic machine learning (That's the stuff without mathematics or statistics ;o)). I can recommend two outstanding textbooks:
Machine Learning (Thomas Mitchell)
Artificial Intelligence: A Modern Approach (Russell and Norvig)
The first one is more focused on machine learning, but it's very compact and has a very gentle learning curve. The second one is a very interesting read with a lot of historical information.
These two titles should give you a good overview (All aspects of machine learning not just symbolic approaches), so that you can decide for yourself which aspect you want to focus on.
Basically there is always mathematics involved but I find symbolic machine learning easier to start with because the ideas behind most approaches are often amazingly simple.
Mathematics is simply a tool in machine learning. Knowing the maths enables one to efficiently approach the modelled problems at hand. Of course it might be possible to brute force one's way through, but usually this would come at the expense of a weaker understanding of the basic principles involved.
So, pick up a maths book and study the topics until you're familiar with the concepts. No mechanical engineer is going to design a bridge without understanding the basic maths behind the system behaviour; why should this be any different in the area of machine learning?
There is a lot to Machine Learning beyond just the math.
You can build the most amazing probabilistic model using a ton of math, but fail because you aren't extracting the right features from the data (which might often require domain insight) or are having trouble figuring out why your model is failing on a particular dataset (which requires you to have a high-level understanding of what the data is giving, and what the model needs).
Without the math, you cannot build new complicated ML models by yourself, but you sure can play with existing tried-and-tested ones to analyze information and do cool things.
You still need some math knowledge to interpret the results the model gives you, but this is usually a lot easier than having to build these models on your own.
Try playing with http://www.cs.waikato.ac.nz/ml/weka/ and http://mallet.cs.umass.edu/. The former comes with all the standard ML algorithms along with a lot of amazing features that enable you to visualize your data and pre/post-process it to get good results.
Yes, machine learning research is now dominated by researchers trying to solve the classification problem: given positive/negative examples in an n-dimensional space, what is the best n-dimensional shape that captures the positive ones.
Another approach is taken by case-based reasoning (or case-based learning) where deduction is used alongside induction. The idea is that your program starts with a lot of knowledge about the world (say, it understands Newtonian physics) and then you show it some positive examples of the desired behavior (say, here is how the robot should kick the ball under these circumstances) then the program uses these together to extrapolate the desired behavior to all circumstances. Sort of...
Firstly, case-based AI and symbolic AI are largely theoretical. There are very few projects that have employed them successfully. Nowadays AI is Machine Learning. Neural nets are a core element of ML, and they rely on gradient-based optimization. If you want to do Machine Learning, then Linear Algebra, Optimization, etc. are a must.
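For anyone wondering what "gradient-based optimization" looks like in practice, here is a bare-bones Python sketch: plain gradient descent fitting a straight line by minimizing mean squared error. The data, learning rate and step count are arbitrary choices for illustration. Neural network training uses the same loop with many more parameters and with the gradients computed by backpropagation.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


if __name__ == "__main__":
    # Points that lie exactly on y = 2x + 1.
    w, b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
    print(w, b)  # should approach 2 and 1
```

The linear algebra shows up as soon as `w` becomes a matrix, and the optimization theory shows up when you ask why this converges and how to pick `lr`.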

How to design and verify distributed systems?

I've been working on a project, which is a combination of an application server and an object database, and is currently running on a single machine only. Some time ago I read a paper which describes a distributed relational database, and got some ideas on how to apply the ideas in that paper to my project, so that I could make a high-availability version of it running on a cluster using a shared-nothing architecture.
My problem is that I don't have experience in designing distributed systems and their protocols; I did not take the advanced CS courses about distributed systems at university. So I'm worried about being able to design a protocol that does not cause deadlock, starvation, split brain and other problems.
Question: Where can I find good material about designing distributed systems? What methods are there for verifying that a distributed protocol works right? Recommendations of books, academic articles and others are welcome.
I learned a lot by looking at what is published about really huge web-based platforms, and especially how their systems evolved over time to meet their growth.
Here are some examples I found enlightening:
eBay Architecture: Nice history of their architecture and the issues they had. Obviously they can't use a lot of caching for the auctions and bids, so their story is different in that point from many others. As of 2006, they deployed 100,000 new lines of code every two weeks - and are able to roll back an ongoing deployment if issues arise.
Paper on Google File System: Nice analysis of what they needed, how they implemented it and how it performs in production use. After reading this, I found it less scary to build parts of the infrastructure myself to meet exactly my needs, if necessary, and that such a solution can and probably should be quite simple and straight-forward. There is also a lot of interesting stuff on the net (including YouTube videos) on BigTable and MapReduce, other important parts of Google's architecture.
Inside MySpace: One of the few really huge sites built on the Microsoft stack. You can learn a lot about what not to do with your data layer.
A great start for finding many more resources on this topic is the Real Life Architectures section on the "High Scalability" web site. For example, they have a good summary of Amazon's architecture.
Learning distributed computing isn't easy. It's really a very vast field covering areas such as communication, security, reliability, concurrency etc., each of which would take years to master. Understanding will eventually come through a lot of reading and practical experience. You seem to have a challenging project to start with, so here's your chance :)
The two most popular books on distributed computing are, I believe:
1) Distributed Systems: Concepts and Design - George Coulouris et al.
2) Distributed Systems: Principles and Paradigms - A. S. Tanenbaum and M. Van Steen
Both these books give a very good introduction to current approaches (including communication protocols) that are being used to build successful distributed systems. I've personally used the latter mostly and I've found it to be an excellent text. If you think the reviews on Amazon aren't very good, it's because most readers compare this book to other books written by A.S. Tanenbaum (who IMO is one of the best authors in the field of Computer Science), which are quite frankly better written.
PS: I really question your need to design and verify a new protocol. If you are working with application servers and databases, what you need is probably already available.
I liked the book Distributed Systems: Principles and Paradigms by Andrew S. Tanenbaum and Maarten van Steen.
At a more abstract and formal level, Communicating and Mobile Systems: The Pi-Calculus by Robin Milner gives a calculus for verifying systems. There are variants of pi-calculus for verifying protocols, such as SPI-calculus (the wikipedia page for which has disappeared since I last looked), and implementations, some of which are also verification tools.
Where can I find good material about designing distributed systems?
I have never been able to finish the famous book from Nancy Lynch. However, I find that the book from Sukumar Ghosh Distributed Systems: An Algorithmic Approach is much easier to read, and it points to the original papers if needed.
It is nevertheless true that I didn't read the books from Gerard Tel and Nicola Santoro. Perhaps they are still easier to read...
What methods are there for verifying that a distributed protocol works right?
In order to survey the possibilities (and also in order to understand the question), I think that it is useful to get an overview of the possible tools from the book Software Specification Methods.
My final decision was to learn TLA+. Why? Even if other languages and tools may seem better, I really decided to try TLA+ because the guy behind it is Leslie Lamport. That is, not just a prominent figure in distributed systems, but also the author of LaTeX!
You can get the TLA+ book and several examples for free.
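To give a feel for what a model checker does with a protocol specification, here is a toy explicit-state checker in Python. It is only a sketch of the general idea (enumerate every reachable state, check an invariant in each one, report states with no successors as deadlocks) and is not how TLA+ specs are written. The two lock protocols, the state encoding and all names are hypothetical.

```python
from collections import deque


def explore(initial, next_states, invariant):
    """Breadth-first search over reachable states.

    Returns (first state violating the invariant or None,
             deadlocked states found so far, number of states discovered).
    """
    seen = {initial}
    queue = deque([initial])
    deadlocks = []
    while queue:
        state = queue.popleft()
        if not invariant(state):
            return state, deadlocks, len(seen)
        successors = list(next_states(state))
        if not successors:
            deadlocks.append(state)
        for s in successors:
            if s not in seen:
                seen.add(s)
                queue.append(s)
    return None, deadlocks, len(seen)


def atomic_lock(state):
    """Each process acquires the lock in one atomic test-and-set step."""
    pcs, lock = state
    for i, pc in enumerate(pcs):
        if pc == "idle" and not lock:
            yield (pcs[:i] + ("crit",) + pcs[i + 1:], True)
        elif pc == "crit":
            yield (pcs[:i] + ("idle",) + pcs[i + 1:], False)


def racy_lock(state):
    """Broken variant: checking the lock and setting it are two separate steps."""
    pcs, lock = state
    for i, pc in enumerate(pcs):
        if pc == "idle" and not lock:
            yield (pcs[:i] + ("ready",) + pcs[i + 1:], lock)
        elif pc == "ready":
            yield (pcs[:i] + ("crit",) + pcs[i + 1:], True)
        elif pc == "crit":
            yield (pcs[:i] + ("idle",) + pcs[i + 1:], False)


def mutual_exclusion(state):
    """Invariant: at most one process is in its critical section."""
    pcs, _ = state
    return pcs.count("crit") <= 1


INITIAL = (("idle", "idle"), False)

if __name__ == "__main__":
    print(explore(INITIAL, atomic_lock, mutual_exclusion))
    print(explore(INITIAL, racy_lock, mutual_exclusion))
```

The second call finds the interleaving where both processes see the lock as free before either sets it. Bugs like that are easy to miss by inspection, and they are the reason exhaustive state exploration is worth the effort for distributed protocols.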
There are many classic papers written by Leslie Lamport (http://research.microsoft.com/en-us/um/people/lamport/pubs/pubs.html) and Edsger Dijkstra (http://www.cs.utexas.edu/users/EWD/).
For the database side, one mainstream direction is the NoSQL movement. Many projects are appearing in the market, including CouchDB (couchdb.apache.org), MongoDB, and Cassandra. These all promise scalability and manageability (replication, fault tolerance, high availability).
One good book is Birman's Reliable Distributed Systems, although it has its detractors.
If you want to formally verify your protocol you could look at some of the techniques in Lynch's Distributed Algorithms.
It is likely that whatever protocol you are trying to implement has been designed and analysed before. I'll just plug my own blog, which covers e.g. consensus algorithms.

Resources