Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more.
Closed 7 years ago.
How would one design a neural network for a recommendation engine? I assume each user would require their own network, but how would you design the inputs and outputs for recommending an item in a database? Are there any good tutorials on this?
Edit: I was thinking more about how one would design the network itself: how many input neurons there should be, and how the output neurons would point to a record in a database. Would you have, say, 6 output neurons, convert them to an integer (anything from 0 to 63), and use that as the ID of the record in the database? Is that how people do it?
I would suggest looking into neural networks that use unsupervised learning, such as self-organising maps. It's very difficult to use normal supervised neural networks to do what you want unless you can classify the data very precisely for learning. Self-organising maps don't have this problem, because the network learns the classification groups all on its own (a small sketch follows the links below).
Have a look at this paper, which describes a self-organising-map-based music recommendation system:
http://www.springerlink.com/content/xhcyn5rj35cvncvf/
There are many more papers written about the topic on Google Scholar:
http://www.google.com.au/search?q=%09+A+Self-Organizing+Map+Based+Knowledge+Discovery+for+Music+Recommendation+Systems+&ie=utf-8&oe=utf-8&aq=t&rls=com.ubuntu:en-US:official&client=firefox-a&safe=active
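To make the self-organising-map idea concrete, here is a minimal NumPy sketch (not the system from the paper): the grid size, the training schedule, and the per-track features are all invented for illustration.

```python
import numpy as np

def train_som(data, grid_w=5, grid_h=5, epochs=200, lr0=0.5, sigma0=2.0):
    """Train a tiny self-organising map and return its weight grid."""
    n_features = data.shape[1]
    rng = np.random.default_rng(0)
    weights = rng.random((grid_w, grid_h, n_features))
    # Grid coordinates of every node, used by the neighbourhood function.
    coords = np.stack(np.meshgrid(np.arange(grid_w), np.arange(grid_h), indexing="ij"), axis=-1)
    for t in range(epochs):
        lr = lr0 * np.exp(-t / epochs)        # decaying learning rate
        sigma = sigma0 * np.exp(-t / epochs)  # shrinking neighbourhood radius
        for x in data:
            # Best-matching unit: the node whose weights are closest to the sample.
            bmu = np.unravel_index(np.argmin(np.linalg.norm(weights - x, axis=2)), (grid_w, grid_h))
            # Pull the BMU and its grid neighbours towards the sample.
            grid_dist = np.linalg.norm(coords - np.array(bmu), axis=2)
            influence = np.exp(-(grid_dist ** 2) / (2 * sigma ** 2))
            weights += lr * influence[..., None] * (x - weights)
    return weights

# Hypothetical per-track features, e.g. [tempo, energy, acousticness], scaled to 0..1.
tracks = np.array([[0.9, 0.8, 0.1], [0.85, 0.9, 0.2], [0.2, 0.1, 0.9], [0.3, 0.2, 0.95]])
som = train_som(tracks)
# Tracks whose best-matching units fall on the same (or adjacent) grid cells end up in the
# same learned group and can be treated as candidates to recommend to a user who liked one of them.
```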
First you have to decide what exactly you are recommending and under what circumstances. There are many things to take into account. Are you going to consider the "other users who bought X also bought Y?" Are you going to only recommend items that have a similar nature to each other? Are you recommending items that have a this-one-is-more-useful-with-that-one type of relationship?
I'm sure there are many more decisions, and each one has its own goals in mind. It would be very difficult to train one giant network to handle all of the above.
Neural networks all boil down to the same thing. You have a given set of inputs. You have a network topology. You have an activation function. You have weights on the nodes' inputs. You have outputs, and you have a means to measure and correct error. Each type of neural network might have its own way of doing each of those things, but they are present all the time (to my limited knowledge). Then, you train the network by feeding in a series of input sets that have known output results. You run this training set as much as you'd like without over or under training (which is as much your guess as it is the next guy's), and then you're ready to roll.
Essentially, your input set can be described as a certain set of qualities that you believe have relevance to the underlying function at hand (for instance: precipitation, humidity, temperature, illness, age, location, cost, skill, time of day, day of week, work status, and gender may all have an important role in deciding whether or not a person will go golfing on a given day). You must therefore decide what exactly you are trying to recommend and under what conditions. Your network inputs can be boolean in nature (0.0 being false and 1.0 being true, for instance) or mapped into a pseudo-continuous space (where 0.0 may mean not at all, 0.45 means somewhat, 0.8 means likely, and 1.0 means yes). This second option may give you the tools to express a confidence level for a certain input, or simply a calculation you believe is relevant.
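For instance, a minimal sketch of such an encoding; the feature names, scaling choices, and mapped values below are invented for illustration:

```python
def encode_golf_example(raw):
    """Map raw observations to a fixed-length input vector for the network.

    Boolean features become 0.0/1.0; fuzzier ones are mapped into [0, 1]
    to express degree or confidence.
    """
    sky_levels = {"clear": 0.0, "overcast": 0.45, "light rain": 0.8, "storm": 1.0}
    return [
        1.0 if raw["is_weekend"] else 0.0,  # boolean input
        1.0 if raw["is_ill"] else 0.0,      # boolean input
        raw["temperature_c"] / 40.0,        # crude scaling into roughly [0, 1]
        sky_levels[raw["precipitation"]],   # pseudo-continuous mapping
    ]

features = encode_golf_example(
    {"is_weekend": True, "is_ill": False, "temperature_c": 24, "precipitation": "overcast"}
)
# features == [1.0, 0.0, 0.6, 0.45]; the known target output during training
# could be 1.0 (went golfing) or 0.0 (did not).
```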
Hope this helped. You didn't give much to go on :)
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
What does the number of hidden layers in a multilayer perceptron neural network do to the way the network behaves? And the same question for the number of nodes in the hidden layers?
Let's say I want to use a neural network for handwritten character recognition. In this case I would use pixel colour intensity values as input nodes and character classes as output nodes.
How would I choose the number of hidden layers and the number of nodes to solve such a problem?
Note: this answer was correct at the time it was made, but has since become outdated.
It is rare to have more than two hidden layers in a neural network. The number of layers will usually not be a parameter of your network that you will worry much about.
Although multi-layer neural networks with many layers can represent deep circuits, training deep networks has always been seen as somewhat of a challenge. Until very recently, empirical studies often found that deep networks generally performed no better, and often worse, than neural networks with one or two hidden layers.
Bengio, Y. & LeCun, Y., 2007. Scaling learning algorithms towards AI. Large-Scale Kernel Machines, (1), pp.1-41.
The cited paper is a good reference for learning about the effect of network depth, recent progress in teaching deep networks, and deep learning in general.
The general answer for picking hyperparameters is to cross-validate: hold out some data, train networks with different configurations, and use the one that performs best on the held-out set.
Most of the problems I have seen were solved with 1-2 hidden layers. It has been proven that MLPs with only one hidden layer are universal function approximators (Hornik et al.). More hidden layers can make the problem easier or harder; you usually have to try different topologies. I have heard that you cannot add an arbitrary number of hidden layers if you want to train your MLP with backprop, because the gradient becomes too small in the first layers (this is the vanishing gradient problem; I have no reference for that). But there are some applications where people have used up to nine layers. Maybe you are interested in a standard benchmark problem which is solved by different classifiers and MLP topologies.
Besides the fact that cross-validation over different model configurations (number of hidden layers or neurons per layer) will lead you to a better configuration, one approach is to train a model as big and deep as possible and use dropout regularization to turn off some neurons and reduce overfitting.
The reference for this approach can be found in this paper:
https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf
All the above answers are of course correct, but just to add some more ideas:
Some general rules, based on the paper 'Approximating Number of Hidden layer neurons in Multiple Hidden Layer BPNN Architecture' by Saurabh Karsoliya, are the following:
In general:
The number of hidden-layer neurons should be about 2/3 (or 70% to 90%) of the size of the input layer. If this is insufficient, the number of output-layer neurons can be added later on.
The number of hidden-layer neurons should be less than twice the number of neurons in the input layer.
The size of the hidden layer should be between the input layer size and the output layer size.
Always keep in mind that you need to explore and try a lot of different combinations. Also, using GridSearch you can find the "best" model and parameters.
E.g. we can do a GridSearch in order to determine the "best" size of the hidden layer, as in the sketch below.
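A minimal scikit-learn sketch of that grid search, assuming an MLPClassifier and using the bundled digits data as a stand-in for your own; the candidate hidden-layer sizes are arbitrary:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)  # stand-in for your own data

# Arbitrary candidate topologies: three one-layer sizes and one two-layer option.
param_grid = {"hidden_layer_sizes": [(25,), (50,), (100,), (50, 25)]}

search = GridSearchCV(
    MLPClassifier(max_iter=500, random_state=0),
    param_grid,
    cv=5,                 # 5-fold cross-validation on the training data
    scoring="accuracy",
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```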
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I have written an artificial neural network (ANN) implementation for myself (it was fun). I am now thinking about where I can use it.
What are the key areas in the real world where ANNs are being used?
ANNs are an example of a "learning" system, one that "trains" on input data (in some domain) in order to effectively classify (unseen) data in that domain. They've been used for everything from character recognition to computer games and beyond.
If you're trying to find a domain, pick some topic or field that interests you, and see what kinds of classification problems exist there.
Most often for classifying noisy inputs into fixed categories, like handwritten letters into their equivalent character, spoken voice into phonemes, or noisy sensor readings into a set of fixed values. Usually, the set of categories is small (26 letters, a couple of dozen phonemes, etc.).
Others will point out how all these things are better done with specialized algorithms....
I once wrote an ANN to predict the stock market. It succeeded with about 80% accuracy.
The key here was to first get hold of a couple of million rows of real stock data. I used this data to train the network and prime it for real data. There were about 8-10 input variables and a single output value that indicated the predicted value of the stock on the next day.
You could also check out the (ancient) ALVINN network, where a car learned to drive by itself by observing road data while a human driver was behind the wheel.
ANNs are also widely used in bioinformatics.
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
I am looking for a graduation project idea in the AI and machine learning field.
The idea may require a front-end user interface to attract users.
I am thinking about how AI and machine learning can help in daily life.
Any help/hints about new, interesting ideas?
Thanks
Edit:
I am talking about practical ideas that may be used in real life, not an idea to prove theoretical things. Something like an OS (or an add-on for an existing one) that adapts to the way you work, or a word processor that helps you collect information about what you are writing.
What about a project that uses Markov-chain text generation to generate answers for Stack Overflow questions? (^__^)
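A minimal sketch of word-level Markov-chain text generation, with a toy corpus standing in for a real dump of answer text:

```python
import random
from collections import defaultdict

def build_chain(text, order=2):
    """Map each sequence of `order` words to the words that follow it in the corpus."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        chain[tuple(words[i:i + order])].append(words[i + order])
    return chain

def generate(chain, length=30):
    """Walk the chain from a random starting state, producing up to `length` words."""
    state = random.choice(list(chain.keys()))
    out = list(state)
    for _ in range(length):
        choices = chain.get(state)
        if not choices:        # dead end: no observed continuation
            break
        out.append(random.choice(choices))
        state = tuple(out[-len(state):])
    return " ".join(out)

corpus = "use a neural network to train a model and use the model to answer questions"
print(generate(build_chain(corpus), length=10))
```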
Two ideas:
Write a current, robust version of SHRDLU with understandable source code.
Write a SHRDLU-like program that manipulates actual code instead of imaginary blocks. Such a tool could be used for manipulating extremely large, complicated programs, including its own code!
Imagine giving commands like the following...
(a) Scan web site X and list any sentences you failed to parse.
(b) Scan document Y and list any grammar rules you didn't need.
(c) Instead of iterating over every element of "proplist" in your "search" function, only process the cdr of "proplist" if the initial call to "lookup" returns nil. After you make the modification, confirm the sentence "pick up a very very big block" will succeed and the sentence "pick up a very and very big block" will fail.
(d) Your "conjoin" grammar currently requires a coordinator word like "and", but that requirement is wrong. Split your "coordination" grammar into "syndetic coordination" and "asyndetic coordination" as follows: conjoins using "and", as in "quickly and quietly, he walked into the bank" are called "syndetic coordinations". Conjoins without a coordinator, as in "quickly, quietly, he walked into the bank" are "asyndetic coordinations". Now scan corpus Z to see if fewer sentences fail to parse.
One component of intelligence is imagination.
It wouldn't take much to Google for "artificial intelligence research projects" and see what other people are doing at other schools. Since it's not a Ph.D., there's no uniqueness requirement for you.
You could also look at Peter Norvig's text to see what's been done before and adapt it.
I'd also recommend doing something with the reams of data that's available to you on the web. Try thinking about "Programming Collective Intelligence" and "Beautiful Data" to see how you could use information to teach a program how to adapt its behavior based on new information (neural nets, genetic algorithms, ant colony algorithms, etc.)
What interests you?
AI is used in a great many areas, so find something you are passionate about and then see how to use AI for it.
For example, if you are interested in games, you could come up with an interesting chase algorithm for the ghosts in Pac-Man and use some more interesting mazes. You may find someone who is interested in doing a 3D project; they could write a 3D version while you make the algorithm more interesting.
Or, you may be interested in robotics. Again, it would be ideal if you could find someone with an interest in making a robot and you could write the AI part. So, for example, you could see if you can figure out how to get a robot to determine the difference between a farm crop and a weed/grass.
Basically, your starting point should be on what really interests you.
Perform clustering on documents, as done by the former Clusty search engine. This is clearly an attractive application.
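A minimal scikit-learn sketch of that kind of document clustering; the documents and the number of clusters are placeholders:

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [  # placeholder documents; in practice these would be search results or web pages
    "neural networks for image recognition",
    "convolutional networks classify images",
    "stock market price prediction",
    "predicting share prices with time series",
]

# Turn each document into a TF-IDF vector, then group similar vectors.
tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(docs)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
for doc, label in zip(docs, kmeans.labels_):
    print(label, doc)
```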
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 1 year ago.
We need to decide between Support Vector Machines and the Fast Artificial Neural Network library (FANN) for a text processing project.
It includes contextual spelling correction and then tagging the text with certain phrases and their synonyms.
Which would be the right approach? Or is there an alternative to both of these, something more appropriate than FANN as well as SVM?
I think you'll get competitive results from both of the algorithms, so you should aggregate the results... think about ensemble learning.
Update:
I don't know if this is specific enough: use a Bayes Optimal Classifier to combine the predictions from each algorithm. You have to train both of your algorithms, then train the Bayes Optimal Classifier to use your algorithms and make optimal predictions based on their output.
Separate your training data into 3 sets (a rough sketch follows the list):
The 1st data set will be used to train the (artificial) neural network and the support vector machine.
The 2nd data set will be used to train the Bayes Optimal Classifier, using the raw predictions from the ANN and SVM as its input.
The 3rd data set will be your qualification data set, where you test your trained Bayes Optimal Classifier.
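Here is a rough sketch of that three-way split using scikit-learn, with logistic regression standing in as a practical combiner (the Bayes Optimal Classifier itself is a theoretical construct); the data and model settings are placeholders:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Placeholder data; in practice these would be your text features and labels.
X, y = make_classification(n_samples=3000, n_features=20, random_state=0)

# Split into the three sets described above: base training, combiner training, evaluation.
X_base, X_rest, y_base, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_comb, X_eval, y_comb, y_eval = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

ann = MLPClassifier(max_iter=500, random_state=0).fit(X_base, y_base)
svm = SVC(probability=True, random_state=0).fit(X_base, y_base)

def base_predictions(X_in):
    """Stack the two models' class probabilities into features for the combiner."""
    return np.column_stack([ann.predict_proba(X_in)[:, 1], svm.predict_proba(X_in)[:, 1]])

combiner = LogisticRegression().fit(base_predictions(X_comb), y_comb)
print("ensemble accuracy:", combiner.score(base_predictions(X_eval), y_eval))
```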
Update 2.0:
Another way to create an ensemble of the algorithms is to use 10-fold (or more generally, k-fold) cross-validation:
Break data into 10 sets of size n/10.
Train on 9 datasets and test on 1.
Repeat 10 times and take the mean accuracy (see the sketch below).
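For example, a sketch of evaluating the two base classifiers with 10-fold cross-validation, again on placeholder data:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)  # placeholder data

for name, model in [("SVM", SVC()), ("ANN", MLPClassifier(max_iter=500, random_state=0))]:
    scores = cross_val_score(model, X, y, cv=10)  # 10 train/test splits
    print(name, scores.mean())
```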
Remember that you can generally combine many classifiers and validation methods in order to produce better results. It's just a matter of finding what works best for your domain.
You might also want to take a look at maxent classifiers (log-linear models).
They're really popular for NLP problems. Modern implementations, which use quasi-Newton methods for optimization rather than the slower iterative scaling algorithms, train more quickly than SVMs. They also seem to be less sensitive to the exact value of the regularization hyperparameter. You should probably only prefer SVMs over maxent if you'd like to use a kernel to get feature conjunctions for free.
As for SVMs vs. neural networks, using SVMs would probably be better than using ANNs. Like maxent models, training SVMs is a convex optimization problem. This means, given a data set and a particular classifier configuration, SVMs will consistently find the same solution. When training multilayer neural networks, the system can converge to various local minima. So, you'll get better or worse solutions depending on what weights you use to initialize the model. With ANNs, you'll need to perform multiple training runs in order to evaluate how good or bad a given model configuration is.
This question is very old. A lot of developments have happened in the NLP area in the last 7 years.
Convolutional neural networks (CNNs) and recurrent neural networks (RNNs) evolved during this time.
Word embeddings: words appearing within similar contexts possess similar meanings. Word embeddings are pre-trained on a task where the objective is to predict a word based on its context.
CNN for NLP:
Sentences are first tokenized into words, which are then transformed into a word embedding matrix (i.e., the input embedding layer) of dimension d.
Convolutional filters are applied to this input embedding layer to produce feature maps.
A max-pooling operation on each filter's feature map obtains a fixed-length output and reduces the dimensionality of the output (see the sketch below).
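A minimal Keras sketch of that embedding + convolution + max-pooling pipeline; the vocabulary size, sequence length, and layer sizes are arbitrary:

```python
import tensorflow as tf

vocab_size, seq_len, embed_dim = 10000, 100, 128  # arbitrary choices

model = tf.keras.Sequential([
    tf.keras.Input(shape=(seq_len,)),                  # integer token ids per sentence
    tf.keras.layers.Embedding(vocab_size, embed_dim),  # token ids -> d-dimensional word embeddings
    # Convolutional filters slide over the embedded word sequence to produce feature maps.
    tf.keras.layers.Conv1D(filters=64, kernel_size=5, activation="relu"),
    # Max-pooling keeps the strongest activation per filter, giving a fixed-length vector.
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),    # e.g. binary sentence classification
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```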
Since CNNs have the shortcoming of not preserving long-distance contextual information, RNNs were introduced.
RNNs are specialized neural-based approaches that are effective at processing sequential information.
An RNN memorizes the results of previous computations and uses them in the current computation.
There are a few variations of RNNs: Long Short-Term Memory units (LSTMs) and Gated Recurrent Units (GRUs).
Have a look at the resources below:
deep-learning-for-nlp
Recent trends in deep learning paper
You can use a Convolutional Neural Network (CNN) or a Recurrent Neural Network (RNN) for NLP tasks. I think CNNs have achieved state-of-the-art results now.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I'm writing a genetic programming (GP) system (in C but that's a minor detail). I've read a lot of the literature (Koza, Poli, Langdon, Banzhaf, Brameier, et al) but there are some implementation details I've never seen explained. For example:
I'm using a steady state population rather than a generational approach, primarily to use all of the computer's memory rather than reserve half for the interim population.
Q1. In GP, as opposed to GA, when you perform crossover you select two parents but do you create one child or two, or is that a free choice you have?
Q2. In steady state GP, as opposed to a generational system, what members of the population do the children created by crossover replace? This is what I haven't seen discussed. Is it the two parents, or is it two other, randomly-selected members? I can understand if it's the latter, and that you might use negative tournament selection to choose members to replace, but would that not create premature convergence? (After a crossover event the population contains the two original parents plus two children of those parents, and two other random members get removed. Elitism is inherent.)
Q3. Is there a Web forum or mailing list focused on GP? Oddly I haven't found one. Yahoo's GP group is used almost exclusively for announcements, the Poli/Langdon Field Guide forum is almost silent, and GP discussions on general/game programming sites like gamedev.net are very basic.
Thanks for any help you can provide!
Firstly, relax.
There are no "correct" methods in GP. GP is more art than science. Try lots of schemes and pick the ones that work best.
Q1: 1, 2, or many. You choose.
Q2: Replace 1, 2, or all. Or try some elitism.
Q3: You probably won't find forums discussing these questions because there are no right/best answers. Sorry.
PS. In my research, crossover never really performed well...
If you can read Python, you may want to take a look at Pyevolve. I am mainly involved in it on the GA side, but it has support for GP as well. Maybe you can get some hints there.
Q1 is your choice, but a single child would probably be more common. Every time you do the lottery selection of parents, you're applying selection pressure, which is what you want.
Q2: Negative tournament selection is exactly the right approach. Yes, losing low-fitness members of the population causes rapid convergence initially, but once your population gets into the hard-to-search part of the solution space, it won't be as cut-and-dried which ones lose the tournament / lottery. What you do have to beware of is stagnation of the gene pool; I suggest monitoring the entropy of the genome to track its heterogeneity. "elitism is inherent" -- Well, yeah, that's the point! ;-)
Q3: comp.ai.genetic is probably your best bet. Sometimes the topic is picked up in game development fora, like on Gamasutra.
P.S. Genetic programming in C?!? How are you assuring the viability of the offspring? Doing genetic programming in a non-homoiconic language is a real challenge.
Check out MetaOptimize.com for your stacky needs.
As Ray says, it's mostly up to you, but typically in a steady-state setup you would only create a single offspring.
Again you have options. I wouldn't replace the parents: if they've been picked as parents based on their fitness, you could be eliminating some of the fittest members of the population. Easiest is just to randomly pick an individual to be replaced. Alternatively, you could replace the least fit individual, but that can lead to premature convergence. Another option is to use the same selection strategy that you use to choose parents, but with the inverse fitness so that it favours less fit individuals (see the sketch below).
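A small sketch of that last option, a negative (inverse-fitness) tournament for choosing which individual gets replaced in a steady-state loop; the population representation and tournament size are hypothetical:

```python
import random

def negative_tournament_index(population, fitnesses, k=3):
    """Pick k random members and return the index of the *least* fit one.

    Inverting the usual tournament favours weak individuals for replacement
    without always killing the single worst member of the population.
    """
    contenders = random.sample(range(len(population)), k)
    return min(contenders, key=lambda i: fitnesses[i])

def insert_offspring(population, fitnesses, child, child_fitness, k=3):
    """One steady-state step: a fresh offspring overwrites a tournament loser."""
    victim = negative_tournament_index(population, fitnesses, k)
    population[victim] = child
    fitnesses[victim] = child_fitness

# Hypothetical population of programs with fitness scores (higher is better).
population = ["prog_a", "prog_b", "prog_c", "prog_d"]
fitnesses = [0.9, 0.1, 0.5, 0.7]
insert_offspring(population, fitnesses, "prog_child", 0.6)
print(population, fitnesses)
```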
You could try comp.ai.genetic on USENET (and Google Groups).
It sounds like some of your questions are not necessarily specific to genetic programming; if that's true, you might have some luck asking the folks over at the NEAT Users Group.
They primarily discuss the Neuroevolution of Augmenting Topologies (or NEAT) algorithm, which is a genetic algorithm used to evolve neural networks. But topics like elitism and crossover strategies are pretty general, and can apply to both GA and GP algorithms.
Otherwise, as Dan and Ray have said, a lot of these decisions are made after experimentation with one's particular software and domain. Try applying your algorithm to different problems and pay attention to how it behaves -- after a while, you'll probably develop an intuition for what works and what doesn't.
I would create an unlimited number of offspring, but only on the basis of success, and let older members of the population die. Lack of fitness can also lead to early death. This just seems to follow a natural order.
Q1. In GP, as opposed to GA, when you perform crossover you select two parents but do you create one child or two, or is that a free choice you have?
Yes, it's your choice; but generally it's not advisable to create many individuals with the same parents, because the variation among individuals created by the same parents is very limited, and that would cost processing time and memory which could have been spent on other individuals showing different trends and behaviours worth analysing (though creating more individuals is not a problem if the evolution process is close to reaching its endpoint).
Q2. In steady state GP...
It is advisable to replace individuals based on the ranking provided by the fitness function you have adopted.