I'm really new to neural networks, and I'm trying to use one in my recommendation system, which recommends users based on user similarity.
The thing is that I have 4 different user similarities computed from different parameters, and I use weights to set the importance of each similarity in the total similarity.
region similarity = 0.5, weightRegion=0.6
interests similarity = 0.3, weightInterest=0.8
education similarity = 0.75, weightEducation=1.1
positions similarity = 0.6, weightPositions=1.5
so the total similarity is the weighted sum divided by the sum of the weights: (0.5*0.6+0.3*0.8+0.75*1.1+0.6*1.5)/(0.6+0.8+1.1+1.5)
// I divide by the sum of the weights to keep the result in [0, 1]
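For illustration, a minimal Python sketch of that weighted average (the variable names are mine):
similarities = [0.5, 0.3, 0.75, 0.6]   # region, interests, education, positions
weights = [0.6, 0.8, 1.1, 1.5]

total = sum(s * w for s, w in zip(similarities, weights)) / sum(weights)
print(total)  # 0.56625 -- stays in [0, 1] because we divide by the weight sum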
So the thing is, I need to adjust those weights based on the user's rating (the user clicks a rating from 1 to 10 and the weights are corrected).
I've built a NN for this. What I'm doing is:
n = 0.25;                                    // learning rate
rating = 0.7;                                // the user's rating of 7, scaled to [0, 1]
net5 = x1*w15 + x2*w25 + x3*w35 + x4*w45;    // weighted input to output neuron 5
out5 = 1/(1 + pow(e, -net5));                // sigmoid activation
real = out5*(2 - rating);                    // target value derived from the rating
err = out5*(1 - out5)*(real - out5);         // delta-rule error term
w15n = w15 + err*n*x1;
w25n = w25 + err*n*x2;
w35n = w35 + err*n*x3;
w45n = w45 + err*n*x4;
What am I doing wrong? The results of this correction aren't good at all.
Thanks
I think you are going the wrong way. Backpropagation isn't a good choice for this type of (more or less incremental) learning.
To use backpropagation you need data, say 1000 examples where the different similarity types (input) and the true similarity (output) are given. The weights then update repeatedly until the error rate comes down. Besides that, you need a test data set, which gives you confidence that the resulting network will do well even on similarity values it didn't see during training.
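As a hedged illustration of that batch setup, here is a minimal Python sketch that fits four weights on made-up labeled pairs with plain gradient descent (a linear stand-in for a real network; the data and targets are synthetic):
import numpy as np

# Hypothetical training data: each row holds the four per-parameter
# similarities; y holds the corresponding "true" total similarity.
X = np.random.rand(1000, 4)
true_w = np.array([0.3, 0.2, 0.25, 0.25])
y = X @ true_w

w = np.random.rand(4)                  # weights to learn
lr = 0.1
for _ in range(1000):
    grad = X.T @ (X @ w - y) / len(X)  # gradient of the mean squared error
    w -= lr * grad

# A held-out test set shows whether the fit generalizes.
X_test = np.random.rand(200, 4)
print(np.mean((X_test @ w - X_test @ true_w) ** 2))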
I have a dataset with a sparse utility matrix (user-product) with binary input: 1 if user i bought product j, and 0 if they haven't.
However, it has a different meaning on the test set: there, 0 means that we don't know whether the user bought the product, and 1 means that we're sure the user bought it.
I need to get, for each user and each product, the probability that user i bought product j in the test set. For this I want to use matrix factorization techniques like FunkSVD, NMF or SVD++, but I'm quite confused:
these techniques would only give me the label (1 or 0) on the test set, whereas I need to compute the probability of getting 1, not the label itself.
How can I approach this problem? Or should I treat it as a classification problem and use common classification techniques?
One approach that might help is based on Recurrent-Knowledge-Graph-Embedding.
Try converting the user-item interactions into a knowledge graph and mining the top-n paths in the graph between user_i and item_j. Then build an RNN model, as described in the paper and its code, to get probabilities.
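As a rough sketch of just the graph-construction and path-mining step (the toy interactions and node names below are made up, and the RNN scoring from the paper is not shown), one could use networkx:
import networkx as nx

# Toy user-item interactions; in practice these come from your utility matrix.
interactions = [("user_1", "item_a"), ("user_1", "item_b"),
                ("user_2", "item_a"), ("user_2", "item_c")]
G = nx.Graph()
G.add_edges_from(interactions)

# Paths between a user and a candidate item; these are what the RNN
# from the paper would consume to produce a probability.
paths = list(nx.all_simple_paths(G, "user_2", "item_b", cutoff=4))
print(paths)  # [['user_2', 'item_a', 'user_1', 'item_b']]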
I have a problem and was wondering if I can use deep learning to solve it.
I have lists of 7 features, and for each list I have 7 scores.
For example, for the features:
[0.2,0.6,0.2,0.6,0.1,0.3,0.1]
I have the following scores:
[100,0,123,2,14,15,2]
and for the features:
[0.1,0.2,0.3,0.6,0.5,0.1,0.2]
I have the following scores:
[10,10,13,22,4,135,22]
etc.
Any ideas on how to use deep learning to train a network that, given a list of features, will give me back the correct scores?
Thanks
You have the basic setup here for a regression problem. You could try solving this problem using a neural network toolkit. I wrote a toolkit called theanets that might help, so I'll give a simple example of how you might use it:
import numpy as np
import theanets
# set up data arrays: X is input, Y is target output
X = np.array([
    [0.2, 0.6, 0.2, 0.6, 0.1, 0.3, 0.1],
    [0.1, 0.2, 0.3, 0.6, 0.5, 0.1, 0.2],
], 'f')
Y = np.array([
    [100, 0, 123, 2, 14, 15, 2],
    [10, 10, 13, 22, 4, 135, 22],
], 'f')
# set up a regression model:
# map from X to Y using one hidden layer.
exp = theanets.Experiment(
    theanets.Regressor,
    (X.shape[1], 100, Y.shape[1]))
# train the model using rmsprop.
exp.train([X, Y], algorithm='rmsprop')
# predict outputs for some inputs.
Yhat = exp.network.predict(X)
There are several options for configuring and training your model; have a look at the documentation for more info.
There are also many, many other neural network toolkits out there. Here are just a few popular ones that I'm familiar with:
Lasagne
Keras
Caffe
You might want to give these a try to see whether they fit better with your mental model of the problem you're trying to solve.
1. You generate a big number of neural networks.
2. You give each neural net a fitness score based on its results (the higher the fitness score, the better).
3. You sort the neural nets by their fitness score.
4. You take the first x%.
5. You apply small mutations to each selected neural net.
6. Repeat steps 2-5 until the results are satisfactory.
The big number mentioned in step 1 should be roughly equal to:
(100/x)^generationCount
where x is the same as in step 4 and generationCount is the number of generations until the final result.
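Here is a minimal Python sketch of that loop, assuming each network can be reduced to a flat weight vector; the fitness function is a stand-in for actually running the network, and the sketch refills the population with mutants each generation rather than starting from one huge batch:
import numpy as np

def fitness(weights):
    # Stand-in: in practice, run the network and score its results.
    return -np.sum((weights - 0.5) ** 2)

pop_size, keep, generations = 100, 0.10, 20   # keep is the x% from step 4
population = [np.random.rand(8) for _ in range(pop_size)]

for _ in range(generations):
    # Steps 2-3: score the nets and sort them, best first.
    population.sort(key=fitness, reverse=True)
    # Step 4: take the first keep*100 percent.
    survivors = population[:int(pop_size * keep)]
    # Step 5: refill the population with small mutations of the survivors.
    population = [s + np.random.normal(0, 0.05, s.shape)
                  for s in survivors for _ in range(int(1 / keep))]

best = max(population, key=fitness)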
I wrote a multilayer perceptron implementation (in Python) which is able to classify the Iris dataset. It was trained with the backpropagation algorithm and uses sigmoid activation functions on the hidden and output layers.
But now I want to change it to be able to approximate house prices.
(I have a dataset of ~300 estates with prices and input parameters like rooms, location, etc.)
Right now the output of my perceptron is in the range [0, 1]. But as far as I understand, if I want to get the resulting house price on the output neuron, I need to change that activation function somehow, right?
Can somebody help me?
I'm new to neural networks.
Thanks in advance.
Assuming, for instance, that house prices range between $1 and $1,000,000, you can just map the [0, 1] range to the price range, both for training and for testing. Just note that 300 estates is a fairly small data set.
To be precise, if a house is $500k, then the target training output becomes 0.5. You basically divide by the maximum possible home value to get the target training amount. When you get the output value, you multiply by the maximum home value to get the predicted price.
So, view the output of the neural network as a fraction of the maximum price.
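A tiny Python sketch of that mapping, assuming a maximum possible price of $1,000,000:
MAX_PRICE = 1_000_000  # assumed price ceiling for the data set

def to_target(price):            # training: price -> [0, 1]
    return price / MAX_PRICE     # $500,000 -> 0.5

def to_price(output):            # prediction: network output -> price
    return output * MAX_PRICE    # 0.5 -> $500,000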
I would like to use a genetic program (GP) to estimate the probability of an 'outcome' from an 'event'. To train it I am using a genetic algorithm.
So, in my database I have many events, each event containing many possible outcomes.
I will give the GP a set of input variables that relate to each outcome in each event.
My question is: what should the fitness function in the GP be?
For instance, right now I am giving the GP a set of input data (outcome input variables) and a set of target data (1 if the outcome DID occur, 0 if the outcome DIDN'T occur), with the fitness function being the mean squared error of the outputs and targets. I then take the sum of the outputs over all outcomes in an event, and divide each output by that sum (to give a 'probability'). However, I know for sure that this is not the right way to be doing this.
For clarity, this is how I am CURRENTLY doing it:
I would like to estimate the probability of 5 different outcomes occurring in an event:
Outcome 1 - inputs = [0.1, 0.2, 0.1, 0.4]
Outcome 2 - inputs = [0.1, 0.3, 0.1, 0.3]
Outcome 3 - inputs = [0.5, 0.6, 0.2, 0.1]
Outcome 4 - inputs = [0.9, 0.2, 0.1, 0.3]
Outcome 5 - inputs = [0.9, 0.2, 0.9, 0.2]
I will then calculate the GP output for each outcome's inputs:
Outcome 1 - output = 0.1
Outcome 2 - output = 0.7
Outcome 3 - output = 0.2
Outcome 4 - output = 0.4
Outcome 5 - output = 0.4
The sum of the outputs for this event would be 1.80. I would then calculate the 'probability' of each outcome by dividing its output by the sum:
Outcome 1 - p = 0.055
Outcome 2 - p = 0.388
Outcome 3 - p = 0.111
Outcome 4 - p = 0.222
Outcome 5 - p = 0.222
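In Python, that normalization is simply (a sketch that reproduces the numbers above, up to rounding):
outputs = [0.1, 0.7, 0.2, 0.4, 0.4]    # GP outputs for outcomes 1-5
total = sum(outputs)                   # 1.80
probs = [o / total for o in outputs]
print([round(p, 3) for p in probs])    # [0.056, 0.389, 0.111, 0.222, 0.222]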
Before you start: I know that these aren't real probabilities, and that this approach does not work! I just put this here to help you understand what I am trying to achieve.
Can anyone give me some pointers on how I can estimate the probability of each outcome? (Also, please note my maths is not great.)
Many thanks
I understand the first part of your question: what you described is a classification problem. You're learning whether your inputs relate to an outcome being observed (1) or not (0).
There are difficulties with the second part, though. If I understand you correctly, you take the raw GP output for a certain row of inputs (e.g. 0.7) and treat it as a probability. You said this doesn't work, obviously. In GP you can do classification by introducing a threshold value that splits your classes: if the output is bigger than, say, 0.3, the outcome should be 1; if it's smaller, it should be 0. This threshold isn't necessarily 0.5 (again, it's just a number, not a probability).
I think if you want to obtain a probability you should attempt to learn multiple models that all explain your classification problem well. I don't expect you have a perfect model that explains your data perfectly; if you had one, you wouldn't want a probability anyway. You can bag these models together (create an ensemble), and for each outcome you can observe how many models predicted 1 and how many predicted 0. The number of models that predicted 1 divided by the total number of models can then be interpreted as the probability that this outcome will be observed. If the models are all equally good you can forgo weighting between them; if they differ in quality you could of course factor that into your decision. Models with lower quality on their training set are less likely to contribute to a good estimate.
So, in summary, you should attempt to apply GP e.g. 10 times and then use all 10 models on the training set to calculate their estimates (0 or 1). However, don't force yourself to use GP only; there are many classification algorithms that can give good results.
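A minimal Python sketch of that bagging idea, with the ten models stubbed out as random threshold classifiers rather than real GP models:
import random

# Stand-ins for e.g. ten GP classifiers trained on the same problem;
# each one votes 0 or 1 for a given input.
models = [lambda x, t=random.random(): int(x > t) for _ in range(10)]

def ensemble_probability(models, inputs):
    votes = [model(inputs) for model in models]   # each vote is 0 or 1
    return sum(votes) / len(models)               # fraction of models voting 1

print(ensemble_probability(models, 0.6))  # e.g. 0.7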
As a sidenote, I'm part of the development team of HeuristicLab, an open-source piece of software that runs under Windows and with which you can run GP and create such ensembles.
AI is all about complex algorithms, and the downside is very often that these algorithms become black boxes. Algorithms such as NNs and GAs are inherently opaque. That is what you want if you want a car to drive itself; on the other hand, it means you need tools to look into the black box.
What I'm saying is that a GA is probably not what you want to solve your problem. If you want to solve AI-type problems, you first have to know how to use standard techniques such as regression, LDA, etc.
Combining an NN and a GA is usually a bad sign, because you are stacking one black box on another; I believe this is bad design. An NN and a GA are nothing more than non-linear optimizers. I would suggest you look at principal component analysis (PCA), SVD and linear classifiers first (see Wikipedia). If you can solve simple statistical problems, move on to more complex ones. Check out the great textbook by Russell/Norvig, and read some of their source code.
To answer the question, one really has to look at the dataset extensively. If you are working on a small problem, define the probabilities etc., and you might get an answer here. Perhaps check out Bayesian statistics as well. This should get you started, I believe.
I once wrote a Tetris AI that played Tetris quite well. The algorithm I used (described in this paper) is a two-step process.
In the first step, the programmer decides to track inputs that are "interesting" to the problem. In Tetris we might be interested in tracking how many gaps there are in a row because minimizing gaps could help place future pieces more easily. Another might be the average column height because it may be a bad idea to take risks if you're about to lose.
The second step is determining weights associated with each input. This is the part where I used a genetic algorithm. Any learning algorithm will do here, as long as the weights are adjusted over time based on the results. The idea is to let the computer decide how the input relates to the solution.
Using these inputs and their weights we can determine the value of taking any action. For example, if putting the straight line shape all the way in the right column will eliminate the gaps of 4 different rows, then this action could get a very high score if its weight is high. Likewise, laying it flat on top might actually cause gaps and so that action gets a low score.
I've always wondered if there's a way to apply a learning algorithm to the first step, where we find "interesting" potential inputs. It seems possible to write an algorithm where the computer first learns what inputs might be useful, then applies learning to weigh those inputs. Has anything been done like this before? Is it already being used in any AI applications?
In neural networks, you can select 'interesting' potential inputs by finding the ones that have the strongest correlation, positive or negative, with the classifications you're training for. I imagine you can do similarly in other contexts.
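A possible sketch of that correlation screen in Python with numpy (the data here is a random stand-in):
import numpy as np

X = np.random.rand(500, 20)   # candidate inputs (e.g. board statistics)
y = np.random.rand(500)       # training targets / classifications

# Correlation of each candidate input with the target; keep the strongest.
corr = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
interesting = np.argsort(corr)[::-1][:5]   # indices of the 5 strongest inputs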
I think I might approach the problem you're describing by feeding more primitive data to a learning algorithm. For instance, a Tetris game state may be described by the list of occupied cells. A string of bits describing this information would be a suitable input to that stage of the learning algorithm. Actually training on that is still challenging, though: how do you know whether the results are useful? I suppose you could roll the whole algorithm into a single blob, where the algorithm is fed the successive states of play and the output is just the block placements, with higher-scoring algorithms selected for future generations.
Another choice might be to use a large corpus of plays from other sources, such as recorded plays from human players or a hand-crafted AI, and select the algorithms whose outputs bear a strong correlation to some interesting fact from the future play, such as the score earned over the next 10 moves.
Yes, there is a way.
If you choose M selected features there are 2^M subsets, so there is a lot to look at.
I would do the following:
For each subset S:
    run your code to optimize the weights W
    save S and the corresponding W
Then, for each pair (S, W), you can run G games and save the score L for each one. Now you have a table like this:
feature1  feature2  feature3  featureM  subset_code  game_number  scoreL
1         0         1         1         S1           1            10500
1         0         1         1         S1           2            6230
...
0         1         1         0         S2           G + 1        30120
0         1         1         0         S2           G + 2        25900
Now you can run some component selection algorithm (PCA, for example) and decide which features are worth keeping to explain scoreL.
A tip: when running the code to optimize W, seed the random number generator so that each different 'evolving brain' is tested against the same piece sequence.
I hope this helps!
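A rough Python sketch of the subset loop above, with the GA weight optimizer and the game itself stubbed out (the feature names are made up):
import itertools, random

def optimize_weights(subset):        # stub for your GA weight optimizer
    return [random.random() for _ in subset]

def play_game(subset, weights):      # stub: play one game, return its score
    return random.randint(0, 30000)

features = ["gaps", "avg_height", "bumpiness", "holes"]  # the M features
G = 5                                # games per subset
rows = []
for r in range(1, len(features) + 1):
    for subset in itertools.combinations(features, r):
        random.seed(42)              # same piece sequence for every subset
        W = optimize_weights(subset)
        for _ in range(G):
            rows.append((subset, W, play_game(subset, W)))
# rows is the table above, ready for a selection method such as PCA.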