What distance functions have been implemented so far to find the distance between two nodes in distributed networks such as P2P? I mean, if each leaf node in a P2P tree network represents some data, there should be some defined ways to find the distance between these nodes. I want to know the general practices and the distributed functions that help us determine the similarity between these nodes.
If my question itself is wrong please forgive me.
I can think of a few distance functions like this. It depends on what your application cares about. What are you using this distance function for?
Latency. When nodes talk to each other they directly measure the Round Trip Time (RTT).
Bandwidth. When nodes talk to each other they directly measure their bytes/sec transfer rate.
IP prefix. Nodes with very similar IPs are probably close together, so 149.89.1.24 and 149.89.1.100 are probably very close together. This is a very coarse heuristic.
My advice is to directly and continuously measure whatever you pick as your distance metric. The distance metric will change over time, so measure it continuously. Any estimate that isn't based on the individual nodes taking measurements is likely to be wildly inaccurate. You should also remember that network distances are asymmetric: packets that flow from node A to node B might take an entirely different route than those flowing from B to A.
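If you go the latency route, a minimal sketch of what "measure it continuously" might look like is below; the `probe` callable is a placeholder for whatever request/response exchange your peers already do, and the smoothing factor is borrowed from TCP's SRTT estimator:

```python
import time

def measure_rtt(probe, samples=5):
    """Time several request/response round trips and return the latest
    smoothed estimate. `probe` is any callable that sends one request to
    the peer and blocks until the reply arrives (placeholder here)."""
    ALPHA = 0.125          # smoothing factor, same value TCP's SRTT estimator uses
    srtt = None
    for _ in range(samples):
        start = time.monotonic()
        probe()                          # e.g. a small UDP ping/pong exchange
        sample = time.monotonic() - start
        # exponentially weighted moving average: old estimates decay over time
        srtt = sample if srtt is None else (1 - ALPHA) * srtt + ALPHA * sample
    return srtt

# Example with a dummy probe that just sleeps to simulate network delay.
if __name__ == "__main__":
    print("smoothed RTT: %.3f s" % measure_rtt(lambda: time.sleep(0.02)))
```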
What are the distance functions so far implemented to find the distance between two nodes in distributed networks like p2p?
It depends on the method you are using (see CAN, Kademlia, Pastry (DHT), Tapestry (DHT), Koorde). But keep in mind that these distances are theoretical and not necessarily practical.
In a real P2P implementation on IPv4, all NAT-ed peers only need a reachable peer with a public address, meaning the 'distance' between two private peers is at most 2.
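For a concrete example of one of those theoretical metrics: Kademlia defines the distance between two node IDs as their bitwise XOR, read as an unsigned integer. A minimal sketch (tiny IDs used here for readability; real Kademlia IDs are 160-bit):

```python
def xor_distance(node_a: int, node_b: int) -> int:
    """Kademlia's distance metric: XOR the two node IDs and read the
    result as an unsigned integer. Smaller means 'closer' in ID space."""
    return node_a ^ node_b

# 160-bit IDs are typical in Kademlia; small numbers are used here for brevity.
a, b, c = 0b1011, 0b1001, 0b0100
assert xor_distance(a, b) < xor_distance(a, c)   # b is closer to a than c is
```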
I have to do a project for my college course in data structures using C and was wondering if anyone can tell me any real-life uses of data structures so that I can base my project on them.
Please keep in mind that it is only my first year of programming in C, so I currently do not have the skills to write very advanced code.
Since this is your first year of college, I wouldn't go into the depths of data structures and their uses.
Simplest use of a data structure: an English-to-English dictionary, which can be built using a hash table.
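A tiny sketch of that dictionary idea, using Python's built-in dict (itself a hash table) purely for brevity; in your C project you would build the table yourself from an array of buckets plus a hash function:

```python
# A miniature English-to-English dictionary backed by a hash table.
# Python's dict is a hash table; in C you would build one from an array
# of buckets plus a hash function such as djb2.
definitions = {
    "tree":  "a data structure of nodes linked without cycles",
    "queue": "a first-in, first-out collection",
    "stack": "a last-in, first-out collection",
}

word = "queue"
print(word, "->", definitions.get(word, "not found"))
```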
From here, you can go deeper into data structures:
Data structures in OS design, such as the memory manager (linked list + hash map)
B-trees in database design
Trees in filesystems
Graphs in electronic circuit simulation, AI, etc.
and many, many more.
Well, data structures improve the logic or performance of manipulations on your data. To see the latter, you could try the following. Generate a list of a million or more random numbers and try to find one of them in particular.
Try comparing the performance of the following two representations: an array and a sorted binary tree.
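A rough way to run that comparison yourself; Python's bisect on a sorted list stands in for the sorted binary tree here, since the point is a linear scan versus an O(log n) lookup (sizes and timings are illustrative):

```python
import random, time, bisect

N = 1_000_000
data = [random.randrange(10 * N) for _ in range(N)]
target = data[N // 2]

# Linear scan over the unsorted array: O(n) per lookup.
start = time.perf_counter()
found_linear = target in data
linear_time = time.perf_counter() - start

# Sorted array + binary search: O(log n) per lookup once sorted.
# (bisect stands in for a balanced binary search tree; the one-off sort
# cost is excluded because it pays off over many lookups.)
sorted_data = sorted(data)
start = time.perf_counter()
i = bisect.bisect_left(sorted_data, target)
found_binary = i < N and sorted_data[i] == target
binary_time = time.perf_counter() - start

print(f"linear scan: {linear_time:.4f}s (found={found_linear}), "
      f"binary search: {binary_time:.7f}s (found={found_binary})")
```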
There are lots of real uses of data structures, whether in your OS, in databases, etc.
Consider the case of MySQL (among others), which uses B-trees to manage records (http://dev.mysql.com/doc/refman/5.5/en/index-btree-hash.html).
Look at the list of data structures on Wikipedia. Most data structures have a real-world examples or applications section on their own description page.
This might not be the proper place to ask this question, but I didn't find a better place to ask it. I have a program that has, for example, 10 parameters. Every time I run it, it can lead to one of 3 results: 0, 0.5, or 1. I don't know how the parameters influence the final result. I need something to little by little improve my program so it gets more 1s and fewer 0s.
First, just to get the terminology right, this is really a "search" problem, not a "machine learning" problem (you're trying to find a very good solution, not trying to recognize how inputs relate to outputs). Your problem sounds like a classic "function optimization" search problem.
There are many techniques that can be used. The right one depends on a few different factors, but the biggest question is the size and shape of the solution space, and in particular: how sensitive is the output to small changes in the inputs? If you hold all the inputs except one the same and make a tiny change, are you going to get a huge change in the output or just a small change? Do the inputs interact with each other, especially in complex ways?
The smaller and "smoother" the solution space (that is, the less sensitive it is to tiny changes in inputs), the more you would want to pursue straightforward statistical techniques, guided search, or perhaps, if you wanted something a little more interesting, simulated annealing.
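For reference, a minimal, generic simulated annealing sketch over a real-valued parameter vector; the objective below is a toy placeholder for your actual program, and the step size, temperature, and cooling rate are illustrative, not tuned:

```python
import math, random

def simulated_annealing(score, start, step=0.1, T0=1.0, cooling=0.995, iters=5000):
    """Generic simulated annealing over a real-valued parameter vector.
    `score` is the black-box objective to maximize (a placeholder here);
    worse moves are sometimes accepted, more often while the temperature is high."""
    current, current_score, T = list(start), score(start), T0
    for _ in range(iters):
        candidate = [p + random.gauss(0, step) for p in current]
        delta = score(candidate) - current_score
        if delta > 0 or random.random() < math.exp(delta / T):
            current, current_score = candidate, current_score + delta
        T *= cooling                      # gradually become greedier
    return current, current_score

# Toy objective standing in for the real program's output.
best, best_score = simulated_annealing(lambda p: -sum(x * x for x in p),
                                       start=[random.uniform(-3, 3) for _ in range(10)])
print(round(best_score, 4))
```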
The larger and more complex the solution space, the more that would guide you towards either more sophisticated statistical techniques or my favorite class of algorithms, genetic algorithms, which can very rapidly search a large solution space.
Just to sketch out how you might apply genetic algorithms to your problem, let's assume that the inputs are independent from each other (a rare case, I know); a code sketch of these steps follows the list:
Create a mapping to your inputs from a series of binary digits 0011 1100 0100 ...etc...
Generate a random population of some significant size using this mapping
Determine the fitness of each individual in the population (in your case, "count the 1s" in the output)
Choose two "parents" by lottery:
For each half-point in the output, an individual gets a "lottery ticket" (in other words, an output that has 2 "1"s and 3 "0.5"s will get 7 "tickets" while one with 1 "1" and 2 "0.5"s will get 4 "tickets")
Choose a lottery ticket randomly. Since "more fit" individuals will have more "tickets" this means that "more fit" individuals will be more likely to be "parents"
Create a child from the parents' genomes:
Start copying one parent's genome from left to right 0011 11...
At every step, switch to the other parent with some fixed probability (say, 20% of the time)
The resulting child will have some amount of one parent's genome and some amount of the other's. Because the child was created from "high fitness" individuals, it is likely that the child will have a fitness higher than the average of the current generation (although it is certainly possible that it might have a lower fitness)
Replace some percentage of the population with children generated in this manner
Repeat from the "Determine fitness" step... In the ideal case, every generation will have an average fitness that is higher than the previous generation and you will find a very good (or maybe even ideal) solution.
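A compact sketch of that loop. The fitness function here is a toy stand-in for running your real program (you would replace it with a call to the black box and count the half-points in its output), and the genome length, population size, switch probability, and replacement rate are illustrative:

```python
import random

GENOME_LEN   = 20      # binary digits mapped onto the 10 program inputs (illustrative)
POP_SIZE     = 50
SWITCH_PROB  = 0.2     # chance of switching parents at each copied bit
GENERATIONS  = 100

def fitness(genome):
    """Stand-in for running the real program: award one 'ticket' per 0.5
    of output. Replace this with a call to the actual black box."""
    return sum(genome)   # toy rule: the more 1-bits, the fitter

def pick_parent(population, tickets):
    """Lottery (roulette-wheel) selection: fitter individuals hold more tickets."""
    return random.choices(population, weights=tickets, k=1)[0]

def crossover(mom, dad):
    """Copy bits left to right, switching source parent with SWITCH_PROB."""
    child, src = [], mom
    for i in range(GENOME_LEN):
        if random.random() < SWITCH_PROB:
            src = dad if src is mom else mom
        child.append(src[i])
    return child

population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]

for _ in range(GENERATIONS):
    tickets = [fitness(g) + 1 for g in population]        # +1 avoids zero weights
    children = [crossover(pick_parent(population, tickets),
                          pick_parent(population, tickets))
                for _ in range(POP_SIZE // 2)]
    # replace the weakest half of the population with the new children
    population.sort(key=fitness, reverse=True)
    population[POP_SIZE // 2:] = children

print("best fitness:", max(fitness(g) for g in population))
```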
Are you just trying to modify the parameters so the results come out to 1? It sounds like the program is a black box where you can pick the input parameters and then see the results. Since that is the case, I think it would be best to choose a range of input parameters, cycle through those inputs, and view the outputs to try to discern a pattern. If you can automate it, that will help out a lot. After you run through the data you may be able to spot-check which parameters give you which results, or you could apply some machine learning techniques to determine which parameters lead to which outputs.
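A minimal sketch of automating such a sweep with itertools.product; run_program is a placeholder for your black box, and the grid and number of parameters are shrunk so the example finishes quickly:

```python
import itertools

def run_program(params):
    """Placeholder for the real black-box program that returns 0, 0.5 or 1."""
    return 1.0 if sum(params) > 10 else 0.0   # dummy rule, for illustration only

# A coarse grid over each parameter; refine around promising regions later.
grid = [0, 5, 10]                                  # candidate values per parameter (illustrative)
best_score, best_params = -1, None
for params in itertools.product(grid, repeat=3):   # repeat=10 in the real 10-parameter case
    score = run_program(params)
    if score > best_score:
        best_score, best_params = score, params

print("best so far:", best_params, "->", best_score)
```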
As Larry said, this looks like a combinatorial search, and the solution will depend on the "topology" of the problem.
If you can, try to get The Algorithm Design Manual (S. Skiena); it has a chapter on this that can help you determine a good method for this problem.
I have to implement a sparse matrix (a matrix that has predominantly zeroes, so you only record values different than 0), but I have to implement it using a binary search tree.
EDIT:
So now I'm thinking of implementing it by using the row/column as a key, but what do I use as the root of that tree?
/EDIT
I hoped that once I researched binary search trees I would understand how this implementation would be beneficial, or at the very least possible, but for the life of me I can't figure it out.
I have tried Google to no avail, and I myself cannot imagine how to even attempt doing so.
I haven't decided on the language I shall be implementing this in yet, so I don't need code examples; my problem is the logic. I need to see how this would even work.
P.S. I have no idea what tags to use, if someone could edit some in, It'd be much appreciated.
To use a binary tree you need a key that is distinct for each (possible) entry in the matrix. So if you want to look up (2, 4) in a matrix [100, 100], the key could be something like "002004". With this key you can insert a value into the tree.
With each additional dimension the key gets longer, so you might also consider a hash function that hashes the coordinates of the cell; within the tree you then keep a list of entries for each hash key, and the tree is only an index to the right list. Within the list you need to perform a sequential search, or, if you order the list, you can improve on that with a binary search.
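A small sketch of the composite-key idea: each non-zero cell becomes a node in an (unbalanced) binary search tree keyed by its (row, col) pair, the root is simply whichever cell happens to be inserted first, and any key not found reads as 0:

```python
class Node:
    def __init__(self, key, value):
        self.key, self.value = key, value   # key is the (row, col) pair
        self.left = self.right = None

class SparseMatrix:
    """Sparse matrix as an (unbalanced) binary search tree keyed by (row, col).
    The root is simply whichever non-zero cell is inserted first."""
    def __init__(self):
        self.root = None

    def set(self, row, col, value):
        def insert(node, key):
            if node is None:
                return Node(key, value)
            if key < node.key:
                node.left = insert(node.left, key)
            elif key > node.key:
                node.right = insert(node.right, key)
            else:
                node.value = value          # overwrite an existing entry
            return node
        self.root = insert(self.root, (row, col))

    def get(self, row, col):
        node, key = self.root, (row, col)
        while node is not None:
            if key == node.key:
                return node.value
            node = node.left if key < node.key else node.right
        return 0                            # absent cells are implicitly zero

m = SparseMatrix()
m.set(2, 4, 7.5)
print(m.get(2, 4), m.get(0, 0))             # -> 7.5 0
```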
According to Wikipedia (which is a bad source, I know), a neural network is composed of:
An input layer of A neurons
Multiple (B) hidden layers, each composed of C neurons
An output layer of D neurons
I understand what the input and output layers mean.
My question is how to determine the optimal number of layers and neurons per layer.
What is the advantage/disadvantage of increasing B?
What is the advantage/disadvantage of increasing C?
What is the difference between increasing B vs. C?
Is it only a matter of time (limits of processing power), or will making the network deeper limit the quality of the results? Should I focus more on depth (more layers) or on breadth (more neurons per layer)?
Answer 1. One hidden layer will model most problems, or at most two layers can be used.
Answer 2. If an inadequate number of neurons is used, the network will be unable to model complex data, and the resulting fit will be poor. If too many neurons are used, the training time may become excessively long, and, worse, the network may overfit the data. When overfitting $ occurs, the network will begin to model random noise in the data. The result is that the model fits the training data extremely well, but it generalizes poorly to new, unseen data. Validation must be used to test for this.
$ What is overfitting?
In statistics, overfitting occurs when a statistical model describes random error or noise instead of the underlying relationship. Overfitting generally occurs when a model is excessively complex, such as having too many parameters relative to the number of observations. A model which has been overfit will generally have poor predictive performance, as it can exaggerate minor fluctuations in the data.
The concept of overfitting is important in machine learning. Usually a learning algorithm is trained using some set of training examples, i.e. exemplary situations for which the desired output is known. The learner is assumed to reach a state where it will also be able to predict the correct output for other examples, thus generalizing to situations not presented during training (based on its inductive bias). However, especially in cases where learning was performed too long or where training examples are rare, the learner may adjust to very specific random features of the training data, that have no causal relation to the target function. In this process of overfitting, the performance on the training examples still increases while the performance on unseen data becomes worse.
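A tiny numerical illustration of that idea: fit a low-degree and a high-degree polynomial to a handful of noisy samples of a smooth function and compare the error on the training points with the error on held-out points. The function, noise level, and degrees are arbitrary, and the exact numbers depend on the random seed:

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 15)
x_test  = np.linspace(0, 1, 100)
noise   = rng.normal(scale=0.1, size=x_train.shape)
y_train = np.sin(2 * np.pi * x_train) + noise        # noisy samples
y_test  = np.sin(2 * np.pi * x_test)                  # the true relationship

for degree in (3, 10):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err  = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE {train_err:.4f}, test MSE {test_err:.4f}")
# Typically the high-degree fit drives its training error toward zero while its
# error on the held-out points is larger: that gap is overfitting.
```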
Answer 3. Read Answers 1 & 2.
The Supervised Learning article on Wikipedia (http://en.wikipedia.org/wiki/Supervised_learning) will give you more insight into the factors that are really important for any supervised learning system, including neural networks. The article talks about the dimensionality of the input space, the amount of training data, noise, etc.
The number of layers/nodes depends on the classification task and what you expect of the NN. Theoretically, if you have a linearly separable function/decision (e.g. the boolean AND function), 1 layer (i.e. only the input layer, with no hidden layer) will be able to form a hyperplane and would be enough. If your function isn't linearly separable (e.g. the boolean XOR), then you need hidden layers.
With 1 hidden layer, you can form any, possibly unbounded convex region. Any bounded continuous function with a finite mapping can be represented. More on that here.
2 hidden layers, on the other hand, are capable of representing arbitrarily complex decision boundaries. The only limitation is the number of nodes. In a typical 2-hidden layer network, first layer computes the regions and the second layer computes an AND operation (one for each hypercube). Lastly, the output layer computes an OR operation.
According to Kolmogorov's Theorem, all functions can be learned by a 2-hidden layer network and you never ever need more than 2 hidden layers. However, in practice, 1-hidden-layer almost always does the work.
In summary, fix B=0 for linearly separable functions and B=1 for everything else.
As for C and the relationship between B and C, have a look at The Number of Hidden Layers. It provides general information and discusses underfitting and overfitting.
The author suggests one of the following as a rule of thumb:
C is between the size of the input layer and the size of the output layer.
C = 2/3 the size of the input layer, plus the size of the output layer.
C < twice the size of the input layer.
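To make the roles of B and C concrete, here is a small, untrained forward-pass sketch (NumPy only, random weights); it only shows where the two hyperparameters enter the architecture, not how the network is trained, and the sizes are arbitrary:

```python
import numpy as np

def build_mlp(n_inputs, B, C, n_outputs, rng=np.random.default_rng(0)):
    """Random, untrained weights for an MLP with B hidden layers of C neurons each.
    Only meant to show where B and C enter; a real network learns these weights."""
    sizes = [n_inputs] + [C] * B + [n_outputs]
    return [rng.normal(size=(a, b)) for a, b in zip(sizes[:-1], sizes[1:])]

def forward(weights, x):
    for i, W in enumerate(weights):
        x = x @ W
        if i < len(weights) - 1:           # sigmoid on hidden layers only
            x = 1.0 / (1.0 + np.exp(-x))
    return x

A, B, C, D = 4, 2, 8, 3                    # inputs, hidden layers, width, outputs (arbitrary)
net = build_mlp(A, B, C, D)
print(forward(net, np.ones(A)))            # one forward pass on a dummy input
```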
How would one design a neural network for the purpose of a recommendation engine? I assume each user would require their own network, but how would you design the inputs and the outputs for recommending an item in a database? Are there any good tutorials or something?
Edit: I was thinking more about how one would design such a network, as in how many input neurons to use and how the output neurons point to a record in a database. Would you have, say, 6 output neurons, convert them to an integer (which would be anything from 0 to 63), and use that as the ID of the record in the database? Is that how people do it?
I would suggest looking into neural networks that use unsupervised learning, such as self-organising maps. It's very difficult to use normal supervised neural networks to do what you want unless you can classify the data very precisely for learning. Self-organising maps don't have this problem because the network learns the classification groups all on its own.
Have a look at this paper, which describes a self-organising-map-based music recommendation system:
http://www.springerlink.com/content/xhcyn5rj35cvncvf/
There are many more papers written about the topic on Google Scholar:
http://www.google.com.au/search?q=%09+A+Self-Organizing+Map+Based+Knowledge+Discovery+for+Music+Recommendation+Systems+&ie=utf-8&oe=utf-8&aq=t&rls=com.ubuntu:en-US:official&client=firefox-a&safe=active
First you have to decide what exactly you are recommending and under what circumstances. There are many things to take into account. Are you going to consider the "other users who bought X also bought Y?" Are you going to only recommend items that have a similar nature to each other? Are you recommending items that have a this-one-is-more-useful-with-that-one type of relationship?
I'm sure there are many more decisions, and each one of them has their own goals in mind. It would be very difficult to train one giant network to handle all of the above.
Neural networks all boil down to the same thing. You have a given set of inputs. You have a network topology. You have an activation function. You have weights on the nodes' inputs. You have outputs, and you have a means to measure and correct error. Each type of neural network might have its own way of doing each of those things, but they are present all the time (to my limited knowledge). Then, you train the network by feeding in a series of input sets that have known output results. You run this training set as much as you'd like without over- or under-training (which is as much your guess as it is the next guy's), and then you're ready to roll.
Essentially, your input set can be described as a certain set of qualities that you believe have relevance to the underlying function at hand (for instance: precipitation, humidity, temperature, illness, age, location, cost, skill, time of day, day of week, work status, and gender may all have an important role in deciding whether or not a person will go golfing on a given day). You must therefore decide what exactly you are trying to recommend and under what conditions. Your network inputs can be boolean in nature (0.0 being false and 1.0 being true, for instance) or mapped into a pseudo-continuous space (where 0.0 may mean not at all, 0.45 means somewhat, 0.8 means likely, and 1.0 means yes). This second option may give you a way to express a confidence level for a certain input, or simply a math calculation you believe is relevant.
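To make both encodings concrete, here is a small sketch: mapping a few made-up user qualities onto a fixed input vector (booleans as 0/1, other values scaled into [0, 1]) and decoding a 6-neuron output back into a record ID in 0-63, as the question's edit speculates. The field names and scaling are purely illustrative, and in practice one output per candidate item is more common than a binary-coded ID:

```python
def encode_inputs(profile):
    """Map a user's qualities onto a fixed-length input vector.
    Booleans become 0.0/1.0; other values are scaled into [0, 1]."""
    return [
        1.0 if profile["is_weekend"] else 0.0,
        profile["age"] / 100.0,               # crude scaling, illustrative only
        profile["hours_listened"] / 24.0,
    ]

def decode_output(activations, threshold=0.5):
    """Read 6 output neurons as a 6-bit number, giving a record ID in 0..63.
    (This mirrors the scheme proposed in the question's edit; one output
    neuron per candidate item is more common in practice.)"""
    bits = [1 if a > threshold else 0 for a in activations]
    record_id = 0
    for bit in bits:
        record_id = (record_id << 1) | bit
    return record_id

print(encode_inputs({"is_weekend": True, "age": 30, "hours_listened": 3}))
print(decode_output([0.9, 0.1, 0.8, 0.2, 0.7, 0.95]))   # -> 0b101011 = 43
```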
Hope this helped. You didn't give much to go on :)