MPI Triangular Topology - c

I have some computation to do over an (upper-)triangular matrix and I was thinking of using MPI for that. It seems that it would be convenient to define a proper topology for it. The Cartesian topology goes in that direction but is not quite what I need. I would need a topology that gives me the rank of the process on the strict upper triangular matrix, i.e. strictly above the diagonal.
The type of triangular topology I need would look as follows (say for a 3x3 Cartesian grid):
i j rank
0 0 NULL
0 1 1
0 2 2
1 0 NULL
1 1 NULL
1 2 3
2 0 NULL
2 1 NULL
2 2 NULL
Thanks for your help!

MPI does not provide the type of topology that you are asking for. It provides either the Cartesian one or a general graph topology. With the general graph topology, each rank has a list of neighbouring ranks, but it's hard to map a half grid onto that.
You might want to write your own set of routines to manage this kind of topology. What you actually have to reimplement are the functions that map coordinates to rank and vice versa, as well as the functions that find the neighbours of a given rank. You can precompute the mapping and cache it in the corresponding communicator as attributes, using the MPI_COMM_GET_ATTR and MPI_COMM_SET_ATTR calls.
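The coordinate-to-rank mapping and its inverse can be sketched in a few lines. The following is only an illustration in Python, not MPI code: the function names are hypothetical, the numbering is row-major and 1-based to match the table in the question, and None stands in for NULL.

```python
def coords_to_rank(i, j, n):
    """Rank of (i, j) in the strict upper triangle of an n x n grid.

    Row-major, 1-based (assumption taken from the table in the question).
    Returns None for positions on or below the diagonal.
    """
    if not (0 <= i < j < n):
        return None
    # Row r holds n - 1 - r entries, so rows 0..i-1 hold
    # i * (n - 1) - i * (i - 1) // 2 entries in total.
    before = i * (n - 1) - i * (i - 1) // 2
    return before + (j - i - 1) + 1


def rank_to_coords(rank, n):
    """Inverse of coords_to_rank: returns (i, j) for a valid rank."""
    total = n * (n - 1) // 2
    if not (1 <= rank <= total):
        raise ValueError("rank outside the strict upper triangle")
    k = rank - 1
    for i in range(n - 1):
        row_len = n - 1 - i
        if k < row_len:
            return (i, i + 1 + k)
        k -= row_len


if __name__ == "__main__":
    for i in range(3):
        for j in range(3):
            print(i, j, coords_to_rank(i, j, 3))
```

In an MPI program the same arithmetic would be done once per process, and the result could be cached on the communicator as the answer suggests.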


How come random weight initialization is better than just using 0 as weights in ANN?

In a trained neural net the weight distribution will fall close around zero. So it makes sense to me to initialize all weights to zero. However, there are methods such as random assignment from -1 to 1 and Nguyen-Widrow that outperform zero initialization. How come these random methods are better than just using zero?
Activation & learning:
In addition to what cr0ss said: in a normal MLP (for example), the activation of layer n+1 is the dot product of the output of layer n and the weights between layers n and n+1. So for the activation a of neuron i in layer n you get this equation:
$a_i = \sum_j w_{ij} \, o_j + b_i$
where w_ij is the weight of the connection from neuron j (parent layer n-1) to the current neuron i (current layer n), o_j is the output of neuron j (parent layer), and b_i is the bias of the current neuron i in the current layer.
It is easy to see that initializing the weights with zero would practically "deactivate" them: every weight multiplied by the output of the parent layer equals zero, so in the first learning steps your input data would not be recognized and would be neglected entirely.
So in the first epochs the learning would only have the data supplied by the bias.
This would obviously make learning more challenging for the network and greatly increase the number of epochs needed.
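A tiny sketch of that effect, assuming a single neuron with the weighted-sum-plus-bias activation described above (the function name and the sample numbers are made up for illustration):

```python
def activation(weights, outputs, bias):
    """Weighted sum of the parent layer's outputs plus the neuron's bias."""
    return sum(w * o for w, o in zip(weights, outputs)) + bias


zero_weights = [0.0, 0.0, 0.0]
bias = 0.1

# Two very different inputs give the same activation: only the bias survives.
print(activation(zero_weights, [0.2, 0.9, 0.5], bias))
print(activation(zero_weights, [5.0, -3.0, 7.0], bias))
```

With any non-zero weights the two calls would generally differ, which is what lets the input influence the first learning steps.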
Initialization should be optimized for your problem:
Initializing your weights with a distribution of random floats with -1 <= w <= 1 is the most typical initialization, because overall (if you do not analyze the problem / domain you are working on) this makes it likely that some weights are relatively good right from the start. Besides, neurons co-adapt to each other more quickly under a fixed initialization, whereas random initialization ensures better learning.
However -1 <= w <= 1 for initialization is not optimal for every problem. For example: biological neural networks do not have negative outputs, so weights should be positive when you try to imitate biological networks. Furthermore, e.g. in image processing, most neurons have either a fairly high output or send nearly nothing. Considering this, it is often a good idea to initialize weights between something like 0.2 <= w <= 1, sometimes even 0.5 <= w <= 2 showed good results (e.g. in dark images).
So the number of epochs needed to learn a problem properly depends not only on the layers, their connectivity, the transfer functions, the learning rules and so on, but also on the initialization of your weights.
You should try several configurations. In most situations you can figure out what solutions are adequate (like higher, positive weights for processing dark images).
Reading the Nguyen article, I'd say it is because when you assign the weights from -1 to 1, you are already defining a "direction" for each weight, and the network will learn whether that direction is correct, what its magnitude should be, or whether to go the other way.
If you assign all the weights to zero (in a MLP neural network), you don't know which direction it might go to. Zero is a neutral number.
Therefore, if you assign a small value to the node's weight, the network will learn faster.
Read the "Picking initial weights to speed training" section of the article. It states:
First, the elements of Wi are assigned values from a uniform random distribution between -1 and 1 so that its direction is random. Next, we adjust the magnitude of the weight vectors Wi, so that each hidden node is linear over only a small interval.
Hope it helps.
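The two steps in that quote (random direction, then rescaled magnitude) could be sketched as follows. The quote does not give the scale factor; the value 0.7 * H^(1/n) for H hidden nodes and n inputs is the commonly cited form and is an assumption here, as are the function name and the bias range.

```python
import math
import random


def nguyen_widrow(n_inputs, n_hidden, seed=0):
    """Sketch of a Nguyen-Widrow style initialization for one hidden layer."""
    rng = random.Random(seed)
    # Assumed scale factor (commonly cited form, not stated in the quote).
    beta = 0.7 * n_hidden ** (1.0 / n_inputs)
    weights = []
    for _ in range(n_hidden):
        # Step 1: uniform random values in [-1, 1] give a random direction.
        w = [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
        # Step 2: rescale the vector so its magnitude equals beta.
        norm = math.sqrt(sum(x * x for x in w))
        weights.append([beta * x / norm for x in w])
    # Assumption: biases drawn uniformly from [-beta, beta].
    biases = [rng.uniform(-beta, beta) for _ in range(n_hidden)]
    return weights, biases


if __name__ == "__main__":
    weights, biases = nguyen_widrow(n_inputs=3, n_hidden=4)
    for row in weights:
        print(row)
```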

How to normalize multiple arrays of different sizes in MATLAB

I use a set of images for image processing, in which each image generates a unique code (Freeman chain code). The size of the array varies from image to image, but the values always range from 0 to 7. For example, the first image creates an array of 3124 elements and the second image creates an array of 1800 elements.
Now, for further processing, I need those arrays to have a fixed size. So, is there any way to normalize them?
There is a reason why you are getting different sized arrays when applying a chain code algorithm to different images. This is because the contours that represent each shape are completely different. For example, the letter C and D will most likely contain chain codes that are of a different length because you are describing a shape as a chain of values from a starting position. The values ranging from 0-7 simply tell you which direction you need to look next given the current position of where you're looking in the shape. Usually, chain codes have the following convention:
3 2 1
4 x 0
5 6 7
0 means to move to the east, 1 means to move north east, 2 means to move north and so on. Therefore, if we had the following contour:
o o x
o
o o o
With the starting position at x, the chain code would be:
4 4 6 6 0 0
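As a sketch of how such a code is produced, the snippet below walks an ordered list of contour pixels and looks up the direction of each step. The (x, y) coordinate convention with y growing downward and the names DIRECTIONS and chain_code are assumptions made for this example.

```python
# Freeman directions keyed by (dx, dy), with y growing downward as in image
# coordinates: 0 = east, 1 = north east, 2 = north, ... 7 = south east.
DIRECTIONS = {
    (1, 0): 0, (1, -1): 1, (0, -1): 2, (-1, -1): 3,
    (-1, 0): 4, (-1, 1): 5, (0, 1): 6, (1, 1): 7,
}


def chain_code(points):
    """Chain code of an ordered list of 8-connected (x, y) contour pixels."""
    code = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        code.append(DIRECTIONS[(x1 - x0, y1 - y0)])
    return code


# The contour from the example, starting at x in the top-right corner.
contour = [(2, 0), (1, 0), (0, 0), (0, 1), (0, 2), (1, 2), (2, 2)]
print(chain_code(contour))  # [4, 4, 6, 6, 0, 0]
```

The code has one entry per step along the contour, so its length follows directly from the length of the perimeter.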
Chain codes encode how we should trace the perimeter of an object given a starting position. Now, what you are asking is whether or not we can take two different contours with different shapes and represent them using the same number of values that represent their chain code. You can't because of the varying length of the chain code.
tl;dr
In general, you can't. The different sized arrays mean that the contours represented by those chain codes are of different lengths. What you are actually asking is whether you can represent two different and unrelated contours / chain codes with the same number of elements... and the short answer is no.
What you need to think about is why you want to do this. Are you trying to compare the shapes of different contours? If you are, then chain codes are not the best way to do that, because they are very sensitive to changes in the contour. Adding the slightest bit of noise would result in an entirely different chain code.
Instead, you should investigate shape similarity measures. An authoritative paper by Remco Veltkamp talks about different shape similarity measures for the purposes of shape retrieval. See here: http://www.staff.science.uu.nl/~kreve101/asci/smi2001.pdf . Measures such as the Hausdorff distance, the Minkowski distance, or even simple moments are among the most popular.
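To illustrate why such measures do not need equal-length inputs, here is a minimal sketch of the Hausdorff distance between two contours given as lists of (x, y) points. It is a brute-force version written for clarity (the function names are hypothetical) and assumes Python 3.8+ for math.dist.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# The two contours may contain different numbers of points.
contour_a = [(0, 0), (1, 0)]
contour_b = [(0, 0), (3, 0), (3, 1)]
print(hausdorff(contour_a, contour_b))
```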

pattern recognition - "is this a pattern?"

I have a large vector of numbers, say 500 numbers. I would like a program to detect patterns (recurrences in this case) in such a vector based on the following rules:
A sequence of numbers is a pattern if:
1. The size of the sequence is between 3 and 20 numbers.
2. The RELATIVE positions of the numbers in the sequence are repeated at least one other time in the vector. So let's say if I have a sequence (1,4,3) and then (3,6,5) somewhere else in the vector, then (1,4,3) is a pattern (as well as (2,5,4), (3,6,5) etc.).
3. The sequences can't intersect. So, a vector (1,2,3,4,5) does not contain patterns (1,2,3) and (3,4,5) (we can't use the same number for both sequences). However, (1,2,3,3,4,5) does contain a pattern (1,2,3) (or (3,4,5)).
4. A subset A of a pattern B is a pattern ONLY IF A appears somewhere else outside B. So, a vector (1,2,3,4,7,8,9,2,3,4,5) would contain patterns (1,2,3,4) and (1,2,3), because (1,2,3,4) is repeated (in the form of (2,3,4,5)) and (1,2,3) is repeated (in the form of (7,8,9)). However, if the vector was (1,2,3,4,2,3,4,5) the only pattern would be (1,2,3,4), because (1,2,3) appears only in the context of (1,2,3,4).
I'd like to know several things:
First of all I hope the rules don't go against each other. I made them myself so there might be a clash somewhere that I didn't notice, please let me know if you do notice it.
Secondly, how would one implement such a system in the most efficient way? Maybe someone can point me towards some particular literature on the subject? I could go number by number, searching for repetitions of all subsequences of length 3, then 4, 5 and so on up to 20. But that does not seem very efficient.
I am interested in an implementation of such a system in C, but any general guidance is very welcome.
Thank you in advance!
Just a couple of observations:
If you're interested in relative values, then your first step should be to calculate the differences between adjacent elements of the vector, e.g.:
Original numbers (repeated patterns in brackets):
[1 4 3] 2 5 1 1 [3 6 5] 6 [2 5 4] 4 4 [1 4 3] 2
Difference values (matching runs in brackets):
[3 -1] -1 3 -4 0 2 [3 -1] 1 -4 [3 -1] 0 0 -3 [3 -1] -1
Once you've done that, you could use an autocorrelation method to look for repeated patterns in the data. This can be computed in O(n log n) time, and possibly even faster if you're only concerned with exact matches.
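For the exact-match case, one simple alternative to autocorrelation is to index every window of differences in a dictionary. The sketch below does that for a single pattern length; it is only an illustration (function names are made up) and it does not enforce the question's non-overlap and subset rules, which would need an extra filtering pass.

```python
from collections import defaultdict


def differences(values):
    """Differences between adjacent elements."""
    return [b - a for a, b in zip(values, values[1:])]


def repeated_windows(values, length):
    """Map each difference window to the start indices where it occurs.

    A sequence of `length` numbers corresponds to `length - 1` differences.
    Only windows that occur at least twice are returned.
    """
    diffs = differences(values)
    width = length - 1
    seen = defaultdict(list)
    for start in range(len(diffs) - width + 1):
        seen[tuple(diffs[start:start + width])].append(start)
    return {window: starts for window, starts in seen.items() if len(starts) > 1}


values = [1, 4, 3, 2, 5, 1, 1, 3, 6, 5, 6, 2, 5, 4, 4, 4, 1, 4, 3, 2]
print(repeated_windows(values, 3)[(3, -1)])  # [0, 7, 11, 16]
```

Running this for each length from 3 to 20 costs one pass over the vector per length, which is cheap for 500 numbers.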

how to solve this with k-means clustering using cosine similarity

Can anyone tell me how k-means clustering works for text mining?
I am using cosine similarity as the distance metric.
nim 310910022 320910044 310910043 310910021
access 0 2 3 5
abdi 1 0 0 0
actual 5 0 0 1
arrow 0 1 1 2
This data is in a ListView.
How can I do that in VB.net, to get the clusters and the trending topic of the terms?
Thanks in advance
First I would separate the problem into two parts:
1. Computing the k-means clustering assignments
2. Getting the data from the GUI (you mention the data is in a listview)
I assume 2 is straightforward and you just need help with 1.
I would start by re-writing the code to just read a TSV text file of the data as you specified. This will make things a lot easier to debug.
Then ask yourself whether you want to implement the k-means algorithm yourself or use a library.
If you want to implement it, here is a link to the pseudocode:
http://www.scribd.com/doc/89373376/K-Means-Pseudocode
You can also search for other k-means pseudocode.
If you want to use a library to just "run" your data against a kmeans algorithm, here is one example in python/scipy.
http://glowingpython.blogspot.com/2012/04/k-means-clustering-with-scipy.html
Regardless of which approach you use, be aware that k-means with randomly chosen initial centroids is non-deterministic, so you may get different answers every time you run it. I would recommend running it against a known validation set to see if the results are roughly what you think they should be.
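If you go the do-it-yourself route, a minimal sketch of k-means with cosine similarity might look like the following. It is written in Python rather than VB.net, uses the four documents from the question as vectors over the terms (access, abdi, actual, arrow), and all function names, the fixed iteration count and the seed parameter are assumptions made for the example.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit length (all-zero vectors are left as they are)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def cosine(a, b):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


def kmeans_cosine(vectors, k, iterations=20, seed=0):
    """Assign each vector to the centroid with the highest cosine similarity."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    unit = [normalize(v) for v in vectors]
    centroids = [unit[i] for i in rng.sample(range(len(unit)), k)]
    labels = [0] * len(unit)
    for _ in range(iterations):
        labels = [max(range(k), key=lambda c: cosine(u, centroids[c])) for u in unit]
        for c in range(k):
            members = [u for u, lab in zip(unit, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[c] = normalize(mean)
    return labels


# One vector per document: counts of (access, abdi, actual, arrow).
docs = {
    "310910022": [0, 1, 5, 0],
    "320910044": [2, 0, 0, 1],
    "310910043": [3, 0, 0, 1],
    "310910021": [5, 0, 1, 2],
}
labels = kmeans_cosine(list(docs.values()), k=2)
print(dict(zip(docs, labels)))
```

Changing the seed changes the initial centroids, which is where the non-determinism mentioned above comes from.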

FFT and convolution

I'm writing a 2D FFT for school, to be used for image filtering.
And I have a problem with the filter matrix.
I wrote my FFT so that it accepts inputs of size 2^n, but all filter matrices have odd dimensions.
So I need a way to transform the filter matrix into an acceptable input for my function.
I have the following idea, and I'm not sure whether it will work.
If I have filter matrix:
1 2 3
4 5 6
7 8 9
To transform it to:
0 0 0 0
1 2 3 0
4 5 6 0
7 8 9 0
And when I'm matching the "center" of the matrix with my pixel, I would match the center of the "submatrix" and after that extract the values I need.
Is that possible?
Also, can someone tell me what the maximum size of a filter is? Is it larger than, let's say, 32x32?
Filter masks are used to express filters with compact support. Compact support means that the signal has non-zero values only in a limited range. By extending your filter mask with zero values, you are in fact doing a natural thing. The zeros are part of the original filter.
The real problem however is a different thing. I assume that you use FFT according to the convolution theorem. For that, you need element-wise multiplication. You can only do element-wise multiplication when both your filter and your signal have the same number of elements. So you would need to extend your filter to the signal size (using zeros).
There is no limit on filter mask size. For convolution the only restriction is compact support (as explained above).
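The zero-extension and element-wise multiplication described above can be checked on a small 1D example. The sketch below uses a naive O(n^2) DFT instead of a real FFT to stay short; the function names are made up for illustration, and the 2D case applies the same idea along rows and columns.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (stand-in for a real FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[t] * cmath.exp(sign * 2j * cmath.pi * f * t / n) for t in range(n))
        for f in range(n)
    ]
    return [v / n for v in out] if inverse else out


def pad_to(filt, size):
    """Extend the filter with zeros up to the signal size."""
    return list(filt) + [0] * (size - len(filt))


def fft_convolve(signal, filt):
    """Circular convolution via the convolution theorem."""
    padded = pad_to(filt, len(signal))
    product = [a * b for a, b in zip(dft(signal), dft(padded))]
    return [v.real for v in dft(product, inverse=True)]


def circular_convolve(signal, filt):
    """Direct circular convolution, used here only as a reference."""
    n = len(signal)
    return [
        sum(filt[k] * signal[(i - k) % n] for k in range(len(filt)))
        for i in range(n)
    ]


signal = [1, 0, 2, 0, 3, 0, 4, 0]
filt = [1, 2, 3]
print(fft_convolve(signal, filt))
print(circular_convolve(signal, filt))
```

Note that this gives a circular convolution: values wrap around the ends of the signal. To get a linear convolution, the signal itself also has to be zero-padded by at least the filter length minus one before transforming.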
