How do I fill a histogram in Matlab when there are extremely many copies of the vector to be histogrammed? - arrays

I was trying to collect statistics of a 6D vector and plot a 1D histogram for each coordinate. I get 729000000 different copies of this vector (each 6 dimensional). For this I create an array of zeros of size 729000000x6 before I get any of the actual W's and this seems to be a problem in matlab since it says:
Error using zeros
Requested 729000000x6 (32.6GB) array exceeds maximum array size preference. Creation of arrays
greater than this limit may take a long time and cause MATLAB to become unresponsive. See array
size limit or preference panel for more information.
The reason I did this at first was because it was easy to fill W_history and then just feed it to the histogram plotter:
histogram(W_history(:,d),nbins,'Normalization','probability')
However, filling W_history seems impossible for a high number of copies of W. Is there a way to do this in Matlab automatically? It feels like there should be, and I didn't want to reinvent the wheel.
I am sure I could create, for each coordinate, an array of counters that records how many times that coordinate of W falls in each bin. However, implementing that, including the checks for which bin each value falls in, seemed inefficient or even unnecessary. Is this really the only solution, or what do Matlab experts recommend? Is this reinventing the wheel? It also seems inefficient if I implement it myself.
Also, I thought I could manually have Matlab move things between memory and disk (store W_history on disk as it fills, and eventually somehow feed it to the histogram plotter), but that seemed like overkill. I hope I can avoid a solution like this one. It feels wrong, since Matlab is supposed to be "easy" and high level, and managing disk and memory by hand doesn't seem to me to be what Matlab is intended for.
Currently, following a comment I was given, the best solution I have so far is to use histcounts as follows:
for i=2:iter+1
    % draw the next copy of W
    W = get_new_W(W);
    % bin the new W and add its counts to the running totals
    [W_hist_counts_current, edges2] = histcounts(W,edges);
    W_hist_counts = W_hist_counts + W_hist_counts_current;
end
However, after this it seems difficult to convert W_hist_counts to a pdf, a probability, or any other normalization, since it seems the counts have to be processed manually. Is there no official way to do this processing without the user having to implement the normalizations again?
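For what it's worth, the manual step is small. As a sketch, assuming c_k is the accumulated count in bin k (W_hist_counts), e_k are the fixed bin edges (edges), and N is the total number of binned values, the 'probability' and 'pdf' normalizations amount to:

```latex
N = \sum_{j} c_j, \qquad
p_k = \frac{c_k}{N}, \qquad
f_k = \frac{c_k}{N \, w_k}, \qquad
w_k = e_{k+1} - e_k
```

Here p_k sums to 1 over the bins, while f_k integrates to 1 over the binned range. The symbols c_k, e_k, w_k, p_k, and f_k are notation introduced for this note, not names from the question.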

Related

Data Structure for a recursive function with two parameters one of which is Large the other small

Mathematician here looking for a bit of help. (If you ever need math help I'll try to reciprocate on math.stackexchange!) Sorry if this is a dup. Couldn't find it myself.
Here's the thing. I write a lot of code (mostly in C) that is extremely slow and I know it could be sped up considerably but I'm not sure what data structure to use. I went to school 20 years ago and unfortunately never got to take a computer science course. I have watched a lot of open-course videos on data structures but I'm still a bit fuddled never taking an actual class.
Mostly my functions just take integers to integers. I almost always use 64-bit numbers and I have three use cases that I'm interested in. I use the word small to mean no more than a million or two in quantity.
Case 1: Small numbers as input. Outputs are arbitrary.
Case 2: Any 64-bit values as input, but only a small number of them. Outputs are arbitrary.
Case 3: Two parameter functions with one parameter that's small in value (say less than two million), and the other parameter is Large but with only a small number of possible inputs. Outputs are arbitrary.
For Case 1, I just make an array to cache the values. Easy and fast.
For Case 2, I think I should be using a hash. I haven't yet done this but I think I could figure it out if I took the time.
Case 3 is the one I'd like help with and I'm not even sure what I need.
For a specific example, take a function F(n,p) that takes large inputs n for the first parameter and a prime p for the second. The prime is at most the square root of n, so even if n is about 10^12, the primes are only up to about a million. Suppose this function is recursive or otherwise difficult to calculate (expensive) and will be called over and over with the same inputs. What might be a good data structure to use to easily create and retrieve the possible values of F(n,p) so that I don't have to recalculate it every time? The total number of possible inputs should be 10 or 20 million at most.
Help please! and Thank you in advance!
You are talking about memoizing, I presume. Trying to answer without a concrete example...
If you have to retrieve values from a small range (the 2nd parameter), say from 0 to 10^6, and that needs to be super fast, and you have enough memory, you could simply declare an array of int (or long...) that stores the output value for every input.
To keep things simple, let's say the value 0 means no value has been set:
long *small = calloc(MAX, sizeof(*small)); // calloc initializes to 0
Then, in the function that computes the value for a small-range input:
if (small[ input ]) return small[ input ];
....calculate
small[ input ] = value;
+/-
+ Very fast
- Memory consumption takes the whole range, [ 0, MAX-1 ].
If you need to store arbitrary input, use one of the many libraries available (there are so many). Use a Set structure that tells you whether an item exists or not (strictly speaking a map or dictionary, since it also stores a value for each key).
if (set.exists( input )) return set.get( input );
....calculate
set.set( input, value );
+/-
+ less memory usage
+ still fast (said to be O(1))
- but, not as fast as a mere array
Add to this the hashed set (...), which is faster because, probabilistically, the hashed values are better distributed.
+/-
+ less memory usage than array
+ faster than a simple Set
- but, not as fast as a mere array
- use more memory than a simple Set
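For the two-parameter case from the question, one option is a single hash table keyed on the pair of inputs. The following is only a minimal sketch under stated assumptions: `phi`, `primes`, `Slot`, `CAP`, and `find_slot` are illustrative names, and `phi(n, a)` (the count of integers in [1, n] divisible by none of the first `a` primes) merely stands in for an expensive recursive F(n, p).

```c
#include <assert.h>
#include <stdint.h>

#define CAP (1u << 16)              /* table capacity, a power of two (assumed large enough) */

typedef struct { uint64_t n; uint32_t a; int used; int64_t value; } Slot;
static Slot table[CAP];             /* static storage: every slot starts unused */
static uint32_t used_slots;

static const uint32_t primes[] = {2, 3, 5, 7, 11, 13};
#define NPRIMES (sizeof primes / sizeof primes[0])

/* Linear probing: return the slot holding (n, a), or the empty slot where it belongs. */
static Slot *find_slot(uint64_t n, uint32_t a) {
    uint64_t h = (n * 0x9E3779B97F4A7C15ULL) ^ ((uint64_t)a * 0xC2B2AE3D27D4EB4FULL);
    uint32_t i = (uint32_t)(h >> 32) & (CAP - 1);
    while (table[i].used && (table[i].n != n || table[i].a != a))
        i = (i + 1) & (CAP - 1);
    return &table[i];
}

/* Stand-in for an expensive F(n, p): memoized on the pair (n, a). */
static int64_t phi(uint64_t n, uint32_t a) {
    assert(a <= NPRIMES);
    if (a == 0) return (int64_t)n;
    Slot *s = find_slot(n, a);
    if (s->used) return s->value;               /* cache hit */
    int64_t v = phi(n, a - 1) - phi(n / primes[a - 1], a - 1);
    assert(used_slots < CAP / 2);               /* keep probe sequences short */
    s = find_slot(n, a);                        /* probe again: the recursion may have filled the slot found earlier */
    s->n = n; s->a = a; s->value = v; s->used = 1;
    used_slots++;
    return v;
}
```

Calling `phi(100, 4)` twice computes the value once and serves the second call from the table. For 10 to 20 million entries the capacity would need to be raised accordingly, and the table would more likely be allocated with calloc than declared static.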

Evenly distribute scent in a collaborative diffusion matrix

I am trying to implement a collaborative diffusion behaviour for the first time and I am stuck with a problem. I understand how to stop obstacles from diffusing scents and how to dampen the scent for other friendly agents if one of them already pursues it. What I cannot understand is how to make scents distribute evenly in the matrix.
It seems to me that any way of iterating over the matrix makes the scent distribute faster and better in the tiles I check later in the iteration. I mean, if I iterate from i to maxRows and j to maxCols and apply the diffusion equation in every tile, then on the 'north' and 'west' side of the goal I will have only one tile with the correct potential, whereas on the 'east' and 'south' side I will have more of them, since their neighbours already have an assigned potential.
How can I make the values distribute evenly? A double iteration from both extremities of the matrix and then combining the results seems like a memory-eater, as does a goal-oriented approach: if I start from the goals and work around them, I will have to execute the calculations for every goal and every tile with an assigned potential, which means that I will have to do it for 4^(turn since starter diffusion)*nrOfGoals more every turn, which seems inefficient in a large matrix with a lot of goals.
My question is how can I evenly distribute the values in the matrix in an efficient way. I'm using the AiChallenge Ants, if that helps in any way!
I thank you in anticipation and I'm sorry for the grammar mistakes I've made in this post.
There may be a better solution, but the easiest way to do it is to use something similar to how a simple implementation of the game of life is done.
You have two buffers. One has the current "generation" of scent (and if you are doing multitasking, it can be locked so only readers can look at it)... and another has the next generation of scent being calculated. You only "mix" scents from the current generation.
Once you are done, you swap the two buffers by simply changing the pointers / references.
Another way to think about it would be to have all the tiles calculate their new scent by asking their neighbors and averaging. When asked by their neighbors what their scent level is, they report their pre-calculated values from the previous pass. The new scent is only locked in once everyone has finished calculating.
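The two-buffer scheme can be sketched in C as follows. This is a minimal illustration, not the Ants starter-kit API: the grid size, the plain four-neighbour average, the single goal tile, and the names `diffuse_step` and `run_diffusion` are all assumptions made for the example.

```c
#include <assert.h>

#define ROWS 5
#define COLS 5

/* One generation: every tile of `next` is computed only from `cur`,
   so the iteration order cannot bias the result.
   Neighbours outside the grid contribute 0. */
static void diffuse_step(const double *cur, double *next) {
    for (int r = 0; r < ROWS; r++) {
        for (int c = 0; c < COLS; c++) {
            double up    = r > 0        ? cur[(r - 1) * COLS + c] : 0.0;
            double down  = r < ROWS - 1 ? cur[(r + 1) * COLS + c] : 0.0;
            double left  = c > 0        ? cur[r * COLS + c - 1]   : 0.0;
            double right = c < COLS - 1 ? cur[r * COLS + c + 1]   : 0.0;
            next[r * COLS + c] = 0.25 * (up + down + left + right);
        }
    }
}

/* Run `steps` generations and return the buffer holding the latest one.
   `goal` is the flat index of a tile that keeps emitting `goal_scent`. */
static double *run_diffusion(double *cur, double *next,
                             int goal, double goal_scent, int steps) {
    assert(steps >= 0 && goal >= 0 && goal < ROWS * COLS);
    for (int s = 0; s < steps; s++) {
        cur[goal] = goal_scent;                      /* the goal is a fixed source */
        diffuse_step(cur, next);
        double *tmp = cur; cur = next; next = tmp;   /* swap roles, no copying */
    }
    cur[goal] = goal_scent;
    return cur;
}
```

With a goal in the centre of the grid, one step gives all four neighbours exactly the same scent, regardless of the order in which the tiles are visited.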

Implementing a basic predator-prey simulation

I am trying to implement a predator-prey simulation, but I am running into a problem.
A predator searches for nearby prey and eats it. If there is no nearby prey, it moves to a random vacant cell.
Basically, the part I am having trouble with is when I advance a "generation."
Say I have a grid that is 3x3, with each cell numbered from 0 to 8.
If I have 2 predators in 0 and 1, first predator 0 is checked, and it moves to either cell 3 or 4.
For example, if it goes to cell 3, then it goes on to check predator 1. This may seem correct, but it kind of "gives priority" to the organisms with lower index values. I've tried using 2 arrays, but that doesn't seem to work either, as it would check places where organisms are but aren't. ._.
Anyone have an idea of how to do this "fairly" and "correctly?"
I recently did a similar task in Java. Processing the predators from the top row to the bottom not only gives an "unfair advantage" to lower indices but also creates patterns in the movement of both prey and predators.
I overcame this problem by choosing both row and columns in random ordered fashion. This way, every predator/prey has the same chance of being processed at early stages of a generation.
A way to randomize would be creating a linked list of (row,column) pairs. Then shuffle the linked list. At each generation, choose a random index to start from and keep processing.
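As a rough sketch of the shuffling step in C: this version uses a plain array of cell indices with a Fisher-Yates shuffle rather than a linked list of (row,column) pairs, and the function name `shuffled_order` is made up for the example. A cell index k maps back to row k / cols and column k % cols.

```c
#include <assert.h>
#include <stdlib.h>

/* Fill order[0..n-1] with the cell indices 0..n-1 in random order
   (Fisher-Yates shuffle). Seed the generator with srand() beforehand. */
static void shuffled_order(int *order, int n) {
    assert(order != NULL && n >= 0);
    for (int i = 0; i < n; i++)
        order[i] = i;
    for (int i = n - 1; i > 0; i--) {
        int j = rand() % (i + 1);   /* slight modulo bias, usually acceptable for a simulation */
        int tmp = order[i];
        order[i] = order[j];
        order[j] = tmp;
    }
}
```

Each generation would then visit the cells in the order given by `order`, either reshuffling every time or, as suggested above, starting from a random offset into the same shuffled sequence.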
More as a comment than anything else: if your prey are so dense that this is a common problem, I suspect you don't have a "population" that will live long. Also as a comment: update your predators randomly. That is, instead of stepping through your array of locations, take your list of predators, randomize it, and then update them one by one. I think this is necessary, but I don't know if it is sufficient.
This problem is solved with a technique called double buffering, which is also used in computer graphics (in order to prevent the image currently being drawn from disturbing the image currently being displayed on the screen). Use two arrays. The first one holds the current state, and you make all decisions about movement based on the first array, but you perform the movement in the other array. Then, you swap their roles.
Edit: Looks like I didn't read your question thoroughly enough. Double buffering and randomization might both be needed, depending on how complex your rules are (but if there are no rules other than the ones you've described, randomization should suffice). They solve two distinct problems, though:
Double buffering solves the problem of correctness when you have rules where decisions about what will happen to a creature in a cell depends on the contents of neighbouring cells, and the decisions about neighbouring cells also depend on this cell. If you e.g. have a rule that says that if two predators are adjacent, they will both move away from each other, you need double buffering. Otherwise, after you've moved the first predator, the second one won't see any adjacent predator and will remain in place.
Randomization solves the problem of fairness when there are limited resources, such as when a prey only can be eaten by one predator (which seems to be the problem that concerned you).
How about some sort of round robin method. Put your predators in a circular linked list and keep a pointer to the node that's currently "first". Then, advance that first pointer to the next place in the list each generation. You could insert new predators either at the front or the back of your circular list with ease.

Most simple and fast method for audio activity detection?

Given is an array of 320 elements (int16), which represent an audio signal (16-bit LPCM) of 20 ms duration. I am looking for a most simple and very fast method which should decide whether this array contains active audio (like speech or music), but not noise or silence. I don't need a very high quality of the decision, but it must be very fast.
It occurred to me first to add all squares or absolute values of the elements and compare their sum with a threshold, but such a method is very slow on my system, even if it is O(n).
You're not going to get much faster than a sum-of-squares approach.
One optimization that you may not be doing so far is to use a running total. That is, in each time step, instead of summing the squares of the last n samples, keep a running total and update that with the square of the most recent sample. To avoid your running total from growing and growing over time, add an exponential decay. In pseudocode:
decay_constant=0.999; // Some suitable value smaller than 1
total=0;
for t=1,...
    // Exponential decay
    total=total*decay_constant;
    // Add in the square of the latest sample
    total+=current_sample*current_sample;
    if total>threshold
        // do something
    end
end
Of course, you'll have to tune the decay constant and threshold to suit your application. If this isn't fast enough to run in real time, you have a seriously underpowered DSP...
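If floating-point multiplication is the bottleneck, the same decaying sum of squares can be approximated with integer arithmetic. The sketch below is one possible variant, not the pseudocode above verbatim: it swaps the decay constant for a right shift, so the decay factor is 1 - 2^-shift (shift = 10 gives roughly 0.999), and the names `EnergyTracker` and `energy_update` are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t total;   /* decaying sum of squared samples */
    unsigned shift;   /* decay factor is 1 - 2^-shift */
} EnergyTracker;

/* Feed one 16-bit sample and return the updated running energy. */
static uint64_t energy_update(EnergyTracker *t, int16_t sample) {
    assert(t->shift > 0 && t->shift < 32);
    int32_t s = sample;
    t->total -= t->total >> t->shift;   /* exponential decay without a multiply */
    t->total += (uint64_t)(s * s);      /* |s| <= 32768, so s * s fits in 32 bits */
    return t->total;
}
```

The caller compares the returned value against a threshold tuned for the application, exactly as with the floating-point version.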
You might try calculating two simple "statistics" - first would be spread (max-min). Silence will have very low spread. Second would be variety - divide the range of possible values into say 16 brackets (= value range) and as you go through the elements, determine in which bracket that element goes. Noise will have similar numbers for all brackets, whereas music or speech should prefer some of them while neglecting others.
This should be possible to do in just one pass through the array, and you do not need complicated arithmetic, just some addition and comparison of values.
Also consider some approximation, for example take only each fourth value, thus reducing the number of checked elements to 80. For audio signal, this should be okay.
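A single-pass version of these two statistics might look like the following in C. It is a sketch under assumptions: the 16 brackets are taken as equal slices of the full int16 range (4096 values each), and `FrameStats` and `frame_stats` are illustrative names. How to turn the bracket counts into a speech/noise decision is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

#define BRACKETS 16

typedef struct {
    int spread;              /* max - min over the inspected samples */
    int counts[BRACKETS];    /* number of inspected samples per value bracket */
} FrameStats;

/* Inspect every `stride`-th sample of x[0..n-1] in one pass. */
static FrameStats frame_stats(const int16_t *x, int n, int stride) {
    assert(x != 0 && n > 0 && stride > 0);
    FrameStats s = {0};
    int lo = x[0], hi = x[0];
    for (int i = 0; i < n; i += stride) {
        int v = x[i];
        if (v < lo) lo = v;
        if (v > hi) hi = v;
        s.counts[(v + 32768) >> 12]++;   /* 65536 values / 16 brackets = 4096 per bracket */
    }
    s.spread = hi - lo;
    return s;
}
```

For a 320-sample frame, `frame_stats(frame, 320, 4)` looks at 80 samples; a silent frame yields a spread near zero with all counts in the middle brackets.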
I did something like this a while back. After some experimentation I arrived at a solution that worked sufficiently well in my case.
I used the rate of change in the cube of the running average over about 120ms. When there is silence (only noise that is) the expression should be hovering around zero. As soon as the rate starts increasing over a couple of runs, you probably have some action going on.
rate = cur_avg^3 - prev_avg^3
I used a cube because the square just wasn't aggressive enough. If the cube is too slow for you, try using the square and a bitshift instead. Hope this helps.
Clearly the complexity has to be at least O(n). Some simple algorithm that calculates a value range is probably good enough for the moment, but I would search the web for Voice Activity Detection and related code samples.

Fast way to implement 2D convolution in C

I am trying to implement a vision algorithm, which includes a prefiltering stage with a 9x9 Laplacian-of-Gaussian filter. Can you point to a document which explains fast filter implementations briefly? I think I should make use of FFT for most efficient filtering.
Are you sure you want to use FFT? That will be a whole-array transform, which will be expensive. If you've already decided on a 9x9 convolution filter, you don't need any FFT.
Generally, the cheapest way to do convolution in C is to set up a loop that moves a pointer over the array, summing the convolved values at each point and writing the data to a new array. This loop can then be parallelised using your favourite method (compiler vectorisation, MPI libraries, OpenMP, etc).
Regarding the boundaries:
If you assume the values to be 0 outside the boundaries, then add a 4 element border of 0 to your 2d array of points. This will avoid the need for `if` statements to handle the boundaries, which are expensive.
If your data wraps at the boundaries (ie it is periodic), then use a modulo or add a 4 element border which copies the opposite side of the grid (abcdefg -> fgabcdefgab for 2 points). **Note: this is what you are implicitly assuming with any kind of Fourier transform, including FFT**. If that is not the case, you would need to account for it before any FFT is done.
The 4 points are because the maximum boundary overlap of a 9x9 kernel is 4 points outside the main grid. Thus, n points of border needed for a 2n+1 x 2n+1 kernel.
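The zero-border approach can be sketched in C as below. This is a minimal, unoptimised illustration with assumed names (`convolve_zero_padded` and its parameters): the image is copied into a buffer with an n-element border of zeros, so the inner loops need no boundary checks. For the 9x9 kernel, n is 4.

```c
#include <assert.h>
#include <stdlib.h>

/* Convolve an h x w image with a (2n+1) x (2n+1) kernel, treating values
   outside the image as 0. All arrays are row-major; `out` is h x w.
   Returns 0 on success, -1 if the padded copy cannot be allocated. */
static int convolve_zero_padded(const float *in, int h, int w,
                                const float *kernel, int n, float *out) {
    assert(h > 0 && w > 0 && n >= 0);
    int k = 2 * n + 1;
    int pw = w + 2 * n, ph = h + 2 * n;
    float *pad = calloc((size_t)pw * (size_t)ph, sizeof *pad);  /* border starts as zeros */
    if (pad == NULL) return -1;
    for (int r = 0; r < h; r++)
        for (int c = 0; c < w; c++)
            pad[(r + n) * pw + (c + n)] = in[r * w + c];
    for (int r = 0; r < h; r++) {
        for (int c = 0; c < w; c++) {
            float sum = 0.0f;
            for (int i = 0; i < k; i++)
                for (int j = 0; j < k; j++)
                    /* kernel indexed in reverse: convolution, not correlation */
                    sum += kernel[(k - 1 - i) * k + (k - 1 - j)]
                         * pad[(r + i) * pw + (c + j)];
            out[r * w + c] = sum;
        }
    }
    free(pad);
    return 0;
}
```

For a symmetric kernel such as a Laplacian-of-Gaussian the reversed indexing makes no difference to the result. The outer two loops are the natural place to apply the parallelisation or cache blocking mentioned in this answer.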
If you need this convolution to be really fast, and/or your grid is large, consider partitioning it into smaller pieces that can be held in the processor's cache, and thus calculated far more quickly. This also goes for any GPU-offloading you might want to do (they are ideal for this type of floating-point calculation).
Here is a theory link
http://hebb.mit.edu/courses/9.29/2002/readings/c13-1.pdf
And here is a link to fftw, which is a pretty good FFT library that I've used in the past (check licenses to make sure it is suitable) http://www.fftw.org/
All you do is FFT your image and kernel (the 9x9 matrix). Multiply together, then back transform.
However, with a 9x9 matrix you may still be better doing it in real coordinates (just with a double loop over the image pixels and the matrix). Try both ways!
Actually, you don't need to use an FFT size large enough to hold the entire image. You can do a lot of smaller overlapping 2D FFTs. You can search for "fast convolution", "overlap save", and "overlap add".
However, for a 9x9 kernel you may not see much of a speed advantage.
