Reallocating/Erasing numpy array vs. new allocation in loop

In my program I need to work with arrays of roughly 500x500 to 1500x1500 within a function that is called in a loop thousands of times. In each iteration, I need to start with an array of the same form (its dimensions are fixed across all iterations). The initial values will be:
[0 0 0 ... 1]
[0 0 0 ... 1]
....
However, the contents of the array will be modified within the loop. What is the most efficient way to "reset" the array to this format so I can pass the same array to the function every time without having to allocate a new set of memory every time? (I know the range of rows that were modified)
I have tried:
a[first_row_modified:last_row_modified, :] = 0.
a[first_row_modified:last_row_modified, -1] = 1.
but it takes roughly the same amount of time as just creating a new array every time with the following:
a = zeros((sizeArray, sizeArray))
a[:,-1] = 1.
Is there a faster way to effectively "erase" the array and change the last column to ones? I think this is similar to the question "clearing elements of numpy array", although my array doesn't change size and I didn't see a definitive answer to that question.

No; I think the way you are doing it is about as fast as it gets.

Related

No. of paths in integer array

There is an integer array, for eg.
{3,1,2,7,5,6}
One can move forward through the array either one element at a time or by jumping ahead by the value stored at the current index. For example, one can go from 3 to 1 or from 3 to 7; from 1 one can only go to 2 (the step and the jump coincide, so no real jump is possible here); from 2 one can go to 7 or to 5; and from 7 one can only go to 5, because the index of 7 is 3, 3 + 7 = 10, and there is no tenth element.
I only have to count the number of possible paths that reach the end of the array from the start index.
I could only do it recursively and naively, which runs in exponential time.
Can somebody please help?

My recommendation: use dynamic programming.
If this keyword is sufficient and you want the challenge of finding a solution on your own, don't read any further!
Here is a possible DP algorithm for the example input {3,1,2,7,5,6}. It will be your job to adapt it to the general problem.
Create an array sol of length 6 filled with zeros. sol[i] will hold the number of ways to reach the end from index i.
sol[5] = 1;
for (i = 4; i >= 0; i--) {
    sol[i] = sol[i+1];
    if (i + input[i] < 6 && input[i] != 1)
        sol[i] += sol[i + input[i]];
}
return sol[0];
runtime O(n)
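The loop above is hard-coded for six elements. A minimal Python sketch of the same idea for an array of any length could look like this. The function name count_paths is made up for illustration, and the sketch assumes all values are positive integers (a value of 1 or less is treated as "no separate jump").

```python
def count_paths(values):
    """Count the ways to get from index 0 to the last index.

    From index i you may step to i + 1 or jump to i + values[i].
    Assumption: values are positive integers.
    """
    n = len(values)
    if n == 0:
        return 0
    sol = [0] * n          # sol[i] = number of paths from i to the end
    sol[n - 1] = 1         # already at the end: exactly one (empty) path
    for i in range(n - 2, -1, -1):
        sol[i] = sol[i + 1]                 # single step
        jump = values[i]
        if jump > 1 and i + jump < n:       # a jump of 1 is the same as a step
            sol[i] += sol[i + jump]
    return sol[0]


print(count_paths([3, 1, 2, 7, 5, 6]))  # 3
```

The three paths in the example are 3-1-2-7-5-6, 3-1-2-5-6 and 3-7-5-6.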
As for the directed-graph solution hinted at in the comments:
Each cell in the array represents a node. Add a directed edge from each node to every node accessible from it. Basically, you can then count the number of ways more easily by looking at the out-degrees of the nodes (since there is no directed cycle); however, it takes a lot of boilerplate to actually program it.
Adjusting the recursive solution
Another solution would be pruning. This is basically equivalent to the DP algorithm. The exponential time comes from the fact that you calculate the same values several times. Say the function is recFunc(index). The initial call recFunc(0) calls recFunc(1) and recFunc(3), and so on. However, recFunc(3) is bound to be called again later, which leads to a repeated recursive calculation. To prune this, add a map that holds all values already calculated. When you call recFunc(x), look up in the map whether x was already calculated. If yes, return the stored value. If not, calculate it, store it, and return it. This way you get O(n) too.
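The map-based pruning described above might be sketched in Python as follows. The names count_paths_memo and rec are hypothetical (rec plays the role of recFunc), and the values are again assumed to be positive integers.

```python
def count_paths_memo(values):
    """Recursive path count with a map of already computed results."""
    n = len(values)
    memo = {}  # index -> number of paths from that index to the end

    def rec(i):
        if i == n - 1:
            return 1                 # reached the end
        if i in memo:
            return memo[i]           # already calculated: reuse it
        total = rec(i + 1)           # single step
        jump = values[i]
        if jump > 1 and i + jump < n:
            total += rec(i + jump)   # jump by the stored value
        memo[i] = total
        return total

    return rec(0) if n else 0


print(count_paths_memo([3, 1, 2, 7, 5, 6]))  # 3
```

Each index is computed at most once, so the running time is linear. For very long arrays the recursion depth can exceed Python's default limit, in which case the iterative DP version is the safer choice.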

Split array into smaller unequal-sized arrays dependent on array-column values

I'm quite new to MATLAB and this problem is really driving me insane:
I have a huge array of 2 columns and about 31,000 rows. One of the two columns holds a spatial coordinate on a grid, the other one a dependent parameter. What I want to do is the following:
I. I need to split the array into smaller parts defined by the spatial column. Let's say the spatial coordinates range from 0 to 500: I now want arrays that give me the two column values for spatial coordinates 0-10, then 10-20 and so on. This would result in 50 arrays of unequal size that cover a spatial range from 0 to 500.
II. Secondly, I would need to calculate the average values of the resulting columns of every single array so that I obtain per array one 2-dimensional point.
III. Thirdly, I could plot these points and I would be super happy.
Sadly, I'm super confused since I miserably fail at step I. - Maybe there is even an easier way than to split the giant array in so many small arrays - who knows..
I would be really really happy for any suggestion.
Thank you,
Arne

First of all, since you want a data structure holding arrays of different sizes, you will need to place them in a cell array, so you could try something like this:
res = arrayfun(@(x)arr(arr(:,1)==x,:), unique(arr(:,1)), 'UniformOutput', 0);
The code above returns a cell array with the array split according to its first column. With @(x)arr(arr(:,1)==x,:) you define a function of x, and arrayfun(function, ..., 'UniformOutput', 0) applies that function to each element of the following arguments (taking a single value from each argument to evaluate the function). Note that arr must be numeric; if it is not, you should map your values to numeric values or select them another way.
In the same way you could do
uo = 'UniformOutput';
res = arrayfun(@(x){arr(arr(:,1)==x,:), mean(arr(arr(:,1)==x,2))}, unique(arr(:,1)), uo, 0);
You will probably want to flatten the returned value; check the function cat. You could do:
res = cat(1,res{:})
How to plot your data depends on its format, so I can't help without knowing what the data look like, but you could try plotting inside a loop over your 'res' variable or something similar.

Step I indeed comes with some difficulties. Once these are solved, I guess steps II and III can easily be solved. Let me make some suggestions for step I:
You first define the maximum value (maxValue = 500;) and the step size (stepSize = 10;). Now it is possible to iterate through all steps and create your new vectors.
for k=1:maxValue/stepSize
...
end
As every resulting array will have different dimensions, I suggest you save the vectors in a cell array:
Y = cell(maxValue/stepSize,1);
Use the find function to find the rows of the entries for each matrix. At each step k, the range of values of interest will be (k-1)*stepSize to k*stepSize.
row = find( (k-1)*stepSize <= X(:,1) & X(:,1) < k*stepSize );
You can now create the matrix for step k by
Y{k,1} = X(row,:);
Putting everything together you should be able to create the cell array Y containing your matrices and continue with the other tasks. You could also save the average of each value range in a second column of the cell array Y:
Y{k,2} = mean( Y{k,1}(:,2) );
I hope this helps you with your task. Note that these are only suggestions and there may be different (maybe more appropriate) ways to handle this.
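For comparison, here is a rough Python sketch of the same binning idea using only the standard library. It is not MATLAB code; the function name bin_means and the sample rows are made up for illustration, and the bins are numbered from 0 rather than from 1.

```python
from collections import defaultdict


def bin_means(rows, step_size=10):
    """Group (coordinate, value) rows into bins of width step_size
    and return the mean coordinate and mean value of each bin."""
    bins = defaultdict(list)
    for x, y in rows:
        k = int(x // step_size)  # bin k covers k*step_size <= x < (k+1)*step_size
        bins[k].append((x, y))
    means = {}
    for k, members in sorted(bins.items()):
        mean_x = sum(x for x, _ in members) / len(members)
        mean_y = sum(y for _, y in members) / len(members)
        means[k] = (mean_x, mean_y)
    return means


rows = [(1.0, 2.0), (3.0, 4.0), (12.0, 10.0), (25.0, 7.0)]
print(bin_means(rows))
# {0: (2.0, 3.0), 1: (12.0, 10.0), 2: (25.0, 7.0)}
```

Each bin ends up as one two-dimensional point, which is what step II asks for; empty bins simply do not appear in the result.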

Repeated array sampling yields duplicates. Why? (Ruby)

In Ruby:
I sample an element from an array. I see a duplicate (the same element) every 30 samples or so, sometimes as close as 5-6 samples apart. Why?
This is my code:
some_array = IO.readlines("file with 5000 unique elements")
some_array.shuffle!
@random_element = some_array.sample
puts @random_element

If you want n random non-duplicate elements from your array, you should call some_array.sample(n).
Sample doesn't guarantee that two consecutive calls will not contain duplicates; it guarantees that all elements chosen in one call won't.
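The same distinction exists in other languages. As a rough Python analogue (not Ruby; random.sample stands in here for Ruby's sample(n), and random.choice for a single sample call), one call that draws several elements never repeats an element, while independent single draws can:

```python
import random

items = list(range(5000))  # stand-in for 5000 unique elements

# One call that draws 30 elements: no element is drawn twice.
batch = random.sample(items, 30)
print(len(set(batch)))  # 30

# Thirty independent single draws: repeats are possible.
draws = [random.choice(items) for _ in range(30)]
print(len(set(draws)))  # 30 or fewer, depending on the run
```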

Operating elementwise on an array

I'm trying to check whether my arrays are returning nonsense by accessing out-of-bounds elements in Fortran. I want to check whether these indices are less than one, and if they are, handle them as a special case.
This is the piece of my code causing issues:
lastNeighLabel=(/clusterLabel(jj-1,kk,ll), clusterLabel(jj,kk-1,ll), clusterLabel(jj,kk,ll-1)/)
lastNeighLabel contains the cluster label (between 1 and n, where n is the total number of unique separate clusters found) for the last neighbour in the x, y and z direction respectively.
When jj, kk or ll is 1, the code tries to access the 0th element of the array, and as Fortran counts from 1 in arrays by default, it tries to destroy the universe. I'm currently in a tangled mess of about 8 if/elseif statements trying to code for every eventuality. But I was hoping there was a way of operating on each element. So basically I'd like to say where((/jj-1,kk-1,ll-1/).lt.1) do clusterLabel(jj-1,kk,ll)=0 etc., depending on which element is causing the problem.
But I can't think of a way to do that because where will only manipulate the variables passed to it, not a different array at the same index. Or am I wrong?
Will gladly edit if this doesn't make sense.

Fortran arrays are not required to start at index one. Any starting value is allowed. If it is more convenient for you to have a zero-indexed array, declare the array as:
real, dimension (0:N-1, 0:M-1) :: array
Or
real, dimension (0:N, 0:M) :: array
and have the 0 indices be extra to catch special cases.
This might be another solution to your problem, since zero index values would be legal.

Another possible way to approach this, is to create an extended cluster label array (with index bounds starting at 0), which is equal to the cluster label array with a layer of zeroes tacked on the outside. You can then let your loop run safely over all values of jj, kk, and ll. It depends on the size of the array if this is a feasible solution.
integer :: extended_cluster_label(0:size(cluster_label,1), &
0:size(cluster_label,2), &
0:size(cluster_label,3) &
)
extended_cluster_label(0,:,:) = 0
extended_cluster_label(:,0,:) = 0
extended_cluster_label(:,:,0) = 0
extended_cluster_label(1:, 1:, 1:) = cluster_label

Maybe you could use a function?
real function f(A,i,j,k)
  real :: A(:,:,:)
  integer :: i,j,k
  if (i==0.or.j==0.or.k==0) then
    f=0
  else
    f=A(i,j,k)
  endif
end function f
and then use f(clusterLabel,jj-1,kk,ll) etc.
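The same guard-function idea, sketched in Python purely as an illustration. The name label_or_zero and the sample data are hypothetical. Note that Python lists are 0-based, so the "missing" neighbour has index -1 rather than 0, and it has to be caught explicitly because a negative index would otherwise silently wrap around to the end of the list.

```python
def label_or_zero(labels, i, j, k):
    """Return labels[i][j][k], or 0 if any index lies before the start."""
    if i < 0 or j < 0 or k < 0:
        return 0
    return labels[i][j][k]


# A tiny 2x2x2 array of cluster labels (made-up values).
labels = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
jj, kk, ll = 1, 0, 1
last_neigh_label = [
    label_or_zero(labels, jj - 1, kk, ll),
    label_or_zero(labels, jj, kk - 1, ll),
    label_or_zero(labels, jj, kk, ll - 1),
]
print(last_neigh_label)  # [2, 0, 5]
```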

Finding whether a value is equal to the value of any array element in MATLAB

Can anyone tell me if there is a way (in MATLAB) to check whether a certain value is equal to any of the values stored within another array?
The way I intend to use it is to check whether an element index in one matrix is equal to the values stored in another array (where the stored values are the indices of the elements which meet a certain criteria).
So, if the indices of the elements which meet the criteria are stored in the matrix below:
criteriacheck = [3 5 6 8 20];
Going through the main array (called array) and checking if the index matches:
for i = 1:numel(array)
if i == 'Any value stored in criteriacheck'
%# "Do this"
end
end
Does anyone have an idea of how I might go about this?

The excellent answer previously given by @woodchips applies here as well:
Many ways to do this. ismember is the first that comes to mind, since it is a set membership action you wish to take. Thus
X = primes(20);
ismember([15 17],X)
ans =
0 1
Since 15 is not prime, but 17 is, ismember has done its job well here.
Of course, find (or any) will also work. But these are not vectorized in the sense that ismember was. We can test to see if 15 is in the set represented by X, but to test both of those numbers will take a loop, or successive tests.
~isempty(find(X == 15))
~isempty(find(X == 17))
or,
any(X == 15)
any(X == 17)
Finally, I would point out that tests for exact values are dangerous if the numbers may be true floats. Tests against integer values as I have shown are easy. But tests against floating point numbers should usually employ a tolerance.
tol = 10*eps;
any(abs(X - 3.1415926535897932384) <= tol)
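For readers who do not use MATLAB, the two points of the quoted answer (set membership and tolerance-based comparison of floats) can be sketched in Python. This is only an analogue; contains_close is a made-up helper name and the default tolerance is an arbitrary assumption.

```python
import math

X = [2, 3, 5, 7, 11, 13, 17, 19]  # the primes up to 20

# Membership test for several values at once.
print([v in X for v in (15, 17)])  # [False, True]


def contains_close(values, target, tol=1e-12):
    """True if any element is within an absolute tolerance of target."""
    return any(math.isclose(v, target, rel_tol=0.0, abs_tol=tol) for v in values)


# 0.1 + 0.2 is not exactly 0.3 in floating point, but it is within tolerance.
print(0.1 + 0.2 == 0.3)                  # False
print(contains_close([0.1 + 0.2], 0.3))  # True
```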

you could use the find command
if (~isempty(find(criteriacheck == i)))
% do something
end

Note: Although this answer doesn't address the question in the title, it does address a more fundamental issue with how you are designing your for loop (the solution of which negates having to do what you are asking in the title). ;)
Based on the for loop you've written, your array criteriacheck appears to be a set of indices into array, and for each of these indexed elements you want to do some computation. If this is so, here's an alternative way for you to design your for loop:
for i = criteriacheck
%# Do something with array(i)
end
This will loop over all the values in criteriacheck, setting i to each subsequent value (i.e. 3, 5, 6, 8, and 20 in your example). This is more compact and efficient than looping over each element of array and checking if the index is in criteriacheck.
NOTE: As Jonas points out, you want to make sure criteriacheck is a row vector for the for loop to function properly. You can form any matrix into a row vector by following it with the (:)' syntax, which reshapes it into a column vector and then transposes it into a row vector:
for i = criteriacheck(:)'
...

The original question "Can anyone tell me if there is a way (in MATLAB) to check whether a certain value is equal to any of the values stored within another array?" can be solved without any loop.
Just use the setdiff function.

I think the INTERSECT function is what you are looking for.
C = intersect(A,B) returns the values common to both A and B. The values of C are in sorted order.
http://www.mathworks.de/de/help/matlab/ref/intersect.html
The question if i == 'Any value stored in criteriacheck' can also be answered this way if you consider i a trivial matrix. However, you are probably better off with any(i==criteriacheck).