dask.array.reshape very slow

I have an array that I iteratively build up as follows:
step1.shape = (200,200)
step2.shape = (200,200,200)
step3.shape = (200,200,200,200)
and then reshape to:
step4.shape = (200,200**3)
I do this because dask.array.atop doesn't seem to allow you to go from a shape like (200,200) to (200,200**2); I think this restriction is related to chunking and lazy evaluation.
When I get to step4 and try to reshape the array, dask seems to want to compute it prior to reshaping, which results in significant computation time and memory use.
Is there a way to avoid this?
As requested, here is some dummy code:
import math
import numpy as np
import dask.array as da

def prod_mat(matrix_a, matrix_b):
    # matrix_a.shape = (300,...,300,200)
    # matrix_b.shape = (300,200)
    mat_a = matrix_a.reshape(-1, matrix_a.shape[-1])
    # mat_a.shape = (300**n,200)
    mat_b = matrix_b.reshape(-1, matrix_b.shape[-1])
    # mat_b.shape = (300,200)
    mat_temp = np.repeat(mat_a, matrix_b.shape[0], axis=0) * np.tile(mat_b.T, mat_a.shape[0]).T
    new_dim = int(math.log(mat_temp.shape[0]) / math.log(matrix_a.shape[0]))
    new_shape = [matrix_a.shape[0] for n in range(new_dim)]
    new_shape.append(-1)
    result = mat_temp.reshape(tuple(new_shape))
    # result.shape = (300,...,300,300,200)
    return result
b = np.random.rand(300, 200)
b = da.from_array(b, chunks=100)
c = da.atop(prod_mat, 'ijk', b, 'ik', b, 'jk')
d = da.atop(prod_mat, 'ijkl', c, 'ijl', b, 'kl')
e = da.atop(prod_mat, 'ijklm', d, 'ijkm', b, 'lm')
f = e.sum(axis=-1)
f.reshape(300, 300**3)  # ----> This is slow, as if it were calling compute()

This computation isn't calling compute; instead, it's stuck building a very, very large graph. Generally speaking, reshaping parallel arrays is pretty intense: lots of your little chunks end up talking to lots of your other little chunks, creating havoc. This example is particularly bad.
Perhaps there is another way to produce your output in the correct shape initially?
Looking through the development logs, it appears that this failure was actually anticipated during development: https://github.com/dask/dask/pull/758

Related

MATLAB - repmat values into cell array where individual cell elements have unequal size

I am trying to repeat values from an array (values) to a cell array where the individual elements have unequal sizes (specified by array_height and array_length).
I hope to apply this to a larger data set (containing ~100 x ~100 values) and my current solution is to have a line of code for each value (code example below). Surely there is a better way... Please could someone offer an alternative solution?
C = cell(3,2);
values = rand(3,2);
array_height = randi(10,3,2);
array_length = randi(10,3,2);
C{1,1} = repmat((values(1,1)),[array_height(1,1),array_length(1,1)]);
C{2,1} = repmat((values(2,1)),[array_height(2,1),array_length(2,1)]);
C{3,1} = repmat((values(3,1)),[array_height(3,1),array_length(3,1)]);
C{1,2} = repmat((values(1,2)),[array_height(1,2),array_length(1,2)]);
C{2,2} = repmat((values(2,2)),[array_height(2,2),array_length(2,2)]);
C{3,2} = repmat((values(3,2)),[array_height(3,2),array_length(3,2)]);
If you did this in a for loop, it might look something like this:
for i = 1:size(C,1)
    for j = 1:size(C,2)
        C{i,j} = repmat(values(i,j),[array_height(i,j),array_length(i,j)]);
    end
end
However, if you are trying to generate or use this with a larger dataset, this code snippet will likely take forever! I suspect that whatever your overall objective is can be better served by MATLAB's many optimizations for matrices and vectors, but without more information I can't help more than that.
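One compact alternative (a sketch using standard MATLAB, not a true vectorization since arrayfun still loops internally) is to build the whole cell array in one statement:
% Build each cell from the corresponding value and target size.
% 'UniformOutput', false makes arrayfun collect the results in a cell array.
C = arrayfun(@(v,h,l) repmat(v,[h,l]), values, array_height, array_length, 'UniformOutput', false);
This mostly tidies the code rather than speeding it up, but it removes the need for one line per cell.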

How to create a vectorized version of this code in Matlab

I tried to vectorize my code but hit this roadblock and can't find an answer to it.
I have this array of 0 and 1, that works like a stopwatch. The way I create it is already vectorized. Here is a sample of it:
[image: array of 0 and 1]
Now, whenever the array is 1, a counter must run, multiplied by a sample rate, to give me the current measured time. And every time the array is 0, the stopwatch must be reset for the next run of 1s. Here is the result:
[image: array of calculated time]
The code is this:
timearray = zeros(size(array01));
for ii = 1:numel(array01)
    if (array01(ii) == 0)
        timearray(ii) = 0;
    else
        timearray(ii) = 0.005 + timearray(ii-1);
    end
end
The issue with this for loop is that it's painfully slow. For a large array01 it takes many seconds, and I'm pretty sure there's a clever way to do it, but I'm too dumb to see it.
Thanks for the help!
Here's a vectorized approach based on sparse matrices:
array01 = [0,0,0,0,0,1,1,1,1,0,0,0,0,0,0,1,1,1,1,0,0,0,0].'; % example data
sample_period = 0.005; % example data
t = sparse(1:numel(array01), cumsum([true; diff(array01(:))>0]).', array01);
timearray = sample_period*full(max(cumsum(t, 1).*t, [], 2));
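Another vectorized option, if your MATLAB has cummax (R2014b or later; this is a sketch, not part of the sparse answer above): keep a running count of ones and subtract the count as it stood at the most recent zero, which resets the stopwatch after each run.
c = cumsum(array01(:));                    % running count of ones seen so far
reset = cummax(c .* (array01(:) == 0));    % count value frozen at the last zero
timearray = sample_period * (c - reset);   % elapsed time within each run of ones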

Efficiently calculating weighted distance in MATLAB

Several posts exist about efficiently calculating pairwise distances in MATLAB. These posts tend to concern quickly calculating euclidean distance between large numbers of points.
I need to create a function which quickly calculates the pairwise differences between smaller numbers of points (typically fewer than 1000 pairs). Within the grander scheme of the program I am writing, this function will be executed many thousands of times, so even small gains in efficiency are important. The function needs to be flexible in two ways:
On any given call, the distance metric can be euclidean OR city-block.
The dimensions of the data are weighted.
As far as I can tell, no solution to this particular problem has been posted. The statistics toolbox offers pdist and pdist2, which accept many different distance functions, but not weighting. I have seen extensions of these functions that allow for weighting, but these extensions do not allow users to select different distance functions.
Ideally, I would like to avoid using functions from the statistics toolbox (I am not certain the user of the function will have access to those toolboxes).
I have written two functions to accomplish this task. The first uses tricky calls to repmat and permute, and the second simply uses for-loops.
function [D] = pairdist1(A, B, wts, distancemetric)
% get some information about the data
numA = size(A,1);
numB = size(B,1);
if strcmp(distancemetric,'cityblock')
    r = 1;
elseif strcmp(distancemetric,'euclidean')
    r = 2;
else
    error('Function only accepts "cityblock" and "euclidean" distance')
end
% format weights for multiplication
wts = repmat(wts,[numA,1,numB]);
% get featural differences between A and B pairs
A = repmat(A,[1 1 numB]);
B = repmat(permute(B,[3,2,1]),[numA,1,1]);
differences = abs(A-B).^r;
% weigh difference values before combining them
differences = differences.*wts;
differences = differences.^(1/r);
% combine features to get distance
D = permute(sum(differences,2),[1,3,2]);
end
AND:
function [D] = pairdist2(A, B, wts, distancemetric)
% get some information about the data
numA = size(A,1);
numB = size(B,1);
if strcmp(distancemetric,'cityblock')
    r = 1;
elseif strcmp(distancemetric,'euclidean')
    r = 2;
else
    error('Function only accepts "cityblock" and "euclidean" distance')
end
% use for-loops to generate differences
D = zeros(numA,numB);
for i = 1:numA
    for j = 1:numB
        differences = abs(A(i,:) - B(j,:)).^r;
        differences = differences.*wts;
        differences = differences.^(1/r);
        D(i,j) = sum(differences,2);
    end
end
end
Here are the performance tests:
A = rand(10,3);
B = rand(80,3);
wts = [0.1 0.5 0.4];
distancemetric = 'cityblock';
tic
D1 = pairdist1(A,B,wts,distancemetric);
toc
tic
D2 = pairdist2(A,B,wts,distancemetric);
toc
Elapsed time is 0.000238 seconds.
Elapsed time is 0.005350 seconds.
It's clear that the repmat-and-permute version works much more quickly than the double-for-loop version, at least for smaller datasets. But I also know that calls to repmat can often slow things down. So I am wondering if anyone in the SO community has any advice to offer to improve the efficiency of either function!
EDIT
@Luis Mendo offered a nice cleanup of the repmat-and-permute function using bsxfun. I compared his function with my original on datasets of varying size:
As the data become larger, the bsxfun version becomes the clear winner!
EDIT #2
I have finished writing the function and it is available on github [link]. I ended up finding a pretty good vectorized method for computing euclidean distance [link], so I use that method in the euclidean case, and I took @Divakar's advice for city-block. It is still not as fast as pdist2, but it's much faster than either of the approaches I laid out earlier in this post, and it easily accepts weightings.
You can replace repmat with bsxfun. Doing so avoids explicit repetition, so it's more memory-efficient and probably faster:
function D = pairdist1(A, B, wts, distancemetric)
if strcmp(distancemetric,'cityblock')
    r = 1;
elseif strcmp(distancemetric,'euclidean')
    r = 2;
else
    error('Function only accepts "cityblock" and "euclidean" distance')
end
differences = abs(bsxfun(@minus, A, permute(B, [3 2 1]))).^r;
differences = bsxfun(@times, differences, wts).^(1/r);
D = permute(sum(differences,2),[1,3,2]);
end
For r = 1 (the "cityblock" case), you can use bsxfun to get elementwise subtractions and then use matrix multiplication, which should speed things up. The implementation would look something like this -
%// Calculate absolute elementwise subtractions
absm = abs(bsxfun(@minus,permute(A,[1 3 2]),permute(B,[3 1 2])));
%// Perform matrix multiplication with the given weights and reshape
D = reshape(reshape(absm,[],size(A,2))*wts(:),size(A,1),[]);
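As a quick sanity check (a sketch reusing the test data and the pairdist1 function from earlier in this post), the two city-block implementations should agree to machine precision:
A = rand(10,3); B = rand(80,3); wts = [0.1 0.5 0.4];
absm = abs(bsxfun(@minus,permute(A,[1 3 2]),permute(B,[3 1 2])));
D_mm = reshape(reshape(absm,[],size(A,2))*wts(:),size(A,1),[]); % matrix-multiplication version
D_rp = pairdist1(A, B, wts, 'cityblock');                       % repmat-and-permute version
max(abs(D_mm(:) - D_rp(:)))                                     % should be on the order of eps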

Is there a way to speed this up? (max-pooling)

This is the code:
A = rand(3,3,3);
P(1) = max(max(A(:,:,1)));
P(2) = max(max(A(:,:,2)));
P(3) = max(max(A(:,:,3)));
You can create P in one call:
%fastest solution for size(A,1)>size(A,2)
P = max(max(A,[],1),[],2)
%fastest solution for size(A,2)>size(A,1)
P = max(max(A,[],2),[],1)
For large matrices, it is faster to have a small intermediate result (the output of the first max call).
One approach is to collapse the first two dimensions into one and maximize along that dimension. I haven't tested it for speed, though.
P = max(reshape(A,[],size(A,3)));
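As a side note (this assumes a newer MATLAB, R2018b or later, and is not part of the answers above), max also accepts a vector of dimensions, so the same reduction can be written in one call:
% Reduce over rows and columns at once (requires R2018b or later)
P = squeeze(max(A, [], [1 2]));  % 3x1 column; transpose if you need a row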

MATLAB: vectorize filling of 3D-array

I would like to save a certain number of grayscale images (i.e. 2D arrays) as layers in a 3D array.
Because it should be very fast for a real-time application, I would like to vectorize the following code, where m is the number of shifts:
for i=1:m
    array(:,:,i)=imabsdiff(circshift(img1,[0 i-1]), img2);
end
nispio showed me a very advanced version, which you can see here:
I = speye(size(img1,2)); E = -1*I;
ii = toeplitz(1:m,[1,size(img1,2):-1:2]);
D = vertcat(repmat(I,1,m),E(:,ii));
data_c = reshape(abs([double(img1),double(img2)]*D),size(data_r,1),size(data_r,2),m);
At the moment the results of the two operations are not the same; maybe it shifts the image in the wrong direction. My knowledge is very limited, so I don't understand the code completely.
You could do this:
M = 16; N = 20; img1 = randi(255,M,N); % Create a random M x N image
ii = toeplitz(1:N,circshift(fliplr(1:N)',1)); % Create an indexing variable
% Create layers that are shifted copies of the image
array = reshape(img1(:,ii),M,N,N);
As long as your image dimensions don't change, you only ever need to create the ii variable once. After that, you can call the last line each time your image changes. I don't know for sure that this will give you a speed advantage over a for loop, but it is vectorized like you requested. :)
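For example (a sketch with made-up frame sizes and frame count), the indexing variable is computed once and then reused for each incoming image:
M = 16; N = 20;
ii = toeplitz(1:N,circshift(fliplr(1:N)',1)); % precompute the index once
for frame = 1:100                             % e.g. frames arriving in real time
    img1 = randi(255,M,N);                    % stand-in for the next image
    array = reshape(img1(:,ii),M,N,N);        % reuse ii for every frame
end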
UPDATE
In light of the new information shared about the problem, this solution should give you an order-of-magnitude increase in speed:
clear all;
% Set image sizes
M = 360; N = 500;
% Number of column shifts to test
ncols = 200;
% Create comparison matrix (see NOTE)
I = speye(N); E = -1*I;
ii = toeplitz([1:N],[1,N:-1:(N-ncols+2)]);
D = vertcat(repmat(I,1,ncols),E(:,ii));
% Generate some test images
img1 = randi(255,M,N);
img2 = randi(255,M,N);
% Compare images (vectorized)
data_c = reshape(abs([img2,img1]*D),M,N,ncols);
% Compare images (for loop)
array = zeros(M,N,ncols); % <-- Pre-allocate this array!
for i=1:ncols
    array(:,:,i)=imabsdiff(circshift(img1,[0 i-1]),img2);
end
This uses matrix multiplication to do the comparisons instead of generating a whole bunch of shifted copies of the image.
NOTE: The matrix D should only be generated one time if your image size is not changing. Notice that the D matrix is completely independent of the images, so it would be wasteful to regenerate it every time. However, if the image size does change, you will need to update D.
Edit: I have updated the code to more closely match what you seem to be looking for. Then I throw the "original" for-loop implementation in to show that they give the same result. One thing worth noting about the vectorized version is that it has the potential to be very memory intensive. If ncols = N then the D matrix has N^3 elements. Even though D is sparse, things fall apart fast when you multiply D by the non-sparse images.
Also, notice that I pre-allocate array before the for loop. This is always good practice in MATLAB, where practical, and it will almost invariably give you a large performance boost over dynamic resizing.
If I understand the question correctly, I think you need a for loop:
for v=1:1:20
    array(:,:,v)=circshift(image,[0 v]);
end
