What do I need to replace to improve the performance of the algorithm? - arrays

I am new to MATLAB, but I am trying.
I have the following code:
for t = 1:size(data,2)
    b = data(t)/avevalue;
    if b >= 1
        cat1 = [repmat((avevalue),floor(b),1)',mod(data(t),15)];
    else
        cat1 = data(t);
    end
    modified = [modified,cat1];
end
The answer for
data=[16 18 16 25 17 7 15];
avevalue=15;
is
15 1 15 3 15 1 15 10 15 2 7 15 0
But when my array has more than 10000 elements, it runs impossibly slowly (nearly 3 minutes for 100000 elements, for example). How can I increase its speed?

There are two main reasons for the slowness:
You are using a loop.
The output array grows on each iteration, so MATLAB has to reallocate and copy it every time.
You can improve the runtime by trying the following approach:
% auxiliary array: number of output elements produced by each input element
divSumArray = ceil((data+1)/avevalue);
% preallocate the output array, filled with avevalue
newArr = ones(1,sum(divSumArray))*avevalue;
% position of the last output element of each input element
moduloDataIndices = cumsum(divSumArray);
% write the remainders to those positions
newArr(moduloDataIndices) = mod(data,avevalue);
The final result:
15 1 15 3 15 1 15 10 15 2 7 15 0
Time measurement
I measured runtime for the following input:
n = 30000;
data = randi([0 99],1,n); % row vector, since the loop in the question iterates over size(data,2)
avevalue=15;
original algo:
Elapsed time is 11.783951 seconds.
optimized algo:
Elapsed time is 0.007728 seconds.
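As a quick sanity check, the vectorized approach above can be compared with the loop from the question on the small example. This is only a sketch and assumes data holds nonnegative integers, in which case ceil((data+1)/avevalue) equals floor(data/avevalue)+1. The name counts is introduced here for illustration and is not part of the original answer.

```matlab
% Example data from the question
data = [16 18 16 25 17 7 15];
avevalue = 15;

% Loop version from the question, with the output initialised first
modified = [];
for t = 1:numel(data)
    b = data(t)/avevalue;
    if b >= 1
        % floor(b) copies of avevalue followed by the remainder
        cat1 = [repmat(avevalue, 1, floor(b)), mod(data(t), avevalue)];
    else
        cat1 = data(t);
    end
    modified = [modified, cat1]; % grows on every iteration (the slow part)
end

% Vectorised version: each element yields floor(data/avevalue) copies
% of avevalue plus one remainder (assumes nonnegative integer data)
counts = floor(data/avevalue) + 1;
newArr = avevalue * ones(1, sum(counts)); % preallocated output
newArr(cumsum(counts)) = mod(data, avevalue);

sameResult = isequal(modified, newArr);
```

Here sameResult should be true, and both arrays should match the 13-element result listed in the question.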

Related

MATLAB: extract values from 3d matrix at given row and column indices using sub2ind 3d

I have a 3d matrix A that holds my data. At multiple locations, defined by row and column indices as shown by the matrix row_col_idx, I want to extract all data along the third dimension as shown below:
A = cat(3,[1:3;4:6], [7:9;10:12],[13:15;16:18],[19:21;22:24]) %matrix(2,3,4)
row_col_idx=[1 1;1 2; 2 3];
idx = sub2ind(size(A(:,:,1)), row_col_idx(:,1),row_col_idx(:,2));
out=nan(size(A,3),size(row_col_idx,1));
for k=1:size(A,3)
    temp=A(:,:,k);
    out(k,:)=temp(idx);
end
out
The output of this code is as follows:
A(:,:,1) =
1 2 3
4 5 6
A(:,:,2) =
7 8 9
10 11 12
A(:,:,3) =
13 14 15
16 17 18
A(:,:,4) =
19 20 21
22 23 24
out =
1 2 6
7 8 12
13 14 18
19 20 24
The output is as expected. However, the actual A and row_col_idx are huge, so this code is computationally expensive. Is there a way to vectorize this code to avoid the loop and the temp matrix?
This can be vectorized using linear indexing and implicit expansion:
out = A( row_col_idx(:,1) + ...
(row_col_idx(:,2)-1)*size(A,1) + ...
(0:size(A,1)*size(A,2):numel(A)-1) ).';
The above builds an indexing matrix as large as the output. If this is unacceptable due to memory limitations, it can be avoided by reshaping A:
sz = size(A); % store size A
A = reshape(A, [], sz(3)); % collapse first two dimensions
out = A(row_col_idx(:,1) + (row_col_idx(:,2)-1)*sz(1),:).'; % linear indexing along
% first two dims of A
A = reshape(A, sz); % reshape back A, if needed
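To see how the linear indices above are built, here is a small sketch on the example A from the question. It uses bsxfun instead of implicit expansion so that it should also run on releases before R2016b. The names pageIdx, pageOffset and linIdx are illustrative and not part of the original answer.

```matlab
A = cat(3,[1:3;4:6],[7:9;10:12],[13:15;16:18],[19:21;22:24]); % 2x3x4
row_col_idx = [1 1; 1 2; 2 3];

% Linear index of each (row, col) pair within a single 2x3 page
pageIdx = row_col_idx(:,1) + (row_col_idx(:,2)-1)*size(A,1);  % [1; 3; 6]

% Offset of each page along the third dimension
pageOffset = (0:size(A,3)-1) * size(A,1)*size(A,2);           % [0 6 12 18]

% One column of linear indices per page (3x4), then index and transpose
linIdx = bsxfun(@plus, pageIdx, pageOffset);
out = A(linIdx).';
```

For this A, out should be the same 4x3 matrix that the loop in the question produces.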
A more efficient method is to use the entries of row_col_idx directly to select the elements from A. I compared this with the original loop on a large matrix and, as you can see, the calculation is much faster.
For the A given in the question, it gives the same output.
A = rand([2,3,10000000]);
row_col_idx=[1 1;1 2; 2 3];
idx = sub2ind(size(A(:,:,1)), row_col_idx(:,1),row_col_idx(:,2));
out=nan(size(A,3),size(row_col_idx,1));
tic;
for k=1:size(A,3)
    temp=A(:,:,k);
    out(k,:)=temp(idx);
end
time1 = toc;
%% More efficient method:
out2 = nan(size(A,3),size(row_col_idx,1));
tic;
for jj = 1:size(row_col_idx,1)
    out2(:,jj) = [A(row_col_idx(jj,1),row_col_idx(jj,2),:)];
end
time2 = toc;
fprintf('Time calculation 1: %d\n',time1);
fprintf('Time calculation 2: %d\n',time2);
Gives as output:
Time calculation 1: 1.954714e+01
Time calculation 2: 2.998120e-01

Find repeated elements occurring more than once

I have an array A as follows:
A = [7 7 10 10 10 15 1 1 15 15 7 16 17 1 18].';
How can I obtain all numbers which occur more than once in my array? In this example the answer should be 1 7 10 15.
Here's another approach, just for variety:
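[~, ind] = unique(A);    % index of one occurrence of each distinct value
result = A;
result(ind) = [];        % remove that one occurrence; only the repeats remain
result = unique(result); % distinct values among the repeats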
I solved it by using the following code:
[ii,jj,kk]=unique(A);
repeated=ii(histc(kk,1:numel(ii))>1);
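The counting step can also be written with accumarray, which may be preferable because histc is flagged as not recommended in recent MATLAB releases. This is a minimal sketch on the example from the question, a swapped-in variant rather than the code posted above; vals, grp and counts are illustrative names.

```matlab
A = [7 7 10 10 10 15 1 1 15 15 7 16 17 1 18].';

% vals: sorted distinct values; grp: for each element of A, its index in vals
[vals, ~, grp] = unique(A);

% Count how many times each distinct value occurs
counts = accumarray(grp(:), 1);

% Keep the values that occur more than once
repeated = vals(counts > 1);
```

For this A, repeated should contain 1, 7, 10 and 15.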

MATLAB: how to rank a 2D array and store the ranks in another 2D array?

I am looking for a simple algorithm to rank the elements of my 2D array and store their ranks in a 2D array of the same size.
For example, I have the matrix below:
[0 2 15 34;
0 15 21 24;
0 3 5 8;
1 14 23 29]
The output should be as follows:
[1 5 10 16;
1 10 12 14;
1 6 7 8;
4 9 13 15]
I am fairly new to MATLAB and am not sure whether it has functionality to do this directly. It would be even better if you could suggest some ideas for implementing the algorithm. Thank you very much!
If I understand correctly, you want to replace each element by its rank. I offer three ways to do it; the third seems to be what you want.
Let your example data be defined as
data = [0 2 15 34;
0 15 21 24;
0 3 5 8;
1 14 23 29];
This assigns equal ranks to equal data values (as in your example), but doesn't skip ranks in that case (your example seems to do so):
[~, ~, vv] = unique(data(:));
result = reshape(vv, size(data));
With your example data, this gives
result =
1 3 8 13
1 8 9 11
1 4 5 6
2 7 10 12
This assigns different ranks to equal data values (so skipping ranks is out of the question):
[~, vv] = sort(data(:));
[~, vv] = sort(vv);
result = reshape(vv, size(data));
With your example data,
result =
1 5 11 16
2 10 12 14
3 6 7 8
4 9 13 15
This assigns equal ranks to equal data values, and in that case it skips ranks:
[~, vv] = sort(data(:));
[~, vv] = sort(vv);
[~, jj, kk] = unique(data(:), 'first');
result = reshape(vv(jj(kk)), size(data));
With your example data,
result =
1 5 10 16
1 10 12 14
1 6 7 8
4 9 13 15
Another approach, single-line: for each entry, find how many other entries are smaller, and add 1:
result = reshape(sum(bsxfun(@lt,data(:),data(:).'))+1, size(data));
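The single-line approach can be unpacked step by step on the example data. This is a sketch only; the names v, smaller and expected are introduced here for illustration. Note that the comparison matrix has numel(data)^2 entries, so this is presumably practical only for moderately sized inputs.

```matlab
data = [0 2 15 34;
        0 15 21 24;
        0 3 5 8;
        1 14 23 29];

v = data(:);                       % all entries as a column vector

% smaller(i,j) is true when entry i is strictly smaller than entry j
smaller = bsxfun(@lt, v, v.');

% Rank of entry j = number of strictly smaller entries + 1,
% so ties share a rank and the following ranks are skipped
result = reshape(sum(smaller, 1) + 1, size(data));

% Desired output from the question
expected = [1 5 10 16;
            1 10 12 14;
            1 6 7 8;
            4 9 13 15];
matches = isequal(result, expected);
```

Here matches should be true.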

Assigning a single value to all cells within a specified time period, matrix format

I have the following example dataset, which consists of the # of fish caught per check of a net. The nets are not checked at uniform intervals. The day of the check is given in Julian days, along with the number of days the net had been fishing since it was last checked (or since its deployment in the case of the first check).
http://textuploader.com/9ybp
Site_Number Check_Day_Julian Set_Duration_Days Fish_Caught
2 5 3 100
2 10 5 70
2 12 2 65
2 15 3 22
100 4 3 45
100 10 6 20
100 18 8 8
450 10 10 10
450 14 4 4
In any case, I would like to turn the raw data above into the following format:
http://textuploader.com/9y3t
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
2 0 0 100 100 100 70 70 70 70 70 65 65 22 22 22 0 0 0
100 0 45 45 45 20 20 20 20 20 20 8 8 8 8 8 8 8 8
450 10 10 10 10 10 10 10 10 10 10 4 4 4 4 0 0 0 0
This is a matrix which assigns the # of fish caught during the period to EACH of the days that were within that period. The columns of the matrix are Julian days, the rows are site numbers.
I have tried to do this with some matrix functions, but I have had much difficulty populating all the days that fall within a time period but for which I do not have a row of data.
I had posted my small bit of code here, but upon reflection, my approach is quite archaic and a bit off point. Can anyone suggest a method to convert the data into the matrix provided? I've been scratching my head and googling all day but now I am stumped.
Cheers,
C
Two answers; the second one is faster but a bit low-level.
Solution #1:
library(IRanges)
with(d, {
    ir <- IRanges(end=Check_Day_Julian, width=Set_Duration_Days)
    cov <- coverage(split(ir, Site_Number),
                    weight=split(Fish_Caught, Site_Number),
                    width=max(end(ir)))
    do.call(rbind, lapply(cov, as.vector))
})
Solution #2:
with(d, {
    ir <- IRanges(end=Check_Day_Julian, width=Set_Duration_Days)
    site <- factor(Site_Number, unique(Site_Number))
    m <- matrix(0, length(levels(site)), max(end(ir)))
    ind <- cbind(rep(site, width(ir)), as.integer(ir))
    m[ind] <- rep(Fish_Caught, width(ir))
    m
})
I don't see a super obvious matrix transformation here. This is all I've got, assuming the raw data is in a data.frame called dd:
dd$Site_Number<-factor(dd$Site_Number)
mm<-matrix(0, nrow=nlevels(dd$Site_Number), ncol=18)
for(i in 1:nrow(dd)) {
    # a net checked on day d after fishing for n days covers days (d-n+1):d
    mm[as.numeric(dd[i,1]), (dd[i,2]-dd[i,3]+1):dd[i,2] ] <- dd[i,4]
}
mm

Finding the average of parameters controlled by other indices

I need some help with this problem.
I have this matrix in MATLAB:
A = [ 25 1.2 1
28 1.2 2
17 2.6 1
18 2.6 2
23 1.2 1
29 1.2 2
19 15 1
22 15 2
24 2.6 1
26 2.6 2];
The 1st column contains measured temperature values.
The 2nd column is an index code representing the color (1.2: red, ... etc.).
The 3rd column is the hour at which the sample was taken; only hours 1 and 2 occur.
I want the matrix to be grouped by the 2nd column as follows:
if the index is 1.2, the program finds the average of all temperatures at hour 1 that
correspond to 1.2.
So, here ( 25 + 23 )/2 = 24
It also finds the average of all temperatures at hour 2 that correspond
to 1.2, ( 28 + 29 )/2 = 28.5
and these average values:
[24
28.5]
will replace all temperature values at hours 1 and 2
that correspond to 1.2.
Then, it does the same thing for indices 2.6 and 15.
So, the desired output will be:
B = [ 24
28.5
15.5
22
24
28.5
19
22
15.5
22]
My problem is with the loop: I can only handle one index per run.
For example,
T=[];
index=1.2;
for i=1:length(A)
    if A(i,2)==index
        T=[T A(i,1)];
    else
        T=[T 0];
    end
end
So T holds the temperatures that correspond to 1.2, and the other entries are zeros.
Then I wrote long code to find the averages, and in the end I obtained the matrix
that corresponds to ONLY the index 1.2:
B = [24
28.5
0
0
24
28.5
0
0
0
0]
But this is only for one index, and it assigns zeros to the other indices. I can do this for all
indices in separate runs and then add the B's, but this will take a very long time since my real
matrix is 8760 by 5.
I am sure that there is a shorter way to do that.
Thanks
Regards
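Try this:
B = zeros(size(A, 1), 1);
C = unique(A(:, 2))';   % distinct index codes, as a row so the loop visits each one
T = [1 2];              % hours
for c = C,
    for t = T,
        % rows with this index code and this hour
        I1 = find((A(:, 2) == c) & (A(:, 3) == t));
        B(I1) = mean(A(I1, 1));
    end
end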
Edit
I think your expected answer is wrong for c = 2.6 and t = 1... Shouldn't it be (17 + 24)/2 = 20.5?
This can be done, perhaps more neatly, with accumarray:
[~, ~, ii] = unique(A(:,2)); %// indices corresponding to second col values
ind = [ii A(:,3)]; %// build 2D-indices for accumarray
averages = accumarray(ind, A(:,1), [], @mean); %// desired averages of first col
result = averages(sub2ind(max(ind), ind(:,1), ind(:,2))); %// repeat averages
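As a worked check, the accumarray approach can be run on the matrix from the question. This is a minimal sketch with the intermediate values written out in comments; the names B and expectedB are illustrative. Note that it yields 20.5 for index 2.6 at hour 1, in line with the remark above that (17 + 24)/2 = 20.5, rather than the 15.5 listed in the question.

```matlab
A = [25 1.2 1; 28 1.2 2; 17 2.6 1; 18 2.6 2; 23 1.2 1;
     29 1.2 2; 19 15  1; 22 15  2; 24 2.6 1; 26 2.6 2];

% ii maps each row to its index code: 1.2 -> 1, 2.6 -> 2, 15 -> 3
[~, ~, ii] = unique(A(:,2));
ind = [ii(:) A(:,3)];               % (code, hour) subscripts per row

% averages(code, hour) = mean temperature of that group
% rows: 1.2, 2.6, 15; columns: hour 1, hour 2
averages = accumarray(ind, A(:,1), [], @mean);  % [24 28.5; 20.5 22; 19 22]

% Look up the group average for every original row
B = averages(sub2ind(size(averages), ind(:,1), ind(:,2)));

expectedB = [24; 28.5; 20.5; 22; 24; 28.5; 19; 22; 20.5; 22];
maxErr = max(abs(B - expectedB));
```

Here maxErr should be zero (or negligibly small).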
