Let's say we have an array x. We can find the maximum value of this array as follows:
maximum = max(x);
If I have two arrays, let's say x and y, I can find the array that contains the maximum value by using the command
maximum_array = max(x, y);
Let's say that this array is y. Then, I can find the maximum value by using the max command with argument y, as before with x:
maximum_value = max(y);
This two-step procedure could be performed with the following compact, one-liner command:
maximum_value = max(max(x, y));
But what happens when we have more than 2 arrays? As far as I know, the max function does not allow comparing more than two arrays at once. Therefore, I have to use max on pairs of arrays, and then find the max among the intermediate results (which also involves additional variables). Of course, if I have, let's say, 50 arrays, this would be - and it really is - a tedious process.
Is there a more efficient approach?
Approach #1
Concatenate column-vector versions of the arrays along dim-2 with cat, then take the maximum values along dim-2 with max.
Thus, assuming x, y and z to be the input arrays, do something like this -
%// Reshape all arrays to column vectors with (:) and then use cat
M = cat(2,x(:),y(:),z(:))
%// Use max along dim-2 with `max(..,[],2)` to get column vector
%// version and then reshape back to the shape of input arrays
max_array = reshape(max(M,[],2),size(x))
Approach #2
You can use ndims to find the number of dimensions of the input arrays, concatenate along the next dimension (ndims plus 1), and finally take max along that dimension to get the array of maximum values. This avoids all of that reshaping back and forth, so it could be more efficient and more compact as well -
ndimsp1 = ndims(x)+1 %// no. of dimensions plus 1
maxarr = max(cat(ndimsp1,x,y,z),[],ndimsp1) %// concatenate and find max
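As a quick check of Approach #2, here is a minimal sketch on three small 2x2 matrices. The values are made up for illustration, and the last line assumes that a single overall maximum is what is ultimately wanted.

```matlab
% Made-up example inputs, all of the same size
x = [1 5; 3 2];
y = [4 0; 6 1];
z = [2 7; 0 9];

% Concatenate along the first unused dimension and take max along it
d = ndims(x) + 1;
maxarr = max(cat(d, x, y, z), [], d);   % element-wise maximum, same size as x

% Reduce to one scalar if the overall maximum is needed
overallMax = max(maxarr(:));
```

Here maxarr holds the largest value found at each position across the three inputs, and overallMax is the largest value in any of them.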
I think the easiest approach for a small set of arrays is to column-ify and concatenate:
maxValue = max([x(:);y(:)]);
For a large number of arrays in some data structure (e.g. a cell array or a struct), a simple loop would be best:
maxValue = max(cellOfMats{1}(:));
for k = 2:length(cellOfMats)
maxValue = max([maxValue;cellOfMats{k}(:)]);
end
For the pathological case of a large number of separate arrays with differing names, I say "don't do that" and put them in a data structure or use eval with a loop.
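For arrays that are already stored in a cell array, the loop above can also be written with cellfun. This is only a sketch with made-up data: cellOfMats is the same hypothetical name used above, and the arrays are assumed to be non-empty (they may differ in size).

```matlab
% Made-up cell array holding arrays of different sizes
cellOfMats = {[1 2; 3 4], [10 -1], [5; 6; 7]};

% Maximum of each array, then the maximum of those maxima
perArrayMax = cellfun(@(m) max(m(:)), cellOfMats);
maxValue = max(perArrayMax);
```

cellfun still loops internally, so this is mostly a matter of compactness rather than speed.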
I have array A (44x1) and B (41x1), and I want to count for both arrays how many times the elements are repeated. And if the repeated values are present in both arrays, I want their counting to be divided (for instance: value 0.5 appears 500 times in A and 350 times in B, so now divide 500 by 350).
I have to do this for bigger arrays as well, so I was thinking about using a loop (but I have no idea how to do it in MATLAB).
I got what I want in Python:
import pandas as pd
data1 = pd.read_excel('C:/Users/Desktop/Python/data1.xlsx')
data2 = pd.read_excel('C:/Users/Desktop/Python/data2.xlsx')
for i in data1['Mag'].value_counts() & data2['Mag'].value_counts():
    a = data1['Mag'].value_counts()/data2['Mag'].value_counts()
    print(a)
    break
Any idea of how to do the same in MATLAB? Thanks!
Since you can enumerate all valid earthquake magnitude values, you could use:
% Make up some data
A=randi([2 58],[100 1])/10;
B=randi([2 58],[20 1])/10;
% Round data to nearest tenth
%A=round(A,1); %uncomment if necessary
%B=round(B,1); %same
% Divide frequencies
validmags=0.2:0.1:5.8;
Afreqs=sum(double( abs(A-validmags)<1e-6 ),1); %relies on implicit expansion; A must be a column vector and validmags must be a row vector; dimension argument to sum() only to remind user; double() not really needed
Bfreqs=sum(double( abs(B-validmags)<1e-6 ),1); %same
Bfreqs./Afreqs, %for a fancier version: [{'Magnitude'} num2cell(validmags) ; {'Freq(B)/Freq(A)'} num2cell(Bfreqs./Afreqs)].'
The last line will produce NaN for 0/0, +Inf for nn/0, and 0 for 0/nn.
You could also use uniquetol, align the unique values of each vector, and divide the respective absolute frequencies. But I think the above approach is cleaner and easier to understand.
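The alternative mentioned above (aligning the unique values of each vector and dividing their counts) could look roughly like the sketch below. It uses unique on values rounded to tenths instead of uniquetol, which assumes the magnitudes have one decimal place, and it divides the count in A by the count in B as described in the question. The sample data are made up.

```matlab
% Made-up column vectors of magnitudes
A = [0.5; 0.5; 1.2; 0.5; 3.0];
B = [0.5; 1.2; 1.2; 4.1];

% Convert to integer keys (tenths) to avoid floating-point comparison issues
Ak = round(A*10);
Bk = round(B*10);

% Unique keys and how often each one occurs
[ua, ~, ia] = unique(Ak);  countA = accumarray(ia, 1);
[ub, ~, ib] = unique(Bk);  countB = accumarray(ib, 1);

% Keep only the values present in both vectors and divide their counts
[common, ja, jb] = intersect(ua, ub);
ratio = countA(ja) ./ countB(jb);

result = [common/10, ratio];   % column 1: value, column 2: count in A / count in B
```

Values that occur in only one of the two vectors are dropped, so no Inf or NaN entries appear in the result.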
I have a cell array A Mx3 in size where each entry contains a further cell-array Nx1 in size, for example when M=9 and N=5:
All data contained within any given cell array is in vector format and of equal length. For example, A{1,1} contains 5 vectors 1x93 in size whilst A{1,2} contains 5 vectors 1x100 in size:
I wish to carry out this procedure on each of the 27 cells:
B = transpose(cell2mat(A{1,1}));
B = sort(B);
C = std(B,0,2); %Calculate standard deviation
Ultimately, the desired outcome would be, for the above example, 27 columns (9x3) containing the standard deviation results (padded with 0 or NaNs to handle differing lengths) printed in the order A{1,1}, A{1,2}, A{1,3}, A{2,1}, A{2,2}, A{2,3} and so forth.
I can do this by wrapping the above code in a loop that iterates over each of the 27 cells in the correct order. However, I was wondering whether there is a clever cellfun call or a more succinct method to accomplish this, particularly without the use of a loop?
You should probably realize that cellfun is essentially a glorified for loop over cells. There's simply extra error checking and all that to ensure that the whole thing works. In any case, yes it's possible to do what you're asking in a single cellfun call. Note that I am simply going to apply the same logic as you would have in a for loop with cellfun. Also note that because you're using cell arrays, you have no choice but to iterate over the entire master cell array. However, what you'll want to do is pad each resulting column vector in each output in the final cell array so that they all share the same length. We can do that with another two cellfun calls - one to determine the largest vector length and another to perform the padding operation.
Something like this could work:
% Step #1 - Sort the vectors in each cell array, then find row-wise std
B = cellfun(@(x) std(sort(cell2mat(x).'), 0, 2), A, 'un', 0);
% Step #2 - Determine the largest length vector and pad
sizes = cellfun(@numel, B);
B = cellfun(@(x) [x; nan(max(sizes(:)) - numel(x), 1)], B, 'un', 0);
The first line of code takes each element of A and converts it into a matrix with one column per vector (i.e. cell2mat(x).', which is 93 x 5 for A{1,1}), sorts each column individually with sort, and then takes the standard deviation row-wise. Because each output is a vector rather than a scalar, the 'UniformOutput' flag must be 0 (abbreviated here as 'un', 0). Once the standard deviations are computed, we count the elements of each resulting column vector, find the largest count, and use another cellfun call to pad the vectors with NaN so that they all have the same length.
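Since cellfun is essentially a loop over cells, the same two steps can be written as explicit loops. The sketch below uses a tiny made-up 1x2 cell array in place of the real 9x3 one, purely to make the intermediate results easy to follow.

```matlab
% Made-up input: each entry is an Nx1 cell of equal-length row vectors
A = { {[1 2 3]; [3 2 1]}, {[1 1]; [3 5]} };

% Step 1: stack, transpose, sort each column, then row-wise standard deviation
B = cell(size(A));
for k = 1:numel(A)
    B{k} = std(sort(cell2mat(A{k}).'), 0, 2);
end

% Step 2: pad every result with NaN up to the longest length
maxLen = max(cellfun(@numel, B(:)));
for k = 1:numel(B)
    B{k} = [B{k}; nan(maxLen - numel(B{k}), 1)];
end
```

After the loops, every entry of B is a column vector of length maxLen, with NaN filling the positions that the shorter results did not have.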
To get your desired output, you need to transpose the cell array and then unroll its elements in column-major order. MATLAB accesses arrays in column-major order, so a common trick for a row-major readout (which is what you want) is to transpose first and then unroll in column-major fashion. To do this in one line, transpose the cell array, use reshape to lay the cells out as a single row in that order, and then call cell2mat to piece the vectors together. The final result should be a 27-column matrix with the vectors placed side by side in row-major order:
C = cell2mat(reshape(B.', 1, []));
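To see the effect of the transpose, here is a small sketch with a made-up 2x2 cell array of already padded column vectors. It compares the row-major readout used above with a plain column-major readout.

```matlab
% Made-up 2x2 cell array; every cell holds a column vector of the same length
B = {[1; 2], [3; 4]; ...
     [5; 6], [7; 8]};

% Row-major readout: B{1,1}, B{1,2}, B{2,1}, B{2,2}
C = cell2mat(reshape(B.', 1, []));

% Column-major readout for comparison: B{1,1}, B{2,1}, B{1,2}, B{2,2}
Ccol = cell2mat(reshape(B, 1, []));
```

With the transpose the columns of C follow the order requested in the question; without it the second and third columns are swapped.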
I'm quite new to MATLAB and this problem is really driving me insane:
I have a huge array with 2 columns and about 31,000 rows. One of the two columns holds a spatial coordinate on a grid, the other a dependent parameter. What I want to do is the following:
I. I need to split the array into smaller parts defined by the spatial column; let's say the spatial coordinate are ranging from 0 to 500 - I now want arrays that give me the two column values for spatial coordinate 0-10, then 10-20 and so on. This would result in 50 arrays of unequal size that cover a spatial range from 0 to 500.
II. Secondly, I would need to calculate the average values of the resulting columns of every single array so that I obtain per array one 2-dimensional point.
III. Thirdly, I could plot these points and I would be super happy.
Sadly, I'm super confused since I miserably fail at step I. - Maybe there is even an easier way than to split the giant array in so many small arrays - who knows..
I would be really really happy for any suggestion.
Thank you,
Arne
First of all, since you want a collection of arrays of different sizes, you will need to place them in a cell array. You could try something like this:
res = arrayfun(@(x)arr(arr(:,1)==x,:), unique(arr(:,1)), 'UniformOutput', 0);
The code above returns a cell array in which the array is split according to its first column. Here @(x)arr(arr(:,1)==x,:) is an anonymous function of x, and arrayfun(function, ..., 'UniformOutput', 0) applies that function to each element of the following arguments (taking one value from each argument per evaluation). Note that arr must be numeric; if it is not, you should map your values to numeric values or select them in another way.
In the same way you could do
uo = 'UniformOutput';
res = arrayfun(@(x){arr(arr(:,1)==x,:), mean(arr(arr(:,1)==x,2))}, unique(arr(:,1)), uo, 0);
You will probably want to flatten the returned value. Check the function cat; you could do:
res = cat(1,res{:})
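To illustrate what these calls produce, here is a minimal sketch on a made-up five-row array. Note that it groups rows by exact values of the first column, as the code above does, rather than by ranges of values.

```matlab
% Made-up data: column 1 is the grouping value, column 2 the dependent parameter
arr = [1 10; 2 20; 1 30; 2 40; 3 50];
keys = unique(arr(:,1));

% One 1x2 cell per key: {rows belonging to that key, mean of their second column}
res = arrayfun(@(x) {arr(arr(:,1)==x,:), mean(arr(arr(:,1)==x,2))}, ...
               keys, 'UniformOutput', 0);

% Flatten the nested cells into a single numel(keys)-by-2 cell array
res = cat(1, res{:});

% Collect the per-group means into a numeric row vector
groupMeans = [res{:,2}];
```

After flattening, res{k,1} holds the rows for the k-th unique value and res{k,2} the corresponding mean.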
How to plot your data depends on their format, so I can't help without knowing what the data look like, but you could try plotting inside a loop over your 'res' variable or something similar.
Step I indeed comes with some difficulties. Once these are solved, I guess steps II and III can easily be solved. Let me make some suggestions for step I:
You first define the maximum value (maxValue = 500;) and the step size (stepSize = 10;). Now it is possible to iterate through all steps and create your new vectors.
for k=1:maxValue/stepSize
...
end
As every resulting array will have different dimensions, I suggest you save the vectors in a cell array:
Y = cell(maxValue/stepSize,1);
Use the find function to find the rows of the entries for each matrix. At each step k, the range of values of interest will be (k-1)*stepSize to k*stepSize.
row = find( (k-1)*stepSize <= X(:,1) & X(:,1) < k*stepSize );
You can now create the matrix for step k by
Y{k,1} = X(row,:);
Putting everything together you should be able to create the cell array Y containing your matrices and continue with the other tasks. You could also save the average of each value range in a second column of the cell array Y:
Y{k,2} = mean( Y{k,1}(:,2) );
I hope this helps you with your task. Note that these are only suggestions and there may be different (maybe more appropriate) ways to handle this.
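Put together, the pieces above might look like the following sketch. The data, maxValue and stepSize are made up and much smaller than in the question. The accumarray call at the end is an alternative that was not part of the answer, included only as a cross-check; it assumes all coordinates lie in [0, maxValue).

```matlab
% Made-up data: column 1 is the spatial coordinate, column 2 the parameter
X = [1 10; 5 20; 12 30; 18 50; 25 70];
maxValue = 30;
stepSize = 10;
nBins = maxValue/stepSize;

% Loop version: collect the rows of each bin and the mean of their second column
Y = cell(nBins, 2);
for k = 1:nBins
    row = find((k-1)*stepSize <= X(:,1) & X(:,1) < k*stepSize);
    Y{k,1} = X(row,:);
    Y{k,2} = mean(Y{k,1}(:,2));
end
loopMeans = [Y{:,2}].';

% Alternative without an explicit loop: compute a bin index per row, then average
bin = floor(X(:,1)/stepSize) + 1;
accMeans = accumarray(bin, X(:,2), [nBins 1], @mean);

% Bin centres, which could be used for plotting, e.g. plot(binCenters, loopMeans, 'o-')
binCenters = ((1:nBins).' - 0.5) * stepSize;
```

One difference to keep in mind: for a bin with no rows, the loop yields NaN, whereas accumarray fills in 0 by default.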
I have two arrays within a <1x2 cell>. I want to permute those arrays. Of course, I could use a loop to permute each one, but is there any way to do that task at once, without using loops?
Example:
>> whos('M')
Name Size Bytes Class Attributes
M 1x2 9624 cell
>> permute(M,p_matrix)
This does not permute the contents of the two arrays within M.
I could use something like:
>> for k=1:size(M,2), M{k} = permute(M{k},p_matrix); end
but I'd prefer not to use loops.
Thanks.
This seems to work -
num_cells = numel(M) %// Number of cells in input cell array
size_cell = size(M{1}) %// Get sizes
%// Get size of the numeric array that will hold all of the data from the
%// input cell array with the second dimension representing the index of
%// each cell from the input cell array
size_num_arr = [size_cell(1) num_cells size_cell(2:end)]
%// Dimensions array for permuting with the numeric array holding all data
perm_dim = [1 3:numel(size_cell)+1 2]
%// Store data from input M into a vertically concatenated numeric array
num_array = vertcat(M{:})
%// Reshape and permute the numeric array such that the index to be used
%// for indexing data from different cells ends up as the final dimension
num_array = permute(reshape(num_array,size_num_arr),perm_dim)
num_array = permute(num_array,[p_matrix numel(size_cell)+1])
%// Save the numeric array as a cell array with each block from
%// thus obtained numeric array from its first to the second last dimension
%// forming each cell
size_num_arr2 = size(num_array)
size_num_arr2c = num2cell(size_num_arr2(1:end-1))
M = squeeze(mat2cell(num_array,size_num_arr2c{:},ones(1,num_cells)))
Some quick tests show that mat2cell would prove to be the bottleneck, so if you don't mind indexing into the intermediate numeric array variable num_array and using its last dimension as the equivalent of indexing into M, then this approach could be useful.
Now, another approach if you would like to preserve the cell format would be with arrayfun, assuming each cell of M to be a 4D numeric array -
M = arrayfun(@(x) num_array(:,:,:,:,x),1:num_cells,'Uniform',0)
This seems to perform much better than mat2cell.
Please note that arrayfun isn't a vectorized solution, as it most certainly uses loops behind the scenes, and mat2cell seems to use for loops in its source code as well, so please keep these issues in mind.
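If the goal is mainly to avoid writing the loop by hand, a plain cellfun call is another option. It is not part of the approach above and, like arrayfun, it still loops internally, so treat it as a compact equivalent of the loop rather than a vectorized solution. The data and the permutation order below are made up.

```matlab
% Made-up 1x2 cell array holding two 2x3 matrices
M = {reshape(1:6, [2 3]), reshape(7:12, [2 3])};
p_matrix = [2 1];   % example permutation order: swap rows and columns

% Apply permute to the contents of every cell
M = cellfun(@(a) permute(a, p_matrix), M, 'UniformOutput', false);
```

Unlike the concatenation-based approach, this does not require the arrays in the cells to have the same size.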
I have a function HermitePCECoefficients which takes as inputs multiple parameters, including a column vector y, and outputs a column vector Coefficients, the same length as y:
Coefficients=HermitePCECoefficients(grid,weights,indices,y,mu,sigma,normalized)
Suppose now that y is not a column vector, but a 2D array (matrix), and I want to run HermitePCECoefficients on each of its columns, storing the corresponding outputs in an array. Doing it with a for loop is simple and clear, but it takes forever:
Coefficients=zeros(size(y));
for i=1:size(y,2)
Coefficients(:,i)=HermitePCECoefficients(grid,weights,indices,y(:,i),mu,sigma,normalized);
end
Thus, I put bsxfun to the job. Since bsxfun only works with binary functions, I created a "dummy" binary function f, which is really only a function of a single argument:
f=@(a,b) HermitePCECoefficients(grid,weights,indices,a,mu,sigma,normalized);
Then used bsxfun this way:
Coefficients=bsxfun(f,y,omega_f);
This works fine, and it's much faster than the for loop (don't worry about omega_f, it's just a vector whose length corresponds to the number of columns in y).
Question 1: do you think this is the right way to use bsxfun in this context?
Question 2: maybe a better solution would be to directly modify HermitePCECoefficients, so that it could take a generic array y as input. Inside the function, this is the only line which requires y to be a column vector:
Coefficients(i)=dot(weights,y.*Psi)/norm;
weights and Psi are two column vectors, so if I pass an array y, MATLAB complains. Any suggestions on how to modify it?
Option 2 seems better (but only testing will tell). Just replace
dot(weights,y.*Psi)/norm
by
sum(bsxfun(@times, weights.*Psi, y)/norm)
or (probably faster)
(weights.*Psi).'*y / norm
Either of the above is equivalent to computing the vector [ dot(weights,y(:,1).*Psi)/norm, dot(weights,y(:,2).*Psi)/norm, ... ] for an arbitrary number of columns of y. Each entry of this vector is the result for a column of y.
You could use repmat on weights and Psi to replicate the vectors across the columns of y:
nc = size(y,2);
Coefficients = dot(repmat(weights,1,nc), y.*repmat(Psi,1,nc))/norm;
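The vectorized expressions suggested in the answers above can be checked against the column-by-column dot product on a small made-up example. In this sketch nrm stands in for the variable called norm in the original function (renamed here only to avoid shadowing the built-in norm), and all numbers are invented.

```matlab
% Made-up inputs: two column vectors and a matrix y with one case per column
weights = [0.2; 0.3; 0.5];
Psi     = [1; 2; 3];
y       = [1 4; 2 5; 3 6];
nrm     = 2;   % hypothetical normalisation constant

% Reference: the original expression applied to one column at a time
ref = zeros(1, size(y,2));
for j = 1:size(y,2)
    ref(j) = dot(weights, y(:,j).*Psi)/nrm;
end

% Vectorized alternatives, each giving one value per column of y
v1 = sum(bsxfun(@times, weights.*Psi, y)/nrm, 1);
v2 = (weights.*Psi).'*y / nrm;
nc = size(y,2);
v3 = dot(repmat(weights,1,nc), y.*repmat(Psi,1,nc))/nrm;
```

All three return a row vector, so inside the function the assignment would presumably need to target a whole row, e.g. Coefficients(i,:) instead of Coefficients(i).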