Why, if MATLAB is column-major, do some functions output row vectors? - arrays

MATLAB is well-known for being column-major. Consequently, manipulating entries of an array that are in the same column is faster than manipulating entries that are on the same row.
In that case, why do so many built-in functions, such as linspace and logspace, output row vectors rather than column vectors? This seems to me like a de-optimization...
What, if any, is the rationale behind this design decision?

It is a good question. Here are some ideas...
My first thought was that in terms of performance and contiguous memory, it doesn't make a difference if it's a row or a column -- they are both contiguous in memory. For a multidimensional (>1D) array, it is correct that it is more efficient to index a whole column of the array (e.g. v(:,2)) rather than a row (e.g. v(2,:)) or other dimension because in the row (non-column) case it is not accessing elements that are contiguous in memory. However, for a row vector that is 1-by-N, the elements are contiguous because there is only one row, so it doesn't make a difference.
Second, it is simply easier to display row vectors in the Command Window, especially since it wraps the rows of long arrays. With a long column vector, you will be forced to scroll for much shorter arrays.
More thoughts...
Perhaps row vector output from linspace and logspace is just to be consistent with the fact that colon (essentially a tool for creating linearly spaced elements) makes a row:
>> 0:2:16
ans =
0 2 4 6 8 10 12 14 16
The choice was made at the beginning of time and that was that (maybe?).
Also, the convention for loop variables could be important. A row is necessary to define multiple iterations:
>> for k=1:5, k, end
k =
1
k =
2
k =
3
k =
4
k =
5
A column will be a single iteration with a non-scalar loop variable:
>> for k=(1:5)', k, end
k =
1
2
3
4
5
And maybe the outputs of linspace and logspace are commonly looped over. Maybe? :)
But, why loop over a row vector anyway? Well, as I say in my comments, it's not that a row vector is used for loops, it's that it loops through the columns of the loop expression. Meaning, with for v=M where M is a 2-by-3 matrix, there are 3 iterations, where v is a 2 element column vector in each iteration. This is actually a good design if you consider that this involves slicing the loop expression into columns (i.e. chunks of contiguous memory!).

Related

Fastest method of computing every possible combination of four array choices?

I am working with four MATLAB arrays of size 169x14, 207x14, 94x14, and 108x14. I would like to produce a single array which has the linear addition of every possible row combination of the four arrays. For example, one such combination may be the 99th row of array1, the 72nd row of array2, 6th row of array3, and 27th row of array 4 added together as a single row. These arrays are named helm, chest, arm, leg - this is for a stat calculator of a video game.
My first attempt at this was the following:
for i = 1:length(lin_helm)
for k = 1:length(lin_arm)
for j = 1:length(lin_leg)
for g = 1:length(lin_leg)
armor_comb = [armor_comb;
i j k g helm_array(i,2:15)+chest_array(j,2:15)+arm_array(k,2:15)+leg_array(g,2:15)];
end
end
end
end
Which uses nested for loops for each array and simply adds the rows together (note that 'lin_X' are just numbered vectors for the row number and the rows of the array are 2:15 because the first column is a row iterator). The first four columns of this result array can be ignored, they are just denoting which rows were selected from the other arrays. To say the least, this is extremely slow.
I then tried omitting the last for loop to instead take the first three selections and add them as an entire matrix to the entire last array. This was done by taking the addition of the first three row selections and using a matrix of ones. I chose to do this for the largest array, chest, to save the most time.
for i = 1:length(lin_helm)
for k = 1:length(lin_arm)
for j = 1:length(lin_leg)
armor_comb = [armor_comb;
i*ones(length(lin_chest),1) j*ones(length(lin_chest),1) k*ones(length(lin_chest),1) lin_chest' ones(length(lin_chest),14).*[helm_array(i,2:15)+leg_array(j,2:15)+arm_array(k,2:15)]+chest_array(:,2:15)];
end
end
end
This was significantly faster, but still extremely slow compared to the total array size needed.
I am not sure how to make this process faster by using matrix math. To generalize my issue, I am trying to find the numerical array of all possible row additions of an AxN, BxN, CxN, and DxN where any given selection takes one row from each array with no repeats.
All online documentation I can find just says to use nested for loops because they assume your array sizes are small. This is unpractical for my application, so I am seeking help on how to use matrices (or another method) to speed up computation time.
For making indexes (the first columns of your final matrix), you can try something like this:
function i=indexes(i1, i2)
i=[kron(i1, ones(size(i2, 1), 1)) kron(ones(size(i1, 1), 1), i2)];
end
If a and b are column vectors of indexes 1, 2, ..., then indexes(a, b) will be the pairs of index combos, and you can repeat for additional indexing columns, e.g., indexes(indexes(a, b), c).
If you have the indexes, say ii, you can add up what you want with something like
array1(ii(:, 1), 2:15) + array2(ii(:, 2), 2:15)
Prepend with ii if you really need to.
This will be much faster than a naive loop like you have initially. E.g., on my somewhat old Matlab, this:
n=10;
a=(1:2*n)';
b=(1:3*n)';
c=(1:5*n)';
tic
ii=indexes(indexes(a,b),c);
toc
tic
jj=[];
k=1;
for i1=1:length(a)
for i2=1:length(b)
for i3=1:length(c)
jj(k, :)=[i1 i2 i3];
k=k+1;
end
end
end
toc
gives
Elapsed time is 0.003514 seconds.
Elapsed time is 0.754066 seconds.
If you pre-allocate the storage for the loop case like jj=zeros(size(ii));, that's also significantly faster, though still slower than the kron-based approach, like with n=100:
Elapsed time is 3.323197 seconds.
Elapsed time is 9.825276 seconds.

Calculating standard deviation on data stored in a cell array of cell arrays

I have a cell array A Mx3 in size where each entry contains a further cell-array Nx1 in size, for example when M=9 and N=5:
All data contained within in any given cell array is in vector format and of equal length. For example, A{1,1} contains 5 vectors 1x93 in size whilst A{1,2} contains 5 vectors 1x100 in size:
I wish to carry out this procedure on each of the 27 cells:
B = transpose(cell2mat(A{1,1}));
B = sort(B);
C = std(B,0,2); %Calculate standard deviation
Ultimately, the desired outcome would be, for the above example, 27 columns (9x3) containing the standard deviation results (padded with 0 or NaNs to handle differing lengths) printed in the order A{1,1}, A{1,2}, A{1,3}, A{2,1}, A{2,2}, A{2,3} and so forth.
I can do this by wrapping the above code into a loop to iterate over each one of the 27 cells in the correct order however, I was wondering if there was a clever cellfun or more succinct method to accomplish this particularly without the use of a loop?
You should probably realize that cellfun is essentially a glorified for loop over cells. There's simply extra error checking and all that to ensure that the whole thing works. In any case, yes it's possible to do what you're asking in a single cellfun call. Note that I am simply going to apply the same logic as you would have in a for loop with cellfun. Also note that because you're using cell arrays, you have no choice but to iterate over the entire master cell array. However, what you'll want to do is pad each resulting column vector in each output in the final cell array so that they all share the same length. We can do that with another two cellfun calls - one to determine the largest vector length and another to perform the padding operation.
Something like this could work:
% Step #1 - Sort the vectors in each cell array, then find row-wise std
B = cellfun(#(x) std(sort(cell2mat(x).'), 0, 2), A, 'un', 0);
% Step #2 - Determine the largest length vector and pad
sizes = cellfun(#numel, B);
B = cellfun(#(x) [x; nan(max(sizes(:)) - numel(x), 1)], B, 'un', 0);
The first line of code takes each element in A, converts each cell element into a N x 5 column matrix (i.e. cell2mat(x).'), we then sort each column individually with sort, then take the standard deviation row-wise. Because the output is ultimately a vector, we must make sure that the 'UniformOutput' flag is 0, or 'un=0'. Once we complete the standard deviation calculation, we determine the total number of elements for each resulting column vector for all cell elements, determine the largest size then use another cellfun call to pad these vectors so they all match the same size.
To finally get your desired output, you need to transpose the cell array, then unroll the elements in column major order. Remember that MATLAB accesses things in column major, so a common trick to get things in row-major (what you want) as opposed to column major is to first transpose, then unroll in column-major fashion to perform a row-major readout. Doing this in one line is tricky, so you'll need to not only transpose the cell array, you must use reshape to ensure that the elements are read out in row major format, but then ensuring that the result is placed in a row of cells, then call cell2mat so you can piece these vectors together. The final result should be a 27 column matrix where we have pieced all of these vectors together in a single row-wise fashion:
C = cell2mat(reshape(B.', 1, []));

How to repeat calculation for every value in an array and store the vector of each result in a new array in MATLAB [duplicate]

I've just started using for loops in matlab in programming class and the basic stuff is doing me fine, However I've been asked to "Use loops to create a 3 x 5 matrix in which the value of each element is its row number to the power of its column number divided by the sum of its row number and column number for example the value of element (2,3) is (2^3 / 2+3) = 1.6
So what sort of looping do I need to use to enable me to start new lines to form a matrix?
Since you need to know the row and column numbers (and only because you have to use loops), for-loops are a natural choice. This is because a for-loop will automatically keep track of your row and column number for you if you set it up right. More specifically, you want a nested for loop, i.e. one for loop within another. The outer loop might loop through the rows and the inner loop through the columns for example.
As for starting new lines in a matrix, this is extremely bad practice to do in a loop. You should rather pre-allocate your matrix. This will have a major performance impact on your code. Pre-allocation is most commonly done using the zeros function.
e.g.
num_rows = 3;
num_cols = 5;
M = zeros(num_rows,num_cols); %// Preallocation of memory so you don't grow your matrix in your loop
for row = 1:num_rows
for col = 1:num_cols
M(row,col) = (row^col)/(row+col);
end
end
But the most efficient way to do it is probably not to use loops at all but do it in one shot using ndgrid:
[R, C] = ndgrid(1:num_rows, 1:num_cols);
M = (R.^C)./(R+C);
The command bsxfun is very helpful for such problems. It will do all the looping and preallocation for you.
eg:
bsxfun(#(x,y) x.^y./(x+y), (1:3)', 1:5)

Mapping a 2D array into 1D array with variable column width

I know mapping 2D array into 1D array has been asked many times, but I did not find a solution that would fit a where the column count varies.
So I want get a 1-dimensional index from this 2-dimensional array
Col> _0____1____2__
Row 0 |_0__|_1__|_2__|
V 1 |_3__|_4__|
2 |_5__|_6__|_7__|
3 |_8__|_9__|
4 |_10_|_11_|_12_|
5 |_13_|_14_|
The normal formula index = row * columns + column does not work, since after the 2nd row the index is out of place.
What is the correct formula here?
EDIT:
The specific issue is that I have a list of items in with the layout like in the grid, but a one dimensional array for the data. So while looping through the elements in the UI, I need to get the correct data, but can only get the row and column for that element. I need to find a way to turn a row/column value into an index for the data-array
Bad picture trying to explain it
A truly optimal answer (or even a provably correct one) will depend on the language you are using and how it lays out memory for such arrays.
However, taking your question simply at face value, you have to know what the actual length of each row is in order to calculate a 1D index.
So either the row length follows some pattern that can be inferred from the data, or you have (or can write) a rlen = rowLength( 2dTable, RowNumber) function.
Then, depending on how big the tables are and how fast you need to run, you can calculate a 1D index from the 2d table by adding all the previous row lengths until the current row length is less than the 2d column index.
or build a 1d table of the row lengths (or commulative rowlengths) so you can scan it and so only call your rowlength function for each row only once.
With a better description of your problem, you might get a better answer...
For your example which alternates between 3 and 2 columns you can construct a formula:
index = (row / 2) * (3 + 2) + (row % 2 ? 3 : 0) + column
(C-like syntax, assuming integer division)
In general though, the one and only way to implement what you're doing here, jagged arrays, is to make an array of arrays, a.k.a. an Iliffe vector. That means, use the row number as index into an array of pointers which point to the individual row arrays containing the actual data.
You can have an additional 1D array having the length of the columns say "length". Then your formula is index=sum {length(i)}+column. i runs from 0 to row.

How can I extract a 1 dimentional row from a multidimentional matrix

I currently have a 3 dimensional matrix and I want to extract a single row (into the third dimension) from it by index (say matrix(2,1,:)). I initially anticipated that the result of this would be a 1 dimensional matrix however what I got was a 1 by 1 by n matrix. Usually this wouldn't be a problem but some of the functions I'm using don't like 3D matrices. For example see the problem replicated below:
threeDeeMatrix=rand(3,3,3);
oneDeeAttempt=threeDeeMatrix(1,1,:);
norm(oneDeeAttempt)
Which returns the error message:
Error using norm
Input must be 2-D.
This is because oneDeeAttempt is
oneDeeAttempt(:,:,1) =
0.8400
oneDeeAttempt(:,:,2) =
0.0700
oneDeeAttempt(:,:,3) =
0.7663
rather than [0.8400 0.0700 0.7663]
How can I strip these extra dimensions? The only solution I can come up with is to use a loop to manually copy the values but that seems a little excessive.
Using permute to rearrange the matrix
The solution (which I found in the final stages of asking this) is to use permute which rearranges the order of the dimensions (similar to a=a' for 2D matrices). Once the unit dimensions are last they are stripped from the matrix and it becomes 1 dimensional.
oneDee=permute(oneDeeAttempt,[3 1 2]) %rearrange so the previous third dimension is now the first
%the matrix is now 3 by 1 by 1 which becomes 3
Using squeeze to remove leading singleton dimensions
As pointed out by Luis Mendo squeeze will very simply remove these leading singleton dimensions without having to worry about which dimensions are non singleton
oneDee=squeeze(oneDeeAttempt);

Resources