Fast way to store different length Matrix rows into Cell - arrays

I have three matrices:
Values = [200x7] doubles
numOfStrings = [82 78 75 73 72 71 70] %(for example)
numOfColumns = [1 4]
The numOfColumns may contain any set of distinct values from 1 to 7. For example [1 2 3] or [4 7] or [1 5 6 7]. Obviously, biggest numOfColumns that can be is [1 2 3 4 5 6 7].
numOfColumns show the columns I want to get. numOfStrings shows the rows of that columns I need. I.e. in my example I want to get columns 1 and 4. So from the 1 column I want to get the 82 first rows and from the 4th get the 73 first rows.
i.e. if numOfColumns = [1 4] then
myCell{1} = Values( 1:82,1); % Values(numOfStrings(numOfColumn(1)), numOfColumn(1))
myCell{2} = Values( 1:73,4); % Values(numOfStrings(numOfColumn(2)), numOfColumn(2))
P.S. That's not necessary to save it into cell array. If you can offer any another solution Ill be grateful to you.
I'm looking for the fastest way to do this, which is likely to be by avoiding for loops and using vectorization.
I think a lot about sub2ind function. But I can't figure out how to return arrays of the different size! Because myCell{1} - [82x1] and myCell{2} - [73x1]. I suppose I can't use bsxfun or arrayfun.
RESULTS:
Using for loops alike #Will 's answer:
for jj = 1:numel(numOfColumns)
myCell{rowNumber(numOfColumns(jj)),numOfColumns(jj)} = Values( 1:numOfStrings(numOfColumns(jj)),numOfColumns(jj));
end
Elapsed time is 157 seconds
Using arrayfun alike #Yishai E 's answer:
myCell(sub2ind(size(myCell),rowNumber(numOfColumns),numOfColumns)) = arrayfun( #(nOC) Values( 1:numOfStrings(nOC),nOC), numOfColumns, 'UniformOutput', false);
Elapsed time is 179 seconds
Using bsxfun alike #rahnema1 's answer:
idx = bsxfun(#ge,numOfStrings , (1:200).');
extracted_values = Values (idx);
tempCell = mat2cell(extracted_values,numOfStrings);
myCell(sub2ind(size(myCell),rowNumber(numOfColumns),numOfColumns)) = myCell(numOfColumns)';
Elapsed time is 204 seconds
SO, I got a lot of working answers, and some of them are vectorized as I asked, but for loops still fastest!

This should solve your problem, using arrayfun, which "vectorizes" the application of the indexing function. Not really, but it doesn't call the interpreter for each entry from numOfColumns. Interestingly enough, this is slower than the non-vectorized code in the other answer! (for 1e5 entries, 0.95 seconds vs. 0.23 seconds...)
arrayfun(#(nOC)Values(1:numOfStrings(nOC), nOC), numOfColumns, 'UniformOutput', false)

% Get number of elements in NUMOFCOLUMNS
n = numel(numOfColumns);
% Set up output
myCell = cell(1,n);
% Loop through all NUMOFCOLUMNS values, storing to cell
for i = 1:n
myCell{i} = Values(1:numOfStrings(numOfColumns(i)), numOfColumns(i));
end
Which for your example gives output
myCell =
[82x1 double] [73x1 double]

You can create logical indices for extraction of the desired elements :
idx = bsxfun(#ge,numOfStrings , (1:200).');
that in MATLAB R2016b or Octave (thanks to broadcasting/expansion) can be written as:
idx = numOfStrings >= (1:200).';
extract values:
extracted_values = Values (idx);
then using mat2cell convert data to cell :
myCell = mat2cell(extracted_values,numOfStrings);
all in one line :
myCell = mat2cell(Values (numOfStrings >= (1:200).'), numOfStrings);
If you want to use different numOfColumns with different sizes to extract elements of the cell you can each time do this:
result = myCell(numOfColumns);
If both numOfStrings and numOfColumns change and you need to compute the result once do this:
%convert numOfColumns to logical index:
numcols_logical = false(1,7);
numcols_logical(numOfColumns) = true;
extracted_values = Values ((numOfStrings .* numcols_logical) >= (1:200).');
if you need cell array
result= mat2cell(extracted_values,numOfStrings(numcols_logical));

Related

reversing shuffling of array by indexing

I have a matrix whose columns which was shuffled according to some index. I know want to find the index that 'unshuffles' the array back into its original state.
For example:
myArray = [10 20 30 40 50 60]';
myShuffledArray = nan(6,3)
myShufflingIndex = nan(6,3)
for x = 1:3
myShufflingIndex(:,x) = randperm(length(myArray))';
myShuffledArray(:,x) = myArray(myShufflingIndex(:,x));
end
Now I want to find a matrix myUnshufflingIndex, which reverses the shuffling to get an array myUnshuffledArray = [10 20 30 40 50 60; 10 20 30 40 50 60; 10 20 30 40 50 60]'
I expect to use myUnshufflingIndex in the following way:
for x = 1:3
myUnShuffledArray(:,x) = myShuffledArray(myUnshufflingIndex(:,x), x);
end
For example, if one column in myShufflingIndex = [2 4 6 3 5 1]', then the corresponding column in myUnshufflingIndex is [6 1 4 2 5 3]'
Any ideas on how to get myUnshufflingIndex in a neat vectorised way? Also, is there a better way to unshuffle the array columnwise than in a loop?
You can get myUnshufflingIndex with a single call to sort:
[~, myUnshufflingIndex] = sort(myShufflingIndex, 1);
Alternatively, you don't even need to compute myUnshufflingIndex, since you can just use myShufflingIndex on the left hand side of the assignment to unshuffle the data:
for x = 1:3
myUnShuffledArray(myShufflingIndex(:, x), x) = myShuffledArray(:, x);
end
And if you'd like to avoid a for loop while unshuffling, you can vectorize it by adding an offset to each column of your index, turning it into a matrix of linear indices instead of just row indices:
[nRows, nCols] = size(myShufflingIndex);
myUnshufflingIndex = myShufflingIndex+repmat(0:nRows:(nRows*(nCols-1)), nRows, 1);
myUnShuffledArray = nan(nRows, nCols); % Preallocate
myUnShuffledArray(myUnshufflingIndex) = myShuffledArray;

Matlab: Creating a blockwise permutation

I have a vector from 1 to 40 and want to shuffle it in such a way that each block of four integers (ten blocks in total) are shuffled only with themselves.
For example: 3 4 2 1 | 7 6 5 8 | 9 11 10 12 | ...
My original idea was to append ten permutation vectors to eachother and then add a 1 to 40 vector to the big permutation vector, but it didn't work at all as expected and was logically wrong.
Has anyone an idea how to solve this?
data = 10:10:120; % input: values to be permuted
group_size = 4; % input: group size
D = reshape(data, group_size, []); % step 1
[~, ind] = sort(rand(size(D)), 1); % step 2
result = D(bsxfun(#plus, ind, (0:size(D,2)-1)*group_size)); % step 3
result = result(:).'; % step 4
Example result:
result =
20 10 30 40 60 50 70 80 110 100 120 90
How it works
Reshape the data vector into a matrix D, such that each group is a column. This is done with reshape.
Generate a matrix, ind, where each column contains the indices of a permutation of the corresponding column of D. This is done generating independent, uniform random values (rand), sorting each column, and getting the indices of the sorting (second output of sort).
Apply ind as column indices into D. This requires converting to linear indices, which can be done with bsxfun (or with sub2ind, but that's usually slower).
Reshape back into a vector.
You can use A = A(randperm(length(A))) to shuffle an array.
Example in Octave:
for i = 1:4:40
v(i:i+3) = v(i:i+3)(randperm(4));
end

MATLAB function to replace randi to generate a matrix

I have a matlab problem to solve. In have two vectores that limit my space, x_low and x_high. The matrix pos needs to have values within this spaces and each column of the matrix has different bounds given by the two vectores. Now my problem is that randi gives valus between two integers but i need to change the bounds for each columns. There is another way to use randi or a different matlab function to do this?
I know there are better codes to do this but i'm starting to use matlab and i know to do it this way, any aid is welcome
x_low = [Io_low, Iirr_low, Rp_low, Rs_low, n_low]; % vector of constant values
x_high = [Io_high, Iirr_high, Rp_high, Rs_high, n_high]; % vector of constant values
pos = rand(particles, var);
var = length(x_high);
for i = 1: particles % rows
for k = 1: var %columns
if pos(i, k) < x_low(k) || pos(i, k) > x_high(k) % if the position is out of bounder
pos(i, k) = randi(x_low(k), x_high(k), 1); % fill it with a particle whithin the bounderies
end
end
end
If I understand correctly, you need to generate a matrix with integer values such that each column has different lower and upper limits; and those lower and upper limits are inclusive.
This can be done very simply with
rand (to generate random numbers between 0 and 1 ),
bsxfun (to take care of the lower and upper limits on a column basis), and
round (so that the results are integer values).
Let the input data be defined as
x_low = [1 6 11]; %// lower limits
x_high = [3 10 100]; %// upper limits
n_rows = 7; %// number of columns
Then:
r = rand(n_rows, numel(x_low)); %// random numbers between 0 and 1
r = floor(bsxfun(#times, r, x_high-x_low+1)); %// adjust span and round to integers
r = bsxfun(#plus, r, x_low); %// adjust lower limit
gives something like
r =
2 7 83
3 6 93
2 6 22
3 10 85
3 7 96
1 10 90
2 8 57
If you need to fill in values only at specific entries of matrix pos, you can use something like
ind = bsxfun(#lt, pos, x_low) | bsxfun(#gt, pos, x_high); %// index of values to replace
pos(ind) = r(ind);
This a little overkill, because the whole matrixd r is generated only to use some of its entries. To generate only the needed values the best way is probably to use loops.
You can use cellfun for this. Something like:
x_low = [Io_low, Iirr_low, Rp_low, Rs_low, n_low];
x_high = [Io_high, Iirr_high, Rp_high, Rs_high, n_high];
pos = cell2mat(cellfun(#randi, mat2cell([x_low' x_high'], ones(numel(x_low),1), 1), repmat({[particles 1]}, [numel(x_low) 1)])))';
Best,

indices of occurence of each row in MATLAB

I have two matrices, A and B. (B is continuous like 1:n)
I need to find all the occurrences of each individual row of B in A, and store those row indices accordingly in cell array C. See below for an example.
A = [3,4,5;1,3,5;1,4,3;4,2,1]
B = [1;2;3;4;5]
Thus,
C = {[2,3,4];[4];[1,2,3];[1,3,4];[1,2]}
Note C does not need to be in a cell array for my application. I only suggest it because the row vectors of C are of unequal length. If you can suggest a work-around, this is fine too.
I've tried using a loop running ismember for each row of B, but this is too slow when the matrices A and B are huge, with around a million entries. Vectorized code is appreciated.
(To give you context, the purpose of this is to identify, in a mesh, those faces that are attached to a single vertex. Note I cannot use the function edgeattachments because my data are not of the form "TR" in triangulation representation. All I have is a list of faces and list of vertices.)
Well, the best answer for this would require knowledge of how A is filled. If A is sparse, that is, if it has few columns values and B is quite large, then I think the best way for memory saving may be using a sparse matrix instead of a cell.
% No fancy stuff, just fast and furious
bMax = numel(B);
nRows = size(A,1);
cLogical = sparse(nRows,bMax);
for curRow = 1:nRows
curIdx = A(curRow,:);
cLogical(curRow,curIdx) = 1;
end
Answer:
cLogical =
(2,1) 1
(3,1) 1
(4,1) 1
(4,2) 1
(1,3) 1
(2,3) 1
(3,3) 1
(1,4) 1
(3,4) 1
(4,4) 1
(1,5) 1
(2,5) 1
How to read the answer. For each column the rows show the indexes that the column index appears in A. That is 1 appears in rows [2 3 4], 2 appear in row [4], 3 rows [1 2 3], 4 row [1 3 4], 5 in row [1 2].
Then you can use cLogical instead of a cell as an indexing matrix in the future for your needs.
Another way would be to allocate C with the expected value for how many times an index should appear in C.
% Fancier solution using some assumed knowledge of A
bMax = numel(B);
nRows = size(A,1);
nColumns = size(A,2);
% Pre-allocating with the expected value, an attempt to reduce re-allocations.
% tic; for rep=1:10000; C = mat2cell(zeros(bMax,nColumns),ones(1,bMax),nColumns); end; toc
% Elapsed time is 1.364558 seconds.
% tic; for rep=1:10000; C = repmat({zeros(1,nColumns)},bMax,1); end; toc
% Elapsed time is 0.606266 seconds.
% So we keep the not fancy repmat solution
C = repmat({zeros(1,nColumns)},bMax,1);
for curRow = 1:nRows
curIdxMsk = A(curRow,:);
for curCol = 1:nColumns
curIdx = curIdxMsk(curCol);
fillIdx = ~C{curIdx};
if any(fillIdx)
fillIdx = find(fillIdx,1);
else
fillIdx = numel(fillIdx)+1;
end
C{curIdx}(fillIdx) = curRow;
end
end
% Squeeze empty indexes:
for curRow = 1:bMax
C{curRow}(~C{curRow}) = [];
end
Answer:
>> C{:}
ans =
2 3 4
ans =
4
ans =
1 2 3
ans =
1 3 4
ans =
1 2
Which solution will performs best? You do a performance test in your code because it depends on how big is A, bMax, the memory size of your computer and so on. Yet, I'm still curious with solutions other people can do for this x). I liked chappjc's solution although it has the cons that he has pointed out.
For the given example (10k times):
Solution 1: Elapsed time is 0.516647 seconds.
Solution 2: Elapsed time is 4.201409 seconds (seems that solution 2 is a bad idea hahaha, but since it was created to the specific issue of A having many rows it has to be tested in those conditions).
chappjc' solution: Elapsed time is 2.405341 seconds.
We can do it without making any assumptions about B. Try this use of bsxfun and mat2cell:
M = squeeze(any(bsxfun(#eq,A,permute(B,[3 2 1])),2)); % 4x3x1 #eq 1x1x5 => 4x3x5
R = sum(M); % 4x5 -> 1x5
[ii,jj] = find(M);
C = mat2cell(ii,R)
The cells in C above will be column vectors rather than rows as in your example. To make the cells contain row vectors, use C = mat2cell(ii',1,R)' instead.
My only concern is that mat2cell could be slow for millions of values of R, but if you want your output in a cell, I'm not sure how much better you can do. EDIT: If you can deal with a sparse matrix like in Werner's first solution with the loop, replace the last line of the above with the following:
>> Cs = sparse(ii,jj,1)
Cs =
(2,1) 1
(3,1) 1
(4,1) 1
(4,2) 1
(1,3) 1
(2,3) 1
(3,3) 1
(1,4) 1
(3,4) 1
(4,4) 1
(1,5) 1
(2,5) 1
Unfortunately, bsxfun will probably run out of memory if both size(A,1) and numel(B) are large! You may have to loop over the elements of A or B if memory becomes an issue. Here's one way to do it by looping over your vertexes in B:
for i=1:numel(B), C{i} = find(any(A==B(i),2)); end
Yup, that easy. Cell array growing is extremely fast in MATLAB as it similar to a sequence container that stores contiguous references to the data, rather than keeping the data itself contiguous. Perhaps ismember was the bottleneck in your test.

Selecting elements from an array in MATLAB

I know that in MATLAB, in the 1D case, you can select elements with indexing such as a([1 5 3]), to return the 1st, 5th, and 3rd elements of a. I have a 2D array, and would like to select out individual elements according to a set of tuples I have. So I may want to get a(1,3), a(1,4), a(2,5) and so on. Currently the best I have is diag(a(tuples(:,1), tuples(:,2)), but this requires a prohibitive amount of memory for larger a and/or tuples. Do I have to convert these tuples into linear indices, or is there a cleaner way of accomplishing what I want without taking so much memory?
Converting to linear indices seems like a legitimate way to go:
indices = tuples(:, 1) + size(a,1)*(tuples(:,2)-1);
selection = a(indices);
Note that this is also implement in the Matlab built-in solution sub2ind, as in nate'2 answer:
a(sub2ind(size(a), tuples(:,1),tuples(:,2)))
however,
a = rand(50);
tuples = [1,1; 1,4; 2,5];
start = tic;
for ii = 1:1e4
indices = tuples(:,1) + size(a,1)*(tuples(:,2)-1); end
time1 = toc(start);
start = tic;
for ii = 1:1e4
sub2ind(size(a),tuples(:,1),tuples(:,2)); end
time2 = toc(start);
round(time2/time1)
which gives
ans =
38
so although sub2ind is easier on the eyes, it's also ~40 times slower. If you have to do this operation often, choose the method above. Otherwise, use sub2ind to improve readability.
if x and y are vectors of the x y values of matrix a, then sub2und should solve your problem:
a(sub2ind(size(a),x,y))
For example
a=magic(3)
a =
8 1 6
3 5 7
4 9 2
x = [3 1];
y = [1 2];
a(sub2ind(size(a),x,y))
ans =
4 1
you can reference the 2D matlab position with a 1D number as in:
a = [3 4 5;
6 7 8;
9 10 11;];
a(1) = 3;
a(2) = 6;
a(6) = 10;
So if you can get the positions in a matrix like this:
a([(col1-1)*(rowMax)+row1, (col2-1)*(rowMax)+row2, (col3-1)*(rowMax)+row3])
note: rowmax is 3 in this case
will give you a list of the elements at col1/row1 col2/row2 and col3/row3.
so if
row1 = col1 = 1
row2 = col2 = 2
row3 = col3 = 3
you will get:
[3, 7, 11]
back.

Resources