Get unique values in matrix with MATLAB

I'm looking for the fastest way to get the unique rows of a matrix in MATLAB. I have a matrix like this:
1 2
1 2
1 3
1 5
1 23
2 1
3 1
3 2
3 2
3 2
4 17
4 3
4 17
and need to get something like this:
1 2
1 3
1 5
1 23
2 1
3 1
3 2
4 3
4 17
That is, I need the unique rows, where each row is treated as a combination of its column values.

Have a look at MATLAB's unique() function with the 'rows' option.
C = unique(A,'rows')
https://de.mathworks.com/help/matlab/ref/unique.html
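As a minimal sketch of that suggestion, the snippet below applies unique with 'rows' to the matrix from the question. The 'stable' variant is an addition not mentioned in the answer; it assumes a MATLAB release (or Octave version) that supports that option and keeps rows in order of first appearance instead of sorting them.

```matlab
% Matrix from the question: each row is one (col1, col2) combination.
A = [1 2; 1 2; 1 3; 1 5; 1 23; 2 1; 3 1; 3 2; 3 2; 3 2; 4 17; 4 3; 4 17];

% Unique rows, returned in sorted order.
C = unique(A, 'rows');

% Assumption: 'stable' is available; keeps first-occurrence order instead.
C_stable = unique(A, 'rows', 'stable');
```

With the sorted form, 4 3 comes before 4 17, which matches the output requested in the question.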

Related

How to remove reverse rows in a permutation matrix?

I'm looking for a quick way in MATLAB to do the following:
Given the matrix of all permutations of a vector, say [1, 2, 3], I would like to remove every row that is the reverse of another row.
So the matrix P = perms([1, 2, 3])
3 2 1
3 1 2
2 3 1
2 1 3
1 3 2
1 2 3
becomes
3 2 1
3 1 2
2 3 1
Notice that, by symmetry, exactly one row of each reversed pair has a first element larger than its last, so you can keep just those rows:
n = 4; % row size
x = perms(1:n) % all perms
p = x(x(:,1)>x(:,n),:) % non-symmetrical perms
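Alternatively, notice that the number of rows in p follows an OEIS sequence for each n and corresponds to size(x,1)/2. Since perms outputs the permutations in reverse lexicographic order, for n = 3 you can simply take the first half. This shortcut does not generalize: for n = 4 the first half contains both 4 2 1 3 and its reverse 3 1 2 4, so use the comparison above for larger n:
n = 3; % row size
x = perms(1:n) % all perms
p = x(1:size(x,1)/2,:) % first half of the rows; only valid for n = 2 or 3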
You can use MATLAB's fliplr function to flip your array left to right, and then use ismember to find the rows of P in the flipped version. Finally, iterate over all locations and deselect the rows that have already been matched.
Here's some code (tested with Octave 5.2.0 and MATLAB Online):
a = [1, 2, 3];
P = perms(a)
% Where can each row of P be found in the left-right flipped version of P?
[~, Locb] = ismember(P, fliplr(P), 'rows');
% Set up logical vector to store indices to take from P.
n = length(Locb);
idx = true(n, 1);
% Iterate all locations and set already found row to false.
for I = 1:n
    if (idx(I))
        idx(Locb(I)) = false;
    end
end
% Generate result matrix.
P_star = P(idx, :)
Your example:
P =
3 2 1
3 1 2
2 3 1
2 1 3
1 3 2
1 2 3
P_star =
3 2 1
3 1 2
2 3 1
Added 4 to the example:
P =
4 3 2 1
4 3 1 2
4 2 3 1
4 2 1 3
4 1 3 2
4 1 2 3
3 4 2 1
3 4 1 2
3 2 4 1
3 2 1 4
3 1 4 2
3 1 2 4
2 4 3 1
2 4 1 3
2 3 4 1
2 3 1 4
2 1 4 3
2 1 3 4
1 4 3 2
1 4 2 3
1 3 4 2
1 3 2 4
1 2 4 3
1 2 3 4
P_star =
4 3 2 1
4 3 1 2
4 2 3 1
4 2 1 3
4 1 3 2
4 1 2 3
3 4 2 1
3 4 1 2
3 2 4 1
3 1 4 2
2 4 3 1
2 3 4 1
As requested in your question (at least as I understand it), rows are kept from top to bottom.
Here's another approach:
result = P(all(~triu(~pdist2(P,P(:,end:-1:1)))),:);
pdist2 computes the distance between rows of P and rows of P(:,end:-1:1), i.e. P flipped left to right.
~ negates the result, so that true corresponds to coincident pairs.
triu keeps only the upper triangular part of the matrix, so that only one of the two rows of the coincident pair will be removed.
~ negates back, so that true corresponds to non-coincident pairs.
all gives a row vector with true for rows that should be kept (because their reverse does not coincide with any previous row).
This is used as a logical index to select rows of P.
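Note that pdist2 is part of the Statistics and Machine Learning Toolbox. If that toolbox is not available, the same "drop a row when its reverse appears earlier" rule can be sketched with ismember alone. The variable names found, loc, rowIdx and keep are illustrative and do not come from the answers above.

```matlab
% Toolbox-free sketch: drop every row whose reverse appears earlier in P.
P = perms(1:3);                                  % all permutations, one per row

% loc(j) is the row index in P of the reversed row j (0 if it is absent).
[found, loc] = ismember(fliplr(P), P, 'rows');

rowIdx = (1:size(P, 1)).';                       % column vector 1..number of rows
keep = ~found | loc(:) >= rowIdx;                % reverse absent, or not an earlier row
result = P(keep, :);
```

For a full permutation matrix exactly half of the rows are kept, and no kept row is the reverse of another kept row.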

Unique Columns Across an Array?

I have an array structured like so:
a = [1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 4 4 4 4 5 5 5 5;
1 1 1 1 2 2 2 2 2 2 2 1 1 1 1 2 2 3 3 1 1 1 2 3 4 4 4 1 1 1 1 2 2 3 3];
It's a 2-by-n array with no real pattern (I reduced the number of columns here for simplicity's sake). I want to find the number of unique columns. In this simplified example I can count by hand (though it takes a while) and see that my matrix of unique columns, b, is:
b= 1 1 2 2 2 3 3 3 3 4 5 5
1 2 1 2 3 1 2 3 4 1 2 3
In MATLAB, I can do something like
size(b,2)
To get the number of unique columns. In this example
size(b,2) = 12
My question is: how do I go from matrix a to matrix b so that I can do this computationally for the very large matrices that I have?
Use unique:
a = [1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 4 4 4 4 5 5 5 5;
1 1 1 1 2 2 2 2 2 2 2 1 1 1 1 2 2 3 3 1 1 1 2 3 4 4 4 1 1 1 1 2 2 3 3];
% Transpose to leverage the rows flag, then transpose back
b = unique(a.', 'rows').';
Which returns:
b =
1 1 2 2 2 3 3 3 3 4 5 5
1 2 1 2 3 1 2 3 4 1 2 3
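If only the count is needed, or the number of times each unique column occurs, the third output of unique can be combined with accumarray. This is a small sketch on hypothetical data (a shorter matrix than the one in the question); bt, ic, numUnique and counts are illustrative names.

```matlab
% Hypothetical 2-by-n input with repeated columns.
a = [1 1 2 2 2;
     1 1 1 2 2];

% Work on rows (transposed columns); ic maps each input column to its unique column.
[bt, ~, ic] = unique(a.', 'rows');
b = bt.';                          % unique columns: [1 2 2; 1 1 2]

numUnique = size(b, 2);            % number of unique columns: 3
counts = accumarray(ic(:), 1).';   % occurrences of each unique column: [2 1 2]
```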

Max within groups in Matlab

I have the following matrix:
[ 2 5 7 8 1 3 4 6 5 7 3 1;
1 1 1 1 2 2 2 2 3 3 3 3;]
The first row holds the values and the second row holds each value's characteristic (its group).
I want the maximum value within each group, i.e. among the entries whose second-row value is the same. So what I would like to get is:
[ 8 6 7], since 8 is the highest value where the second row is 1, 6 where the second row is 2, and 7 where the second row is 3. I can do it with a loop, but I would like a vectorized solution, ideally in one line.
accumarray does exactly what you want
x=[ 2 5 7 8 1 3 4 6 5 7 3 1; 1 1 1 1 2 2 2 2 3 3 3 3;]
accumarray(x(2,:)',x(1,:)',[],@max)
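accumarray needs positive integer subscripts, so the call above relies on the group labels being 1, 2, 3. As a hedged sketch for arbitrary labels, the third output of unique can be used to map the labels to 1..K first. The data and the names vals, labels, groups, g and groupMax are made up for illustration.

```matlab
% Hypothetical data: group labels are arbitrary numbers, not 1..K.
vals   = [2 5 1 3 4];
labels = [10 10 20 20 35];

% groups lists the distinct labels; g maps every entry to its group index 1..K.
[groups, ~, g] = unique(labels);

% Maximum of vals within each group, returned as a row vector.
groupMax = accumarray(g(:), vals(:), [], @max).';
```

Here groups is [10 20 35] and groupMax is [5 3 4], one maximum per distinct label.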

All row-combinations of a matrix in a new matrix with matlab

I have got a question regarding all the combinations of matrix-rows in Matlab.
I currently have a matrix with the following structure:
1 2
1 3
1 4
2 3
2 4
3 4
Now I want to get all the possible combinations of these "pairs" without using a number twice in the same row:
1 2 3 4
1 3 2 4
1 4 2 3
It must also work with n such "double columns". For example, when my pair matrix runs up to "5 6", I want to create the matrix with 3 of these double columns:
1 2 3 4 5 6
1 2 3 5 4 6
1 2 3 6 4 5
1 3 2 4 5 6
1 3 2 5 4 6
....
I hope you understand what I mean :)
Any ideas how to solve this?
Thanks and best regards
Jonas
M = [1 2
     1 3
     1 4
     2 3
     2 4
     3 4]; % example data
n = floor(max(M(:))/2); % size of tuples. Compute this way, or set manually
p = nchoosek(1:size(M,1), n).'; % generate all n-tuples of row indices
R = reshape(M(p,:).', n*size(M,2), []).'; % generate result...
R = R(all(diff(sort(R.'))),:); % ...removing combinations with repeated values
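The same steps can be sketched starting from the list of numbers rather than from a hand-typed pair matrix. This variant assumes the pair matrix is exactly nchoosek(v, 2) for an even-length vector v (an assumption for illustration, as are the names v and noRepeat) and does the duplicate check row-wise so that the dimensions are explicit.

```matlab
v = 1:4;                                   % numbers to pair up (even count assumed)
M = nchoosek(v, 2);                        % all pairs: [1 2; 1 3; 1 4; 2 3; 2 4; 3 4]
n = numel(v) / 2;                          % pairs needed per output row

p = nchoosek(1:size(M, 1), n).';           % every choice of n pair rows, one per column
R = reshape(M(p, :).', n * size(M, 2), []).';   % concatenate the chosen pairs per row

% Keep only rows in which no number is used twice.
noRepeat = all(diff(sort(R, 2), 1, 2) ~= 0, 2);
R = R(noRepeat, :);
```

For v = 1:4 this leaves the three rows 1 2 3 4, 1 3 2 4 and 1 4 2 3. The number of candidate rows grows quickly with numel(v), since every n-subset of pairs is generated before filtering.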

Reshape acast() remove missing values

I have this dataframe:
df <- data.frame(subject = c(rep("one", 20), c(rep("two", 20))),
score1 = sample(1:3, 40, replace=T),
score2 = sample(1:6, 40, replace=T),
score3 = sample(1:3, 40, replace=T),
score4 = sample(1:4, 40, replace=T))
subject score1 score2 score3 score4
1 one 2 4 2 2
2 one 3 3 1 2
3 one 1 2 1 3
4 one 3 4 1 2
5 one 1 2 2 3
6 one 1 5 2 4
7 one 2 5 3 2
8 one 1 5 1 3
9 one 3 5 2 2
10 one 2 3 3 4
11 one 3 2 1 3
12 one 2 5 2 1
13 one 2 4 1 4
14 one 2 2 1 3
15 one 1 3 1 4
16 one 1 6 1 3
17 one 3 4 2 2
18 one 3 2 1 3
19 one 2 5 3 1
20 one 3 6 2 1
21 two 1 6 3 4
22 two 1 2 1 2
23 two 3 2 1 2
24 two 1 2 2 1
25 two 2 3 1 3
26 two 1 5 3 3
27 two 2 4 1 4
28 two 2 6 2 4
29 two 1 6 2 2
30 two 1 5 1 4
31 two 2 1 2 4
32 two 3 6 1 1
33 two 1 1 3 1
34 two 2 4 2 3
35 two 2 1 3 2
36 two 2 3 1 3
37 two 1 2 3 4
38 two 3 5 2 2
39 two 2 1 3 4
40 two 2 1 1 3
Note that the scores have different ranges of values: score 1 ranges from 1-3, score 2 from 1-6, score 3 from 1-3, and score 4 from 1-4.
I'm trying to reshape data like this:
library(reshape2)
dfMelt <- melt(df, id.vars="subject")
acast(dfMelt, subject ~ value ~ variable)
Aggregation function missing: defaulting to length
, , score1
1 2 3 4 5 6
one 6 7 7 0 0 0
two 8 9 3 0 0 0
, , score2
1 2 3 4 5 6
one 0 5 3 4 6 2
two 5 4 2 2 3 4
, , score3
1 2 3 4 5 6
one 10 7 3 0 0 0
two 8 6 6 0 0 0
, , score4
1 2 3 4 5 6
one 3 6 7 4 0 0
two 3 5 5 7 0 0
Note that the output array reports a count of "0" for score values that never occur. Is there any way to stop acast from including these missing scores in the output?
In this case, you might do better sticking to base R's table function. I'm not sure that you can have an irregular array like the one you are looking for.
For example:
> lapply(df[-1], function(x) table(df[[1]], x))
$score1
x
1 2 3
one 9 6 5
two 11 4 5
$score2
x
1 2 3 4 5 6
one 2 5 4 3 3 3
two 4 2 2 3 4 5
$score3
x
1 2 3
one 9 5 6
two 4 11 5
$score4
x
1 2 3 4
one 4 4 8 4
two 2 6 5 7
Or, using your "long" data:
with(dfMelt, by(dfMelt, variable,
FUN = function(x) table(x[["subject"]], x[["value"]])))
Since each "score" subset is going to have a different shape, you will not be able to preserve the array structure. One option is to use a list of two-dimensional arrays or data.frames, e.g.:
# your original acast call
res <- acast(dfMelt, subject ~ value ~ variable)
# remove any columns that are all zero
apply(res, 3, function(x) x[, apply(x, 2, sum)!=0] )
Which gives:
$score1
1 2 3
one 7 8 5
two 6 8 6
$score2
1 2 3 4 5 6
one 4 2 6 4 1 3
two 2 5 3 4 3 3
$score3
1 2 3
one 5 10 5
two 5 11 4
$score4
1 2 3 4
one 5 4 4 7
two 4 6 6 4
