How to count characters or unique strings in MATLAB [duplicate]

This question already has answers here:
how to count unique elements of a cell in matlab?
(2 answers)
Closed 8 years ago.
How can I count the number of times each letter occurs in an array?
For example, I have an array:
a
a
a
c
b
c
c
d
a
How can I find out how many times a, b, c, and d occur? I want output like this:
Alphabet count
a 4
c 3
b 1
d 1
So how can I do that? Thanks.

arr = {'a' 'a' 'a' 'c' 'b' 'c' 'c' 'd' 'a'}
%// map letters to numbers (a = 1, ..., z = 26) and count them, one bin per letter
count = hist(cellfun(@(x) x - 96, arr), 1:26)
%// filter result and convert to cell
countCell = num2cell(count(find(count)).') %'
%// get sorted list of unique letters
letters = unique(arr).' %'
%// output
output = [letters countCell]
The solution in the duplicate answer is very neat, applied to your desired output:
[letters,~,subs] = unique(arr)
countCell = num2cell(accumarray(subs(:),1,[],@sum))
output = [letters.' countCell]
It appears to me that your input array actually looks like:
arr = ['a'; 'a'; 'a'; 'c'; 'b'; 'c'; 'c'; 'd'; 'a']
so change the last line to:
output = [cellstr(letters) countCell]
output =
'a' [4]
'b' [1]
'c' [3]
'd' [1]
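If the rows should be ordered by count rather than alphabetically, as in the output the question asks for, one possible sketch is to sort the counts in descending order. It assumes arr is the cell array from the first answer; the variable name order is introduced here only for illustration.

```matlab
arr = {'a' 'a' 'a' 'c' 'b' 'c' 'c' 'd' 'a'};
% unique letters (sorted) and, for each element of arr, the index of its letter
[letters, ~, subs] = unique(arr);
% count the occurrences of each unique letter
counts = accumarray(subs(:), 1);
% reorder by count, largest first
[counts, order] = sort(counts, 'descend');
% two-column cell array: letter, count
output = [reshape(letters(order), [], 1) num2cell(counts)];
```

The reshape call only makes sure the letters form a column before they are placed next to the counts.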

Related

MATLAB cellfun() to map contains() to cell array

a={'hello','world','friends'};
I want to check whether each word in the cell array contains the letter 'o'. How can I use cellfun() to achieve the following in a compact expression?
b = [ contains(a(1),'o') contains(a(2),'o') contains(a(3),'o')]
You don't need cellfun. According to the documentation, contains works natively on cell arrays of character vectors:
a = {'hello', 'world', 'friends'};
b = contains(a, 'o');
Which returns:
b =
1×3 logical array
1 1 0
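For completeness, if you do want the cellfun form, or are on a release older than R2016b (where contains is not available), a sketch along these lines should give the same result. b_cellfun and b_legacy are names made up for this example.

```matlab
a = {'hello', 'world', 'friends'};
% cellfun applies the anonymous function to each word and collects
% the logical results into an array
b_cellfun = cellfun(@(s) contains(s, 'o'), a);
% alternative without contains: strfind returns [] for words with no match
b_legacy = ~cellfun(@isempty, strfind(a, 'o'));
```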

Matlab: Numerical array index into a string array (without loops)

I'm working through a set of problems from the introductory MATLAB course at MIT OCW. You can see it here; it's problem number 9, part g.iii.
I have a matrix with the final grades of a course, all ranging from 1 to 5. I also have an array containing only the letters from 'F' to 'A' (in 'decreasing' order).
I know how to change elements in a matrix, I suppose I could do something like this for each number:
totalGrades(find(totalGrades==1)) = 'F';
totalGrades(find(totalGrades==2)) = 'D';
totalGrades(find(totalGrades==3)) = 'C';
totalGrades(find(totalGrades==4)) = 'B';
totalGrades(find(totalGrades==5)) = 'A';
But then, what's the purpose of creating the string array "letters"?
I thought about using a loop, but we're supposed to solve the problem without one at that point of the course.
Is there a way? I'll be glad to know. Here's my code for the whole problem, but I got stuck in that last question.
load('classGrades.mat');
disp(namesAndGrades(1:5,1:8));
grades = namesAndGrades(1:15,2:size(namesAndGrades,2));
mean(grades);
meanGrades = nanmean(grades);
meanMatrix = ones(15,1)*meanGrades;
curvedGrades = 3.5*(grades./meanMatrix);
% Verifying
nanmean(curvedGrades)
mean(curvedGrades)
curvedGrades(curvedGrades>=5) = 5;
totalGrades = nanmean(curvedGrades,2);
letters = 'FDCBA';
Thanks a lot!
Try:
letters=['F','D','C','B','A'];
tg = [1 2 1 3 3 1];
letters(tg)
Result:
ans = FDFCCF
This works even when tg (total grade) is a matrix:
letters=['F','D','C','B','A'];
tg = [1 2 1 ; 3 3 1];
result = letters(tg);
result
result =
FDF
CCF
Edit (brief explanation):
It is easy to understand that when you do letters(2) you get the second element of letters (D).
But you can also select several elements from letters by giving it an array: letters([1 2]) will return the first and second elements (FD).
So, letters(indexesArray) will result in a new array with the same number of elements as indexesArray. However, indexesArray must contain only integers from 1 to the length of letters (otherwise an error will pop up).
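Applied to grades that are not whole numbers, a rough sketch could round them to valid indices first. The sample values in totalGrades are hypothetical, and rounding up with ceil is an assumption here, not something stated in the question.

```matlab
letters = 'FDCBA';
totalGrades = [3.2; 4.7; 1.0; 2.5];        % hypothetical grades between 1 and 5
idx = ceil(totalGrades);                   % assumption: round up to the next whole grade
idx = min(max(idx, 1), numel(letters));    % clamp so every index is valid for letters
letterGrades = letters(idx);               % one letter per grade
```

Clamping the indices avoids the indexing error mentioned above if a grade falls outside 1 to 5.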

Matlab: convert int array into string array?

In MATLAB I have an integer array a=[1 2 3]. I need to convert it into one string, with the values separated by ',':
c = '1,2,3'
If somehow I can have a string array b=['1' '2' '3'], then I can use
c = strjoin(b, ',')
to achieve the goal.
So my question is: how can I convert the integer array a=[1 2 3] into a string array b=['1' '2' '3']?
int2str() does not work. It returns
'1 2 3'
which is not a "string array", so strjoin cannot be applied to it to get '1,2,3'.
You can simply use sprintf():
a = 1:3;
c = sprintf('%d,',a);
c = c(1:end-1);
There's a function in the file exchange called vec2str that'll do this.
You'll need to set the encloseFlag parameter to 0 to remove the square brackets. Example:
a = [1 2 3];
b = vec2str(a,[],[],0);
Inside b you'll have:
b =
'1,2,3'
I found one solution myself:
After getting the string (not an array), split it:
b = int2str(a); % b = '1  2  3'
c = strsplit(b); % c = {'1', '2', '3'}
Then I can get the result c = strjoin(c, ',') as I wanted.
You can use:
c = regexprep(num2str(a), '\s+', ',');
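To answer the literal question of turning each number into its own string first: strjoin expects a cell array of character vectors, not a char array like ['1' '2' '3'] (which is just '123'). A minimal sketch, assuming a contains integers:

```matlab
a = [1 2 3];
% convert each element separately; 'UniformOutput', false collects
% the resulting character vectors in a cell array
b = arrayfun(@num2str, a, 'UniformOutput', false);   % {'1', '2', '3'}
% join the pieces with a comma
c = strjoin(b, ',');
```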

Matlab - How to compare values in a cell array?

I have a set of inputs and one output declared in a cell array like this:
A = {'a', 'f', 'c', 'b';
'b', 'f', 'c', 'a';
'a', 'f', 'b', 'c';
'c', 'f', 'b', 'a';
'c', 'f', 'a', 'b';
'b', 'f', 'a', 'c' }
where the first column is the output and the rest are the inputs used for each output.
I need to compare the values to reduce the calculation time.
The thing is, for equal outputs, I want to know whether the inputs are the same. An important remark: the order of the values doesn't matter, so f c b and f b c count as the same.
I need this because my actual data set is a 5040 x 7 cell array, and I need to feed it into an interpolation function.
I thought of something like this: if the value in the output column equals another value in the same column, check whether the inputs are all the same, using the ismember function.
But I can't arrive at code that works.
Any help, please?
First, since you don't care about the order of the inputs, I would sort each of the rows:
[T, N] = size(A);
for t = 1:T
    Asorted(t,1) = A(t,1);
    Asorted(t,2:N) = sort(A(t,2:N));
end
Now you want to find all of the duplicate rows. A simple way to do this is first to convert to a character array, and use the unique function --
B = cell2mat(Asorted);
[C, ii, jj] = unique(B,'rows');
Now C contains the unique rows of B, ii contains the indexes of the unique rows, and jj labels each of the rows of B depending on which unique value it has.
If you wanted to filter out all of the duplicate rows from A, you can now do
Afiltered = A(ii, :);
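This results in the following (the output reflects the older behavior of unique, which returned the last occurrence of each duplicate row; newer releases return the first occurrence by default, so different but equivalent rows are kept):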
Afiltered =
'a' 'f' 'b' 'c'
'b' 'f' 'a' 'c'
'c' 'f' 'a' 'b'
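The same idea can be sketched without the loop by working on the character matrix directly. This assumes, as cell2mat in the answer above already does, that every cell holds exactly one character, and it relies on the 'stable' option of unique, which needs a reasonably recent release. Bsorted and keep are illustrative names.

```matlab
A = {'a', 'f', 'c', 'b';
     'b', 'f', 'c', 'a';
     'a', 'f', 'b', 'c';
     'c', 'f', 'b', 'a';
     'c', 'f', 'a', 'b';
     'b', 'f', 'a', 'c'};
B = cell2mat(A);                                 % 6x4 char matrix
% keep the output column as is, sort the inputs within each row
Bsorted = [B(:,1), sort(B(:,2:end), 2)];
% first occurrence of each distinct row, in order of appearance
[~, keep] = unique(Bsorted, 'rows', 'stable');
Afiltered = A(keep, :);
```

With 'stable', the rows that survive are the first occurrences, in their original order.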

How can I efficiently find unique cell arrays within a set of cell arrays in MATLAB?

I need to find only unique cell arrays within a set of cell arrays. For example, if this is my input:
I = {{'a' 'b' 'c' 'd' 'e'} ...
{'a' 'b' 'c'} ...
{'d' 'e'} ...
{'a' 'b' 'c' 'd' 'e'} ...
{'a' 'b' 'c' 'd' 'e'} ...
{'a' 'c' 'e'}};
Then I would want my output to look like this:
I_unique = {{'a' 'b' 'c' 'd' 'e'} ...
{'a' 'b' 'c'} ...
{'d' 'e'} ...
{'a' 'c' 'e'}};
Do you have any idea how to do this? The order of elements in the output doesn't matter, but efficiency does since the cell array I could be very large.
If your cells contain only sorted single characters then you can retain just the unique sequences using:
>> I = {{'a' 'b' 'c' 'd' 'e'} {'a' 'b' 'c'} {'d' 'e'} {'a' 'b' 'c' 'd' 'e'} {'a' 'b' 'c' 'd' 'e'} {'a' 'c' 'e'}};
>> I_unique = cellfun(@char, I, 'uniformoutput', 0);
>> I_unique = cellfun(@transpose, I_unique, 'uniformoutput', 0);
>> I_unique = unique(I_unique)
I_unique =
'abc' 'abcde' 'ace' 'de'
You can then split the resulting cells into single characters again:
>> I_unique = cellfun(@transpose, I_unique, 'uniformoutput', 0);
>> I_unique = cellfun(@cellstr, I_unique, 'uniformoutput', 0);
>> I_unique = cellfun(@transpose, I_unique, 'uniformoutput', 0);
>> I_unique{:}
ans =
'a' 'b' 'c'
ans =
'a' 'b' 'c' 'd' 'e'
ans =
'a' 'c' 'e'
ans =
'd' 'e'
EDIT: Updated to use a more efficient algorithm.
If efficiency is paramount due to a large number of sets in I, then your best option is probably to roll your own optimized loops. This problem bears some similarity to a previous question about how to efficiently remove sets that are subsets of or equal to another. The difference here is that you are not concerned with removing subsets, just duplicates, so the code in my answer to the other question can be modified to further reduce the number of comparisons made.
First we can recognize that there's no point in comparing sets that have different numbers of elements, since they can't possibly match in that case. So, the first step is to count the number of strings in each set, then loop over each group of sets that have the same number of strings.
For each of these groups, we will have two nested loops: an outer loop over each set starting at the end of the sets, and an inner loop over every set preceding that one. If/When the first match is found, we can mark that set as "not unique" and break the inner loop to avoid extra comparisons. Starting the outer loop at the end of the sets gives us the added bonus that sets in I_unique will maintain the original order of appearance in I.
And here is the resulting code:
I = {{'a' 'b' 'c' 'd' 'e'} ... %# The sample cell array of cell arrays of
{'a' 'b' 'c'} ... %# strings from the question
{'d' 'e'} ...
{'a' 'b' 'c' 'd' 'e'} ...
{'a' 'b' 'c' 'd' 'e'} ...
{'a' 'c' 'e'}};
nSets = numel(I); %# The number of sets
nStrings = cellfun('prodofsize',I); %# The number of strings per set
uniqueIndex = true(1,nSets); %# A logical index of unique elements
for currentSize = unique(nStrings) %# Loop over each unique number of strings
    subIndex = find(nStrings == currentSize); %# Get the subset of I with the
    subSet = I(subIndex); %# given number of strings
    for currentIndex = numel(subSet):-1:2 %# Outer loop
        for compareIndex = 1:currentIndex-1 %# Inner loop
            if isequal(subSet{currentIndex},subSet{compareIndex}) %# Check equality
                uniqueIndex(subIndex(currentIndex)) = false; %# Mark as "not unique"
                break %# Break the inner loop
            end
        end
    end
end
I_unique = I(uniqueIndex); %# Get the unique values
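If the strings never contain a chosen delimiter, another option is to collapse each set into a single key and let unique do the comparison. This is only a sketch: the comma delimiter is an assumption (it must not occur in any of the strings), and keys and firstIdx are illustrative names. Whether it beats the loops above on large inputs would need to be measured.

```matlab
I = {{'a' 'b' 'c' 'd' 'e'} ...
     {'a' 'b' 'c'} ...
     {'d' 'e'} ...
     {'a' 'b' 'c' 'd' 'e'} ...
     {'a' 'b' 'c' 'd' 'e'} ...
     {'a' 'c' 'e'}};
% build one key per set, e.g. 'a,b,c'
keys = cellfun(@(c) strjoin(c, ','), I, 'UniformOutput', false);
% 'stable' keeps the first occurrence of each key, in order of appearance
[~, firstIdx] = unique(keys, 'stable');
I_unique = I(firstIdx);
```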
