MATLAB: How to compare values in a cell array?

I have a set of inputs and one output declared in a cell array like this:
A = {'a', 'f', 'c', 'b';
'b', 'f', 'c', 'a';
'a', 'f', 'b', 'c';
'c', 'f', 'b', 'a';
'c', 'f', 'a', 'b';
'b', 'f', 'a', 'c' }
where the first column is the output and the remaining columns are the inputs used for that output.
I need to compare the values to reduce the calculation time.
For rows with equal outputs, I want to know whether the inputs are the same. An important remark: the order of the values doesn't matter, so f c b is the same as f b c.
I need this because my actual data set is a 5040 x 7 cell array, and I need to pass the rows to an interpolation function.
I thought of something like this: if the value in the output column is equal to another value in the same column, check whether the inputs are all the same using the ismember function.
But I can't arrive at code that works.
Any help, please?

First, since you don't care about the order of the inputs, I would sort each of the rows:
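[T, N] = size(A);
Asorted = cell(T, N);                   % preallocate the result
for t = 1:T
    Asorted(t,1) = A(t,1);              % keep the output column as-is
    Asorted(t,2:N) = sort(A(t,2:N));    % sort the inputs within the row
end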
Now you want to find all of the duplicate rows. A simple way to do this is to first convert to a character array (this works here because every cell holds a single character) and then use the unique function:
B = cell2mat(Asorted);
[C, ii, jj] = unique(B,'rows');
Now C contains the unique rows of B, ii contains the index of one representative row of B for each unique row, and jj labels each row of B with the unique row it matches.
If you wanted to filter out all of the duplicate rows from A, you can now do
Afiltered = A(ii, :);
This results in:
Afiltered =
'a' 'f' 'b' 'c'
'b' 'f' 'a' 'c'
'c' 'f' 'a' 'b'
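For readers who want to see the same idea outside MATLAB, here is a minimal Python sketch of the sort-then-deduplicate approach. The function name `filter_duplicate_rows` is hypothetical, and the sketch keeps the first occurrence of each duplicate group, so the representative rows it keeps may differ from the ones in the listing above.

```python
# Illustrative Python analogue of the MATLAB steps above (not the answer's own code).
A = [
    ['a', 'f', 'c', 'b'],
    ['b', 'f', 'c', 'a'],
    ['a', 'f', 'b', 'c'],
    ['c', 'f', 'b', 'a'],
    ['c', 'f', 'a', 'b'],
    ['b', 'f', 'a', 'c'],
]

def filter_duplicate_rows(rows):
    """Keep one row per (output, inputs) pair, ignoring the order of the inputs."""
    seen = set()
    kept = []
    for row in rows:
        # First column is the output; the sorted remainder is an order-free key.
        key = (row[0], tuple(sorted(row[1:])))
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept

filtered = filter_duplicate_rows(A)
for row in filtered:
    print(row)
```

Using a set of sorted keys avoids comparing every row against every other row, which may matter for a 5040-row data set.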

Related

Remove part of a JSON ArrayBuffer

Firstly, Play JSON is not an option unfortunately.
I know how to remove & add elements in an ArrayBuffer, e.g. the below removes elements "b" & "c"...
val x = ArrayBuffer('a', 'b', 'c', 'd', 'e')
x -= ('b', 'c')
Giving
x = ArrayBuffer('a', 'd', 'e')
However, I have a JSON array, something like the one below, where I want to remove "c" and "d" from each dictionary in the list of dictionaries under "z".
ArrayBuffer({"x":1,"y":2,"z":[{"a":0.5,"b":"North","c":[{"c1":1,"c2":195.00,"c3":null},{"c1":2,"c2":229.00,"c3":null}],"d":{"d1":"N","d2":null}},{"a":0.5,"b":"North","c":[{"c1":1,"c2":195.00,"c3":null},{"c1":2,"c2":229.00,"c3":null}],"d":{"d1":"N","d2":null}},{"a":0.5,"b":"North","c":[{"c1":1,"c2":195.00,"c3":null},{"c1":2,"c2":229.00,"c3":null}],"d":{"d1":"N","d2":null}}]
Is this possible?
If so could you point me in the right direction - would be very grateful!
Thank you

Best way to generate all combinations in array that contain certain element in it

I know that I can easily get all the combinations, but is there a way to get only the ones that contain a certain element of the list? I'll give an example.
Let's say I have
arr = ['a','b','c','d']
I want to get all combinations with length (n) containing 'a', for example, if n = 3:
[a, b, c]
[a, b, d]
[a, c, d]
I want to know if there is a better way to get it without generating all combinations. Any help would be appreciated.
I would proceed as follows:
Remove 'a' from the array
Generate all combinations of 2 elements from the reduced array
For each combination, add 'a' back in (for unordered combinations one insertion is enough; insert it in all three possible places only if order matters)
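The steps above might be sketched in Python as follows. The helper name `combinations_containing` is made up for illustration, and the sketch assumes the required element appears in the list and that unordered combinations are wanted, so 'a' is added once per combination rather than at every position.

```python
from itertools import combinations

def combinations_containing(items, required, n):
    """Return all n-element combinations of items that include `required`."""
    # Step 1: remove the required element from the pool.
    rest = [x for x in items if x != required]
    # Step 2: choose the remaining n - 1 elements from the reduced pool.
    # Step 3: add the required element back to each combination.
    return [[required, *combo] for combo in combinations(rest, n - 1)]

result = combinations_containing(['a', 'b', 'c', 'd'], 'a', 3)
print(result)
```

This only ever builds the combinations that are kept, instead of generating all of them and filtering afterwards.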
You can use a combination of itertools and a list comprehension, like this:
import itertools
arr = ['a', 'b', 'c', 'd']
temp = itertools.combinations(arr, 3)
result = [list(i) for i in list(temp) if 'a' in i]
print(result)
output:
[['a', 'b', 'c'], ['a', 'b', 'd'], ['a', 'c', 'd']]

Depth First Search in C

If I have to iterate through a table of hexagonal cells checking for text inside them by conducting a recursive depth first search, arranged as shown: [Typing it out on StackOverflow apparently doesn't keep the formatting.]
Example 1:
Example 2:
What would be the best way to identify them as "cells"? In other words, besides removing the textual diagonal lines and converting them into a 2D array with just numbers in it, what would be the best way to tell the computer in code that a certain number x of y characters represents a "cell"?
Thanks in advance.
The easiest way to represent a hexagonal grid is a plain 2-D array with a special rule about which cells are neighbors. Taking your second example, in matrix form it would be:
char M[4][5] =
{
    { 'b', 'g', 'g', 'b', ' ' },
    { 'g', ' ', 'B', 'B', 'B' },
    { 'g', 'B', ' ', 'b', 'g' },
    { 'B', ' ', 'g', 'g', 'g' }
};
The element in column m of row n is a neighbor of:
elements in columns m and m + 1 in row n - 1
elements in columns m - 1 and m + 1 in row n
elements in columns m - 1 and m in row n + 1
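To make the neighbor rule concrete, here is a rough sketch in Python (kept in Python for brevity; the same logic carries over to C). The function names and the choice to group cells that hold the same character are assumptions made for illustration, since the question does not say exactly what the search should collect.

```python
# The matrix form of the second example, as given above.
M = [
    ['b', 'g', 'g', 'b', ' '],
    ['g', ' ', 'B', 'B', 'B'],
    ['g', 'B', ' ', 'b', 'g'],
    ['B', ' ', 'g', 'g', 'g'],
]

def neighbors(n, m, rows, cols):
    """Apply the hexagonal neighbor rule to row n, column m, dropping cells off the grid."""
    candidates = [
        (n - 1, m), (n - 1, m + 1),   # row above
        (n, m - 1), (n, m + 1),       # same row
        (n + 1, m - 1), (n + 1, m),   # row below
    ]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]

def dfs(grid, n, m, visited=None):
    """Recursively collect the cells connected to (n, m) that hold the same character."""
    if visited is None:
        visited = set()
    visited.add((n, m))
    for r, c in neighbors(n, m, len(grid), len(grid[0])):
        if (r, c) not in visited and grid[r][c] == grid[n][m]:
            dfs(grid, r, c, visited)
    return visited

print(sorted(dfs(M, 1, 2)))
```

Keeping the neighbor rule in its own function means the search itself looks the same as a depth-first search on a square grid.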

How to count characters or unique string in matlab [duplicate]

This question already has answers here:
how to count unique elements of a cell in matlab?
(2 answers)
Closed 8 years ago.
How can I count the number of times each letter occurs in an array?
For example, I have an array:
a
a
a
c
b
c
c
d
a
How can I find out how many times a, b, c, and d occur? I want an output like this:
Alphabet count
a 4
c 3
b 1
d 1
How can I do that? Thanks.
arr = {'a' 'a' 'a' 'c' 'b' 'c' 'c' 'd' 'a'}
%// map letters with numbers and count them
count = hist(cellfun(@(x) x - 96,arr))
%// filter result and convert to cell
countCell = num2cell(count(find(count)).') %'
%// get sorted list of unique letters
letters = unique(arr).' %'
%// output
output = [letters countCell]
The solution in the duplicate answer is very neat, applied to your desired output:
[letters,~,subs] = unique(arr)
countCell = num2cell(accumarray(subs(:),1,[],@sum))
output = [letters.' countCell]
It appears to me that your input array actually looks like this:
arr = ['a'; 'a'; 'a'; 'c'; 'b'; 'c'; 'c'; 'd'; 'a']
so change the last line to:
output = [cellstr(letters) countCell]
output =
'a' [4]
'b' [1]
'c' [3]
'd' [1]
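As a cross-check of the counts above, the same tally can be sketched in Python with the standard library's `collections.Counter`. This is only an illustrative analogue of the unique/accumarray approach, not MATLAB code.

```python
from collections import Counter

arr = ['a', 'a', 'a', 'c', 'b', 'c', 'c', 'd', 'a']

# Counter maps each distinct letter to the number of times it occurs.
counts = Counter(arr)

# most_common() lists the letters from most to least frequent.
for letter, count in counts.most_common():
    print(letter, count)
```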

How can I efficiently find unique cell arrays within a set of cell arrays in MATLAB?

I need to find only unique cell arrays within a set of cell arrays. For example, if this is my input:
I = {{'a' 'b' 'c' 'd' 'e'} ...
{'a' 'b' 'c'} ...
{'d' 'e'} ...
{'a' 'b' 'c' 'd' 'e'} ...
{'a' 'b' 'c' 'd' 'e'} ...
{'a' 'c' 'e'}};
Then I would want my output to look like this:
I_unique = {{'a' 'b' 'c' 'd' 'e'} ...
{'a' 'b' 'c'} ...
{'d' 'e'} ...
{'a' 'c' 'e'}};
Do you have any idea how to do this? The order of elements in the output doesn't matter, but efficiency does since the cell array I could be very large.
If your cells contain only sorted single characters then you can retain just the unique sequences using:
>> I = {{'a' 'b' 'c' 'd' 'e'} {'a' 'b' 'c'} {'d' 'e'} {'a' 'b' 'c' 'd' 'e'} {'a' 'b' 'c' 'd' 'e'} {'a' 'c' 'e'}};
>> I_unique = cellfun(@char, I, 'uniformoutput', 0);
>> I_unique = cellfun(@transpose, I_unique, 'uniformoutput', 0);
>> I_unique = unique(I_unique)
I_unique =
'abc' 'abcde' 'ace' 'de'
You can then split the resulting cells into single characters again:
>> I_unique = cellfun(@transpose, I_unique, 'uniformoutput', 0);
>> I_unique = cellfun(@cellstr, I_unique, 'uniformoutput', 0);
>> I_unique = cellfun(@transpose, I_unique, 'uniformoutput', 0);
>> I_unique{:}
ans =
'a' 'b' 'c'
ans =
'a' 'b' 'c' 'd' 'e'
ans =
'a' 'c' 'e'
ans =
'd' 'e'
EDIT: Updated to use a more efficient algorithm.
If efficiency is paramount due to a large number of sets in I, then your best option is probably to roll your own optimized loops. This problem bears some similarity to a previous question about how to efficiently remove sets that are subsets of or equal to another. The difference here is that you are not concerned with removing subsets, just duplicates, so the code in my answer to the other question can be modified to further reduce the number of comparisons made.
First we can recognize that there's no point in comparing sets that have different numbers of elements, since they can't possibly match in that case. So, the first step is to count the number of strings in each set, then loop over each group of sets that have the same number of strings.
For each of these groups, we will have two nested loops: an outer loop over each set starting at the end of the sets, and an inner loop over every set preceding that one. If/When the first match is found, we can mark that set as "not unique" and break the inner loop to avoid extra comparisons. Starting the outer loop at the end of the sets gives us the added bonus that sets in I_unique will maintain the original order of appearance in I.
And here is the resulting code:
I = {{'a' 'b' 'c' 'd' 'e'} ...  %# The sample cell array of cell arrays of
     {'a' 'b' 'c'} ...          %#   strings from the question
     {'d' 'e'} ...
     {'a' 'b' 'c' 'd' 'e'} ...
     {'a' 'b' 'c' 'd' 'e'} ...
     {'a' 'c' 'e'}};
nSets = numel(I);                    %# The number of sets
nStrings = cellfun('prodofsize',I);  %# The number of strings per set
uniqueIndex = true(1,nSets);         %# A logical index of unique elements
for currentSize = unique(nStrings)   %# Loop over each unique number of strings
  subIndex = find(nStrings == currentSize);  %# Get the subset of I with the
  subSet = I(subIndex);                      %#   given number of strings
  for currentIndex = numel(subSet):-1:2      %# Outer loop
    for compareIndex = 1:currentIndex-1      %# Inner loop
      if isequal(subSet{currentIndex},subSet{compareIndex})  %# Check equality
        uniqueIndex(subIndex(currentIndex)) = false;         %# Mark as "not unique"
        break                                                %# Break the inner loop
      end
    end
  end
end
I_unique = I(uniqueIndex);  %# Get the unique values
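For comparison, the same group-by-size strategy can be sketched in Python. The function name `unique_sets` is hypothetical, and this is an illustrative translation of the loops above under the same assumptions (order within each set matters for equality), not a benchmarked implementation.

```python
from collections import defaultdict

def unique_sets(sets):
    """Drop exact duplicates, comparing only sets with the same number of elements."""
    # Group the positions of the sets by their length.
    groups = defaultdict(list)
    for index, s in enumerate(sets):
        groups[len(s)].append(index)

    is_unique = [True] * len(sets)
    for indices in groups.values():
        # Outer loop runs from the last set in the group back to the second.
        for pos in range(len(indices) - 1, 0, -1):
            current = indices[pos]
            # Inner loop checks every earlier set in the same group.
            for earlier in indices[:pos]:
                if sets[current] == sets[earlier]:
                    is_unique[current] = False  # mark as "not unique"
                    break                       # stop at the first match
    return [s for s, keep in zip(sets, is_unique) if keep]

I = [['a', 'b', 'c', 'd', 'e'], ['a', 'b', 'c'], ['d', 'e'],
     ['a', 'b', 'c', 'd', 'e'], ['a', 'b', 'c', 'd', 'e'], ['a', 'c', 'e']]
I_unique = unique_sets(I)
print(I_unique)
```

As in the MATLAB version, later duplicates are the ones removed, so the surviving sets keep their original order of appearance.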
