Get indices of string occurrences in cell-array - arrays

I have a cell array that contains a long list of strings. Most of the strings are in duplicates. I need the indices of instances of a string within the cell array.
I tried the following:
[bool,ind] = ismember(string,var);
This consistently returns a scalar ind, even though there is clearly more than one index at which the contents of the cell array match string.
How can I get a list of indices pointing to the locations in the cell array that contain string?

As an alternative to Divakar's comment, you could use strcmp. This works even if some cell doesn't contain a string:
>> strcmp('aaa', {'aaa', 'bb', 'aaa', 'c', 25, [1 2 3]})
ans =
1 0 1 0 0 0
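To turn that logical mask into a list of indices, wrapping the call in find should be enough. A minimal sketch, reusing the var and string names from the question with small sample data:

```matlab
% Sample data standing in for the question's cell array and search string
var = {'er','meh','nop','meh','ya','meh'};
string = 'meh';

mask = strcmp(string, var);   % logical mask, true where the cell equals string
ind = find(mask);             % positions of the true entries: 2 4 6
```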

Alternatively, you can ID each string and thus have representative numeric arrays corresponding to the input cell array and string. For IDing, you can use unique and then use find as you would with numeric arrays. Here's how you can achieve that -
var_ext = [var string]
[~,~,idx] = unique(var_ext)
out = find(idx(1:end-1)==idx(end))
Breakdown of the code:
var_ext = [var string]: Concatenate everything (string and var) into a single cell array, with the string ending up at the end (last element) of it.
[~,~,idx] = unique(var_ext): ID everything in that concatenated cell array.
find(idx(1:end-1)==idx(end)): idx(1:end-1) represents the numeric IDs for the cell array elements and idx(end) would be the ID for the string. Compare these IDs and use find to pick up the matching indices to give us the final output.
Sample run -
Inputs:
var = {'er','meh','nop','meh','ya','meh'}
string = 'meh'
Output:
out =
2
4
6

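regexp can also be used. Note that the square brackets below concatenate the pieces into a single char vector, so the result is the starting character position of each match rather than a cell index:
string = ['my' 'bat' 'my' 'ball' 'my' 'score']
expression = 'my'
regexp(string,expression)
ans = 1 6 12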

Related

MATLAB cellfun() to map contains() to cell array

a={'hello','world','friends'};
I want to check whether each word in the cell array contains the letter 'o'. How can I use cellfun() to achieve the following in a compact expression?
b = [ contains(a(1),'o') contains(a(2),'o') contains(a(3),'o')]
You don't need cellfun. According to the documentation, contains works natively on cell arrays of character vectors:
a = {'hello', 'world', 'friends'};
b = contains(a, 'o');
Which returns:
b =
1×3 logical array
1 1 0
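For comparison, a cellfun version is also possible if you want one. This is only a sketch and assumes every cell holds a char vector; it should give the same result as the direct call:

```matlab
a = {'hello', 'world', 'friends'};
% Apply contains to each cell's contents; every call returns a logical scalar,
% so cellfun can collect the results into a 1x3 logical array.
b = cellfun(@(w) contains(w, 'o'), a);
```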

show the values of string variables in a string array

I created an array of strings and I want to get the value of a given string at a given position, but the returned value is the character and not the string, eg:
myArray = ['string1' 'string2' 'string3'];
s = myArray(1); % returns the character at position 1, instead of the string
How can I get the value of these strings based on a given position i ?
Try using a cell array:
myArray = {'string1' 'string2' 'string3'};
s = myArray{1};
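If your MATLAB release supports string arrays (the double-quote syntax needs R2017a or later, as far as I know), that is another option. A short sketch using the same sample values:

```matlab
myArray = ["string1" "string2" "string3"];  % 1x3 string array, one element per string
s = myArray(1);        % the string "string1", not a single character
c = char(myArray(1));  % convert to a char vector if older code needs one
```

With a string array, parentheses return whole strings, so no curly-brace indexing is needed.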
You can do a for loop if this is what you are asking for.
myArray=['b' 'c' 'd']
for i = 1:length(myArray)
s(i)=myArray(i);
end
Not sure what you are asking for exactly.

Finding an element of a structure based on a field value

I have a 1x10 structure array with plenty of fields, and I would like to remove from it the element that has a specific value in one of its fields.
I know the value I'm looking for and the field I should be looking in, and I also know how to delete the element from the struct array once I find it. The question is how (if possible) to identify it elegantly, without a brute-force solution, i.e. a for-loop that goes through the elements of the struct array and compares each one with the value I'm looking for.
Sample code: buyers as 1x10 struct array with fields:
id,n,Budget
and the variable to find in the id values like id_test = 12
You can use the fact that if you have an array of structs, and you use the dot referencing, this creates a comma-separated list. If you enclose this in [] it will attempt to create an array and if you enclose it in {} it will be coerced into a cell array.
a(1).value = 1;
a(2).value = 2;
a(3).value = 3;
% Into an array
[a.value]
% 1 2 3
% Into a cell array
{a.value}
% [1] [2] [3]
So to do your comparison, you can convert the field you care about into either an array or a cell array. The comparison will then yield a logical array which you can use to index into the original structure.
For example
% Some example data
s = struct('id', {1, 2, 3}, 'n', {'a', 'b', 'c'}, 'Budget', {100, 200, 300});
% Remove all entries with id == 2
s = s([s.id] ~= 2);
% Remove entries that have an id of 2 or 3
s = s(~ismember([s.id], [2 3]));
% Find ones with an `n` of 'a' (uses a cell array since it's strings)
s = s(ismember({s.n}, 'a'));
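Applied to the question's own names, the same idea might look like the sketch below. The buyers data here is a made-up three-element stand-in for the real 1x10 array, and the code assumes every id is a numeric scalar:

```matlab
% Hypothetical stand-in for the 1x10 buyers struct array (only 3 elements here)
buyers = struct('id', {7, 12, 30}, 'n', {'a', 'b', 'c'}, 'Budget', {100, 200, 300});
id_test = 12;

idx = find([buyers.id] == id_test);  % position(s) of the matching element
buyers(idx) = [];                    % delete the match from the struct array
```

If you don't need the index itself, the logical mask can be used directly, as in the examples above.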

How to find a substring in an array of strings in matlab?

I have a string 'ADSL'. I want to find this string in an array of strings char('PSTN,ADSL','ADSL,VDSL','FTTH,VDSL')
when i run this command
strmatch('ADSL',char('PSTN,ADSL','ADSL,VDSL','FTTH,VDSL'));
the output is 2
But I expect the output as [1 2]
strmatch only gives a positive result if the search string appears at the beginning of a row.
How can I find the search string if it occurs anywhere in the row?
Given the following input:
array = {'PSTN,ADSL', 'ADSL,VDSL', 'FTTH,VDSL'};
str = 'ADSL';
We find the starting position of each string match using:
>> pos = strfind(array, str)
pos =
[6] [1] []
or
>> pos = regexp(array, str)
pos =
[6] [1] []
We can then find the indices of matching strings using:
>> matches = find(~cellfun(@isempty,pos))
matches =
1 2
For an array of strings, it's better to use a cell array. That way strings can be of different lengths (and regexp can be applied on all cells at once):
cellArray = {'PSTN,ADSL','ADSL,VDSL','FTTH,VDSL'};
str = 'ADSL';
Then:
result = find(~cellfun('isempty', regexp(cellArray, str)));
will give what you want.
If you really have a char array as in your example,
array = char('PSTN,ADSL','ADSL,VDSL','FTTH,VDSL');
you can convert to a cell array (with cellstr) and apply the above:
result = find(~cellfun('isempty', regexp(cellstr(array), str)));
I would use strfind:
a=strfind(cellstr(char('PSTN,ADSL','ADSL,VDSL','FTTH,VDSL')),'ADSL');
In this case a will be a 3-by-1 cell array containing the index at which your string starts in each corresponding row (empty where there is no match).
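On releases that have contains (it is used elsewhere on this page; R2016b or later, as far as I know), the isempty step can probably be skipped. A minimal sketch with the same inputs:

```matlab
array = {'PSTN,ADSL', 'ADSL,VDSL', 'FTTH,VDSL'};
str = 'ADSL';

% contains returns one logical per cell; find converts the mask to indices
matches = find(contains(array, str));
```

For a char array input, converting with cellstr first, as in the answers above, should work the same way.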

Search for a cell array within another cell array and display the index

I have a string array
sub_str = {'SN1','SN2'};
main_str = {'SN3','SN2','SN1','SN4'};
I would expect the output (the indices of the sub_str entries in main_str) to be [3 2].
Is there a one liner to this?
Use second output argument from ismember -
Code
[~,ind] = ismember(sub_str,main_str)
Output
ind =
3 2
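You can also use intersect (note that its outputs follow the sorted order of the common elements unless the 'stable' option is given) -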
[~,~,ind] = intersect(sub_str,main_str)
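One thing to watch for with ismember: entries of sub_str that do not occur in main_str come back as index 0. A small sketch of filtering those out, where 'SN9' is a made-up entry added only to show the no-match case:

```matlab
sub_str = {'SN1','SN9'};                 % 'SN9' is hypothetical and has no match
main_str = {'SN3','SN2','SN1','SN4'};

[found, ind] = ismember(sub_str, main_str);  % found = [1 0], ind = [3 0]
ind = ind(found);                            % keep only real matches: 3
```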

Resources