How to find a substring in an array of strings in MATLAB?

I have a string 'ADSL'. I want to find this string in an array of strings: char('PSTN,ADSL','ADSL,VDSL','FTTH,VDSL').
When I run this command:
strmatch('ADSL',char('PSTN,ADSL','ADSL,VDSL','FTTH,VDSL'));
the output is 2.
But I expect the output to be [1 2].
strmatch only gives a positive result if the search string appears at the beginning of a row.
How can I find the search string if it occurs anywhere in the row?

Given the following input:
array = {'PSTN,ADSL', 'ADSL,VDSL', 'FTTH,VDSL'};
str = 'ADSL';
We find the starting position of each string match using:
>> pos = strfind(array, str)
pos =
[6] [1] []
or
>> pos = regexp(array, str)
pos =
[6] [1] []
We can then find the indices of matching strings using:
>> matches = find(~cellfun(@isempty,pos))
matches =
1 2
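If you are on a newer MATLAB release, the same lookup can be sketched with contains, which tests for a substring anywhere in each element. This is only a sketch and assumes R2016b or later, where contains is available; the input is the one from the answer above.

```matlab
% Same input as in the answer above
array = {'PSTN,ADSL', 'ADSL,VDSL', 'FTTH,VDSL'};
str = 'ADSL';

% contains returns one logical per cell: true where str occurs anywhere
tf = contains(array, str);

% Convert the logical mask to indices
matches = find(tf);   % elements 1 and 2 contain 'ADSL'
```

This skips the intermediate cell array of start positions, which is handy when only the matching indices are needed.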

For an array of strings, it's better to use a cell array. That way the strings can be of different lengths (and regexp can be applied to all cells at once):
cellArray = {'PSTN,ADSL','ADSL,VDSL','FTTH,VDSL'};
str = 'ADSL';
Then:
result = find(~cellfun('isempty', regexp(cellArray, str)));
will give what you want.
If you really have a char array as in your example,
array = char('PSTN,ADSL','ADSL,VDSL','FTTH,VDSL');
you can convert to a cell array (with cellstr) and apply the above:
result = find(~cellfun('isempty', regexp(cellstr(array), str)));

I would use strfind:
a=strfind(cellstr(char('PSTN,ADSL','ADSL,VDSL','FTTH,VDSL')),'ADSL');
In this case, a will be a 3-by-1 cell array containing the index at which your string starts in each corresponding string (empty where there is no match).

Related

MATLAB Search Within a Cell Array of Cells

Setup:
I have a 21 x 3 cell array.
The first 2 columns are USUALLY strings or char arrays, but could be 1xn cells of strings or char arrays (if there are multiple alternate strings that mean the same thing in the context of my script). The 3rd element is a number.
I'm looking to return the index of any EXACT match with a string or char array (the type doesn't have to match) contained in column 1 of this cell array and, if column 1 doesn't match, then in column 2.
I can use the following:
find(strcmp( 'example', celllist(:,1) ))
find(strcmp( 'example', celllist(:,2) ))
And these will match the corresponding indices with any strings / char arrays in the top level cell array. This won't, of course, match any strings that are inside of cells of strings inside the top level cell array.
Is there an elegant way to match those strings (that is, without using a for, while, or similar loop)? I want it to return the index of the main cell array (1 through 21) if the cells contains the match OR the cell within the cell contains an exact match in ANY of its cells.
The cellstr function is your friend, since it converts all of the following to a cell array of chars:
chars e.g. cellstr( 'abc' ) => {'abc'}
cells of chars e.g. cellstr( {'abc','def'} ) => {'abc','def'}
strings e.g. cellstr( "abc" ) => {'abc'}
string arrays e.g. cellstr( ["abc", "def"] ) => {'abc','def'}
Then you don't have to care about variable types, and can just do an ismember check on every element, which we can assume is a cell of chars.
We can set up a test:
testStr = 'example';
arr = { 'abc', 'def', {'example','ghi'}, "jkl", "example" };
% Expected output is [0,0,1,0,1]
Doing this with a loop to better understand the logic would look like this:
isMatch = false(1,numel(arr)); % initialise output
for ii = 1:numel(arr) % loop over main array
x = cellstr(arr{ii}); % convert to cellstr
isMatch(ii) = any( ismember( testStr, x ) ); % check if any sub-element is match
end
If you want to avoid loops* then you can do this one-liner instead using cellfun
isMatch = cellfun( @(x) any( ismember( testStr, cellstr(x) ) ), arr );
% >> isMatch = [0 0 1 0 1]
So for your case, you could run this on both columns of your cell array (celllist in the question) and apply some simple logic to select the one you want:
isMatchCol1 = cellfun( @(x) any( ismember( testStr, cellstr(x) ) ), celllist(:,1) );
isMatchCol2 = cellfun( @(x) any( ismember( testStr, cellstr(x) ) ), celllist(:,2) );
If you want the row index instead of a logical array, you can wrap the output with the find function, i.e. isMatchIdx = find(isMatch);.
*This only avoids loops visually; cellfun is basically a looping device in disguise, but it does at least save us from initialising the output.
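As a rough sketch of the "simple logic" for picking between the two columns: the 4x3 celllist below is a made-up stand-in for the 21 x 3 array in the question, matchFcn is just a name chosen here, and the fallback rule (use column 2 only when column 1 has no match at all) is one possible reading of the requirement. It assumes a MATLAB release with string support (R2016b or later).

```matlab
% Hypothetical 4x3 stand-in for the 21x3 cell array from the question
celllist = { 'abc',              'def',     1;
             {'example','ghi'},  'jkl',     2;
             'mno',              "example", 3;
             'pqr',              'stu',     4 };
testStr = 'example';

% Same check as in the one-liner: normalise each element with cellstr
matchFcn = @(x) any( ismember( testStr, cellstr(x) ) );

isMatchCol1 = cellfun( matchFcn, celllist(:,1) );
isMatchCol2 = cellfun( matchFcn, celllist(:,2) );

% Assumed rule: prefer column 1, fall back to column 2 only if column 1 is empty-handed
if any(isMatchCol1)
    idx = find(isMatchCol1);
else
    idx = find(isMatchCol2);
end
```

With this sample data, column 1 matches in row 2 (inside the nested cell), so the column 2 match in row 3 is ignored. If you would rather collect matches from either column, find(isMatchCol1 | isMatchCol2) is the alternative.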

show the values of string variables in a string array

I created an array of strings and I want to get the value of a given string at a given position, but the returned value is a character and not the string, e.g.:
myArray = ['string1' 'string2' 'string3'];
s = myArray(1); % returns the character at position 1, instead of the string
How can I get the value of these strings based on a given position i?
Try using a cell array:
myArray = {'string1' 'string2' 'string3'};
s = myArray{1};
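To make the difference between the bracket types explicit, here is a small sketch using the same data as above (the variable names joined and c are only illustrative). Square brackets concatenate the char vectors into one long char vector, which is why indexing returned a single character in the question.

```matlab
% Square brackets concatenate: this is one 1x21 char vector
joined = ['string1' 'string2' 'string3'];
ch = joined(1);        % the single character 's'

% Curly braces build a cell array with one string per cell
myArray = {'string1', 'string2', 'string3'};
c = myArray(1);        % parentheses return a 1x1 cell
s = myArray{1};        % braces return the contents: 'string1'
```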
You can use a for loop, if that is what you are asking for:
myArray=['b' 'c' 'd']
for i = 1:length(myArray)
s(i) = myArray(i);
end
I'm not sure exactly what you are asking for.

Failing to use numeric characters within Ruby array as indices for a string

I am trying to use the numeric characters from the array held in the positions argument as indices to access the characters of the string in the string argument, and then print a new string. I have an idea of what I need to do to get it to work, but I am hung up.
Total code thus far:
def scramble_string(string, positions)
str = string
pos = positions.join
newstr = []
i = 0
while i < pos.length do
return newstr.push(str[pos[i]])
i += 1
end
end
scramble_string("hello", [2, 3, 4, 5])
I suspect my problem lies within this part of the code...
return newstr.push(str[pos[i]])
If I understand you, you can use the following to get a given substring of a string, using a range:
'this is a string'[5..8]
=> "is a"
A simple way would be:
str = 'this is a string'
positions = [2,3,6,9,10]
new_str = positions.map {|p| str[p]}.join
=> "iss s"
str = 'this is a string'
positions = [2,3,6,9,1]
str.split('').values_at(*positions).join
#=> "iss h"
Another way, one that does not use join:
positions.each_with_object('') { |i,s| s << str[i] }

Get indices of string occurrences in cell-array

I have a cell array that contains a long list of strings. Most of the strings are in duplicates. I need the indices of instances of a string within the cell array.
I tried the following:
[bool,ind] = ismember(string,var);
This consistently returns a scalar ind, even though there is clearly more than one index at which the contents of the cell array match string.
How can I have a list of indices that points to the locations in the cell array that contains string?
As an alternative to Divakar's comment, you could use strcmp. This works even if some cell doesn't contain a string:
>> strcmp('aaa', {'aaa', 'bb', 'aaa', 'c', 25, [1 2 3]})
ans =
1 0 1 0 0 0
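To turn that logical mask into the list of indices the question asks for, wrap it in find. A minimal sketch, using the sample data from further down this thread; the names list and target are made up here to avoid shadowing built-in functions.

```matlab
% Sample data taken from the example further down this thread;
% the variable names list and target are hypothetical
list = {'er', 'meh', 'nop', 'meh', 'ya', 'meh'};
target = 'meh';

% strcmp gives a logical mask with one entry per cell
mask = strcmp(target, list);

% find converts the mask to the indices of every exact match
ind = find(mask);   % positions 2, 4 and 6
```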
Alternatively, you can ID each string and thus have representative numeric arrays corresponding to the input cell array and string. For IDing, you can use unique and then use find as you would with numeric arrays. Here's how you can achieve that -
var_ext = [var string]
[~,~,idx] = unique(var_ext)
out = find(idx(1:end-1)==idx(end))
Breakdown of the code:
var_ext = [var string]: Concatenate everything (string and var) into a single cell array, with the string ending up at the end (last element) of it.
[~,~,idx] = unique(var_ext): ID everything in that concatenated cell array.
find(idx(1:end-1)==idx(end)): idx(1:end-1) represents the numeric IDs for the cell array elements and idx(end) would be the ID for the string. Compare these IDs and use find to pick up the matching indices to give us the final output.
Sample run -
Inputs:
var = {'er','meh','nop','meh','ya','meh'}
string = 'meh'
Output:
out =
2
4
6
regexp would solve this problem more simply if the strings are concatenated into a single char array:
string = ['my' 'bat' 'my' 'ball' 'my' 'score']
expression = ['my']
regexp(string,expression)
ans = 1 6 12

Search for a cell array within another cell array and display the index

I have two cell arrays of strings:
sub_str = {'SN1','SN2'};
main_str = {'SN3','SN2','SN1','SN4'};
I would expect the output (the indices of sub_str within main_str) to be [3 2].
Is there a one-liner for this?
Use second output argument from ismember -
Code
[~,ind] = ismember(sub_str,main_str)
Output
ind =
3 2
You can also use intersect -
[~,~,ind] = intersect(sub_str,main_str)
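One difference worth keeping in mind: ismember reports indices in the order of sub_str, while intersect by default sorts the common values first. The sketch below uses a deliberately reordered sub_str (a hypothetical variation of the question's data) to show this, along with the 'stable' option, which I believe needs R2012a or later.

```matlab
% Query deliberately given in non-sorted order (hypothetical variation)
sub_str  = {'SN2', 'SN1'};
main_str = {'SN3', 'SN2', 'SN1', 'SN4'};

% ismember keeps the order of sub_str
[~, indMember] = ismember(sub_str, main_str);                 % 2 then 3

% intersect sorts the common values first ('SN1', 'SN2')
[~, ~, indSorted] = intersect(sub_str, main_str);             % 3 then 2

% 'stable' keeps the order in which the values appear in sub_str
[~, ~, indStable] = intersect(sub_str, main_str, 'stable');   % 2 then 3
```

Note also that intersect drops entries of sub_str that are missing from main_str, whereas ismember returns 0 for them, so the two are only interchangeable when every entry is known to be present.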
