How to split sentences in an array

I have a string s which stores a very long text, and I want to copy the content of s to an array C with each cell storing one sentence. The following is my code, which gives me no output other than the dimensions of the cell:
while(i<6)
C(i)=s;
end
This is what I get as output when I print C:
C=
[1x76 char]
Can somebody please help me?

Another job for strsplit:
>> sentences = 'This is the first one. Then here is a second. Yet another here.';
>> C = strsplit(sentences,'. ')
C =
'This is the first one' 'Then here is a second' 'Yet another here.'
We are specifying a period followed by a space as the delimiter. Change this as needed.
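If the trailing period on the last sentence is unwanted, or the spacing after periods varies, one possible variation is to split with a regular expression instead. This is only a sketch: it assumes every sentence ends with a period, and it drops the empty piece that the final period leaves behind.

```matlab
sentences = 'This is the first one. Then here is a second. Yet another here.';

% Split on a period followed by any amount of whitespace (including none).
C = regexp(sentences, '\.\s*', 'split');

% The final period produces an empty trailing piece; remove empty cells.
C = C(~cellfun(@isempty, C));
```

Here C ends up as a 1x3 cell array with no periods left in any sentence.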

Suppose the long string is:
longString = "This is first cell. This is second cell. this is third cell"
Since . is the delimiter here, it acts as the separator between sentences. So you can loop through longString character by character: append each character to the current array element, and whenever you encounter a . move on to the next array index and keep appending there until you find another .
Here is the pseudocode:
array[] = empty strings;
index = 0;
loop through(longString) character-wise
{
    if(currentChar equals '.')
    {
        index++;
    }
    else
    {
        array[index] += currentChar; // append, do not overwrite
    }
}
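The pseudocode above could be written in MATLAB roughly as follows. This is a minimal sketch, not the only way to do it: the variable name parts is my own, indexing starts at 1 because this is MATLAB, and strtrim is added to drop the space that follows each period.

```matlab
longString = 'This is first cell. This is second cell. this is third cell';

parts = {''};   % cell array of sentences, starting with one empty sentence
index = 1;      % MATLAB indices start at 1

for k = 1:numel(longString)
    currentChar = longString(k);
    if currentChar == '.'
        % A period closes the current sentence; start a new empty one.
        index = index + 1;
        parts{index} = '';
    else
        % Append the character to the sentence being built.
        parts{index} = [parts{index}, currentChar];
    end
end

% Remove the leading space left over after each period.
parts = strtrim(parts);
```

Growing a char array one character at a time is slow for long inputs, so strsplit is usually the better choice in practice.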

Related

storing the longest string after strsplit

I am trying to store the longest resulting string after using the function strsplit, but I am unable to do so.
E.g., I have input strings such as
'R.DQDEGNFRRFPTNAVSMSADENSPFDLSNEDGAVYQRD.L' or
'L.TSNKDEEQRELLKAISNLLD'
I need to store only the part of the string between the dots (.).
If there is no dot, then I want the entire string.
Each string may have zero, one or two dots.
part of the code which I am using:
for i=1:700
x=regexprep(txt(i,1), '\([^\(\)]*\)','');
y=(strsplit(char(x),'.'));
for j=1:3
yValues(1,j)=y{1,j};
end
end
But the string yValues is not storing the value of y, instead showing the following error:
Assignment has more non-singleton rhs dimensions than non-singleton subscripts
What am I doing wrong and are there any suggestions on how to fix it?
The issue is that y is a cell array and each element contains an entire string, so it can't be assigned to a single element of a normal array, yValues(1,j).
You need yValues to be a cell array, and then you can assign into it just fine.
yValues{j} = y{j};
Or more simply
% Outside of your loop
yValues = cell(1,3);
% Then inside of your loop
yValues(j) = y(j);
Alternately, if you just want the longest output of strsplit, you can just do something like this.
% Split the string
parts = strsplit(mystring, '.');
% Find the length of each piece and figure out which piece was the longest
[~, ind] = max(cellfun(@numel, parts));
% Grab just the longest part
longest = parts{ind};
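As an illustration, the same idea can be applied to several inputs in a loop. The sketch below uses the two example strings from the question plus a made-up string without dots ('NODOTS'); the variable names inputs and longest are assumptions for this example only.

```matlab
% Two strings from the question and one hypothetical string with no dots.
inputs = {'R.DQDEGNFRRFPTNAVSMSADENSPFDLSNEDGAVYQRD.L', ...
          'L.TSNKDEEQRELLKAISNLLD', ...
          'NODOTS'};

longest = cell(size(inputs));   % one result per input string

for k = 1:numel(inputs)
    % strsplit treats '.' as a literal delimiter, not a regex wildcard.
    parts = strsplit(inputs{k}, '.');
    % Pick the longest piece; with no dots, parts holds the whole string.
    [~, ind] = max(cellfun(@numel, parts));
    longest{k} = parts{ind};
end
```

Note that this keeps the longest piece, which matches "the part between the dots" only when that part is longer than the flanking pieces, as it is in the examples given.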

Error when copying a word to array character by character

I'm trying to copy words of unknown length into an array, but I keep getting an error. The text comes from a website converted to text. sites holds the position of the first character of each word (I want to copy 4 words), and result is the whole text.
I keep getting this error:
Subscript indices must either be real positive integers or logicals.
for this line: webget = result(sites(i)+n);
for i = 0:3; %for finding first 4
webget = 'p'; %placeholder
website = []; %blank
while strcmp(webget,' ') == 0;
for n = 0:150; %letter by letter, arbitrary search length
webget = result(sites(i)+n);
website = strcat(website,webget);
end
end
website(i) = website;
end
Could anyone help?
MATLAB array indices start at 1, not 0. On your first loop iteration i=0, so your request for the 0th entry of the sites array is not valid.
Consider using i = 1:4.
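Beyond the indexing fix, one way to pull out each word is to search for the first space after its start position instead of copying letter by letter. The following is only a sketch under assumptions: the contents of result and sites are invented for illustration, and websites is a hypothetical cell array used to hold words of different lengths.

```matlab
result = 'visit example.com or test.org today';   % hypothetical page text
sites  = [7 22];                                   % hypothetical start positions

websites = cell(1, numel(sites));   % cell array: words may differ in length

for i = 1:numel(sites)              % 1-based loop over the start positions
    rest = result(sites(i):end);    % text from the word start onward
    stop = find(rest == ' ', 1);    % index of the first space, if any
    if isempty(stop)
        websites{i} = rest;         % no space found: word runs to the end
    else
        websites{i} = rest(1:stop-1);
    end
end
```

Storing the words in a cell array also avoids the dimension mismatch that assigning a whole string into website(i) would cause.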

String handling in matlab

I am working on this code which is to copy the sentences stored in one array to another.
'text1' is the array which stores all my sentences and C1 is the array into which the sentences have to be copied.
text1 is a 1x8 array with text1(1,1) containing the first sentence, text1(1,2) with the second sentence and so on.
The following is the code that I have written to copy the contents from text1 to C1:
for i=1:vr
if(Track(i)<0)
text1{1,i};
C1(1,j)=text1(1,i)
j=j+1;
end
end
Can somebody help me? Thanks in advance.
Since you haven't included examples of text1, Track or vr, I can't test anything. But you may be assigning to C1 incorrectly if it's a cell array. Try C1{1,j} = text1{1,i} instead.
But if you want to copy everything in text1 into a new cell array with exactly the same contents, C1 = text1; will do that.
If Track is an array, you should be able to do it as follows (using logical indexing):
C1 = text1(Track < 0);
Or something similar to that, depending on exact structure of your data.
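To make the logical-indexing suggestion concrete, here is a small sketch with made-up data. The contents of text1 and Track are assumptions for illustration, since the question does not show them.

```matlab
% Hypothetical data: four sentences and one Track value per sentence.
text1 = {'first sentence', 'second sentence', 'third sentence', 'fourth sentence'};
Track = [-1 2 -3 4];

% Track < 0 is a logical mask; indexing with () keeps the result a cell array.
C1 = text1(Track < 0);
```

This selects the first and third sentences without any loop or counter variable.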
Have you initialized the cell C1 and j?
j = 1;
C1 = {};
for i=1:length(text1)
if( Track(i)<0 )
text1{1,i};
C1(1,j)=text1(1,i)
j=j+1;
end
end

matlab divide sentences into words

I'm new to MATLAB. I'm trying to take a sentence that the user enters in a MATLAB GUI and divide it into words, but I need the words as individual letters because I'm using a robot to write them; the letters will be sent to the robots. I'm using two robots. For example, when I write 'lou reed' in the text box and press the button, the MATLAB function should hold these two words in two different char arrays, so that I can access the letters as c(i) and send them for processing. So far I have written this, but I'm stuck:
c = char(get(handles.edit1,'String'));
int count1;
int count2;
char word1;
char space=" ";
for i=1:length(c)
int t = isequal(c(i),space);
if(t==0)
count1=count1+1;
word1=;%I'm trying to add the char here to find the new word
else
end
end
I don't know what to do. I searched but couldn't find anything useful; maybe I wasn't looking in the right place.
Anything would be helpful, thanks.
What characters are allowed? First you should remove all the characters that are not allowed (substitute them with a space character?). After that, just do this:
str = ' Once upon a time ';
words_in_str = textscan(str,'%s');
words_in_str{1}
If you have a newer version of MATLAB (R2013a or later), you can use strsplit:
characterString = 'lou reed';
C = strsplit(characterString);
C will be a cell array with each element being a separate word.
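Since each cell of C is an ordinary char array, its letters can be indexed one at a time, which is what the robots need. The sketch below assumes one word per robot and simply prints each letter; the fprintf call is a stand-in for whatever function actually sends a letter to a robot, which the question does not show.

```matlab
characterString = 'lou reed';
C = strsplit(characterString);     % {'lou', 'reed'}

for w = 1:numel(C)
    letters = C{w};                % char array holding the w-th word
    for k = 1:numel(letters)
        % Placeholder for the real "send to robot" call (not shown in the question).
        fprintf('robot %d writes %s\n', w, letters(k));
    end
end
```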
You can simply find the space characters in your string with
mystring = 'Hello Cruel World';
spaces = find(mystring==' ');
The variable spaces is now a vector holding the position of each word break. If you want to break the string up into words, you could use
mystring = 'Hello Cruel World';
wordboundaries = [0,find(mystring==' ')];
wordlen = diff([wordboundaries,length(mystring)+1])-1;
numwords = length(wordboundaries);
for w = 1:numwords
idx = wordboundaries(w) + (1:wordlen(w));
word{w} = mystring(idx);
end
display(word);
Now word is a cell array containing the individual words.

Comparison of two array elements and calculation

I have an issue with a section of code I wish to write. My problem is based around two arrays and the elements they encompass.
I have two arrays filled with numbers (relating to positions in a string). I wish to select the substrings between the positions. The elements in the first array are the start of the substrings and the elements in the second array are the ends of the substrings.
The code I have supplied reads in the file and makes it a string:
>demo_data
theoemijono
milotedjonoted
dademimamted
String:
theoemijonomilotedjonoteddademimamted
so what I want to happen is to extract the substring
emijonomiloted
emimamted
The code I have written takes an element of the first array and compares it with the corresponding element of the second array, to ensure that there is no crossover and hence that the substring starts with emi and ends with ted, as seen in the provided sequences:
for($i=0; $i<=10; $i++)
{
if ($rs1_array[$i] < $rs2_array[$i] && $rs1_array[$i+1] > $rs2_array[$i])
{
my$size= $rs2_array[$i]-$rs1_array[$i]+ 3);
my$substr= substr($seq, $rs1_array[$i],$size);
print $substr."\n";
}
}
Using this code works for the first substring, but the second substring is ignored as the first array has fewer elements and hence the comparison cannot be completed.
UPDATE
Array structures:
@rs1_array = (4, 28);
@rs2_array = (15, 22, 34);
Hi borodin, you were absolutely correct. I have edited the code now. Thank you for spotting the length issue. The reason for the strange offset is that each value in @rs2_array is the start position of "ted" and does not account for the rest of the word, which I need to complete the substring. The arrays are built correctly: the elements of @rs1_array are the start positions of each "emi", and the elements of @rs2_array are the start positions of each "ted". Since there are 2 emi's and 3 ted's in the string, the arrays are unbalanced.
use feature 'say';
my @starts = ( 4, 28 );
my @ends = map $_+3, ( 15, 22, 34 );
my $starts_idx = my $ends_idx = 0;
while ($starts_idx < @starts && $ends_idx < @ends) {
    # Skip any end position that falls before the current start.
    if ($starts[$starts_idx] > $ends[$ends_idx]) {
        ++$ends_idx;
        next;
    }
    my $length = $ends[$ends_idx] - $starts[$starts_idx];
    say substr($seq, $starts[$starts_idx], $length);
    ++$ends_idx;
    ++$starts_idx;
}
Which, of course, gives the same output as:
say for $seq =~ /(emi(?:(?!emi|ted).)*ted)/sxg;
