I am trying to find the number of occurrences of "the" in a file that I have read into MATLAB. I have the following code n=strfind(z,'the') where z is the cell that all my lines are stored into. It finds all the occurrences but I am unsure how to sum them up to get a number. I tried using sum but it doesn't work. Any help would be greatly appreciated.
strfind will return [] if the supplied string is not found.
cell2mat concatenates the contents of the cells into a single vector, so the empty cells drop out and only the indices of the found string remain.
Therefore, you just need the length of the returned vector:
z = {'Testing','Another','the', 'And the'};
n=length(cell2mat(strfind(z,'the')))
n =
3
Consider using cellfun to operate on the output of strfind so that you can use sum as you would like to do:
sum(cellfun(@numel,strfind(z,'the')))
I have a for loop, and on each iteration a{i}, b{i} and c{i} each hold a specific number. How can I collect all of those values into one array inside the loop? What I tried, [a{i};b{i};c{i}], does not seem to work. If I keep only two of the three values it works, but I want the data from all of them (a, b and c).
You can see the (pseudo)code below:
for i=1:number of cells
Cell{i}.Tri=[a{i};b{i};c{i}]
end
cell2mat is what you need:
a = num2cell(rand(1,10));
b = num2cell(rand(1,10));
c = num2cell(rand(1,10));
abc = cell2mat([a;b;c]);
This can be done without a for loop by using cellfun combined with the cat function. EDIT: As noted in the comments, cellfun is itself a loop.
% Create all variables
a{1}=rand(10);
a=repmat(a,10,1);
b=a;
c=a;
% Add a cell array of equal size to a. The contents of each cell are the dimension along which to concatenate.
catarg=num2cell(ones(size(a)))
% Do the concatenation
d=cellfun(@cat,catarg,a,b,c,'UniformOutput',false);
I have a query which I am trying to solve
I know that one can use strcmp(s1,s2) to compare two different strings to see whether they are the same. It gives 1 if that is the case.
However, how would one tackle this problem if you have a variable-length array full of strings and you want to check whether all the strings in the array are the same?
For example: ['NACA64A010' 'NACA64A010' 'NACA64A010' 'NACA64A010']. We can see that all the strings in this array are the same. How would one go about checking this with strcmp(s1,s2)?
Thanks guys!
If you want all pairwise comparisons between strings: call ndgrid to generate indices of all combinations, and then index into your cell array of strings and call strcmp:
x = {'NACA64A010' 'NACA64A010' 'NACA64A010' 'NACA64A010'};
[ii, jj] = ndgrid(1:numel(x));
result = strcmp(x(ii), x(jj));
In this case
result =
1 1 1 1
1 1 1 1
1 1 1 1
1 1 1 1
because all strings are the same.
You probably had a pairwise comparison using strcmp in mind, but you can use it directly on cell arrays:
x={'NACA64A010' 'NACA64A010' 'NACA64A010' 'NACA64A010'}
result=all(strcmpi(x{1},x(2:end)))
This compares the first element to the remaining elements and returns true only if all elements are equal (strcmpi ignores case; use strcmp for a case-sensitive check). For a pairwise comparison you could use:
[~,~,c]=unique(x);
result=bsxfun(@eq,c,c.')
If you're solving the problem with a matrix (i.e. every row is a string), there are no particularly nice solutions in my opinion, but if your strings are contained in a cell array, things get easier and nicer.
So we start by creating such a cell array:
myStrings={'NACA64A010' 'NACA64A010' 'NACA64A010' 'NACA64A010'};
where each cell contains a string. This will make your code more robust as well since every string can have a different length (this is not true if you concatenate all your strings in a matrix).
Then you specify which string you want to find inside that cell array:
stringThatMustBeTested='NACA64A010';
Now you can use cellfun(), which is a function that applies another function to every cell of a given cell array as follows:
results=cellfun(@(x) strcmp(x,stringThatMustBeTested),myStrings);
This line simply means "apply strcmp() to every cell x inside myStrings and compare its contents with stringThatMustBeTested".
The variable results will be a logical array in which element j is true if the j-th cell in your cell array is equal to the string you want to test. If results consists entirely of 1s (which you can check with all(results), or equivalently sum(results)==length(results)), then all the strings in myStrings are the same (given that stringThatMustBeTested is itself one of the strings in your cell array; in any case, this solution can be extended to a broader string search inside a cell array).
This is my first time posting, so I hope you can help me. I am trying to write a function in MATLAB.
I have loaded data from a file into a cell array. The first column contains statements and the second contains T for true or F for false. I now want to split this array into a cell array with the statements and a logical vector with 1 for true and -1 for false.
I use fgetl within a loop to read all the lines into the cell array.
Try to write it a bit more neatly next time, and consider including a small example.
Here is what you seem to be looking for:
Suppose you have a cell array M and want to split it into M_true and M_false:
M = {1,'T';
22,'F';
333,'T'}
idx_T=strcmp(M(:,2),'T')
M_true = M(idx_T,1)
M_false = M(~idx_T,1)
I have a cell array, c, filled with hexadecimal data and when I view the cell contents by typing c at the matlab prompt, it shows me contents enclosed between ticks, i.e., '0x0009'. But, one element is enclosed in brackets and looks like [650345]. How can I convert the [ ] data to ' ' data? When I do iscellstr on this particular element, matlab returns 0. iscellstr returns 1 for all other elements of c.
I'm reading this data into matlab from excel and I fear that excel 'helped' me by converting one hex value to scientific notation. I can't, as far as I've found, change what excel did. I think the true value is lost and unrecoverable. But I need to convert this one outstanding value, even if incorrect, to be like the other cell values so that I can carry on with my processing. Any suggestions?
If you know the index of the wrong value and its true value, you can simply do:
c(idx) = {'0x0009'};
I think this does what you want:
ind = cellfun(@isnumeric, c); %// find numeric cells
%// convert to hex string and prepend '0x'
c(ind) = cellfun(@(s) ['0x' dec2hex(s)], c(ind), 'UniformOutput', false);
Example: input
c = {'0x0009', 650345};
produces the output
c =
'0x0009' '0x9EC69'
I want to check if any string in an array of strings is a prefix of any other string in the same array. I'm thinking radix sort, then single pass through the array.
Anyone have a better idea?
I think radix sort can be modified to retrieve prefixes on the fly. All we have to do is sort the lines by their first letter, storing in each cell a copy of the line with its first letter removed. Then, if a cell contains an empty line, that line corresponds to a prefix. And if a cell contains only one entry, then of course there can be no prefix lines in it.
Here, this might be clearer than my English:
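lines = [
    "qwerty",
    "qwe",
    "asddsa",
    "zxcvb",
    "zxcvbn",
    "zxcvbnm",
]
# Pair what is left of each line with the original line it came from
line_lines = [(line, line) for line in lines]
def find_sub(line_lines):
    # One bucket per lowercase letter
    cells = [[] for i in range(26)]
    for (ine, line) in line_lines:
        if ine == "":
            # Nothing left of this line: it is a prefix of the others in this bucket
            print(line)
        else:
            index = ord(ine[0]) - ord('a')
            cells[index] += [(ine[1:], line)]
    # A bucket with a single entry cannot contain a prefix pair
    for cell in cells:
        if len(cell) > 1:
            find_sub(cell)
find_sub(line_lines)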
If you sort them, you only need to check each string if it is a prefix of the next.
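A minimal Python sketch of that idea, assuming plain lexicographic sorting and treating exact duplicates as prefixes of each other. The function name has_prefix_pair is made up for illustration:

```python
def has_prefix_pair(strings):
    """Return True if some string is a prefix of another string in the list."""
    ordered = sorted(strings)
    # After sorting, a prefix is always immediately followed by a string
    # that extends it, so checking adjacent pairs is enough.
    return any(b.startswith(a) for a, b in zip(ordered, ordered[1:]))


lines = ["qwerty", "qwe", "asddsa", "zxcvb", "zxcvbn", "zxcvbnm"]
print(has_prefix_pair(lines))           # True ("qwe" is a prefix of "qwerty")
print(has_prefix_pair(["abc", "abd"]))  # False
```

This only answers whether any prefix pair exists. Listing every pair would need a longer scan forward from each string.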
To achieve a time complexity close to O(N^2), compute a hash value for each string.
Come up with a good hash function that looks something like:
A mapping from [a-z] -> [1,26]
A modulo operation (use a large prime) to prevent integer overflow
So something like "ab" maps to the digits 1 and 2 and is computed as 1*27 + 2 = 29
A point to note:
Be careful what base you compute the hash value on. For example, if you take a base less than 27 you can have two strings giving the same hash value, and we don't want that.
Steps:
Compute hash value for each string
Compare the hash value of the current string with those of the other strings: I'll let you figure out how you would do that comparison. Once two hash values match, you still cannot be sure it is really a prefix (because of the modulo operation), so do an extra check to confirm that one string is a prefix of the other.
Report answer
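The steps above can be sketched in Python as follows. This is only one possible way to do the comparison the answer leaves open: it stores the hash of every whole string in a dictionary and then looks up the rolling hash of each proper prefix. The names prefix_hashes and find_prefixes, the modulus value, and the restriction to non-empty lowercase a-z strings are all assumptions made for illustration.

```python
BASE = 27            # base from the answer: letters map to 1..26
MOD = 1_000_000_007  # assumed large prime; any large prime would do


def prefix_hashes(s):
    """Yield the hash of s[:1], s[:2], ..., and finally of s itself."""
    h = 0
    for ch in s:
        h = (h * BASE + (ord(ch) - ord('a') + 1)) % MOD
        yield h


def find_prefixes(strings):
    """Return sorted (prefix, longer_string) pairs found through hash lookups."""
    # Step 1: hash of every whole string -> the strings that have that hash
    by_hash = {}
    for s in strings:
        h = 0
        for h in prefix_hashes(s):
            pass  # keep only the last value, the hash of the full string
        by_hash.setdefault(h, set()).add(s)

    # Step 2: look up the hash of every proper prefix of every string
    pairs = set()
    for s in strings:
        for length, h in enumerate(prefix_hashes(s), start=1):
            if length == len(s):
                break  # the full string is not a proper prefix of itself
            for cand in by_hash.get(h, ()):
                # Extra check: the modulo means equal hashes can be a collision
                if len(cand) == length and s.startswith(cand):
                    pairs.add((cand, s))
    return sorted(pairs)


lines = ["qwerty", "qwe", "asddsa", "zxcvb", "zxcvbn", "zxcvbnm"]
for prefix, longer in find_prefixes(lines):
    print(prefix, "is a prefix of", longer)
```

With the dictionary lookup, the work is roughly proportional to the total number of characters rather than to the number of string pairs, at the cost of extra memory for the table.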