matlab: logically comparing two cell arrays - arrays

I have an excel file from which I obtained two string arrays, Titles of dimension 6264x1 and another Names of dimension 45696x1. I want to create an output matrix of size 6264x45696 containing in the elements a 1 or 0, a 1 if Titles contains Names.
I think I want something along the lines of:
for (j in Names)
for (k in Titles)
if (Names[j] is in Titles[k])
write to excel
end
end
end
But I don't know what functions I should use to achieve what I have in the picture. Here is what I have come up with:
[~,Title] = xlsread('exp1.xlsx',1,'A3:A6266','basic');
[~,Name] = xlsread('exp1.xlsx',2,'B3:B45698','basic');
A = cellstr(Title);
GN = cellstr(Name);
BinaryMatrix = false(45696,6264);
for i=1:1:45696
for j=1:1:6264
if (~isempty(ismember(A,GN)))
BinaryMatrix(i,j)= true;
end
end
end
the problem with this code is that it never finishes running, although there are no suggestions within matlab.

You can use third output of unique to get numbers corresponding to each string element and use bsxfun to compare numbers.
GN = cellstr(Name);
A = cellstr(Title);
B = [ GN(:); A(:)];
[~,~,u]= unique(B);
BinaryaMatrix = bsxfun(#eq, u(1:numel(GN)),u(numel(GN)+1:end).');

ismember can handle cell arrays of character vectors. Its second output tells you the information you need, from which you can build the result using sparse (it could also be done by preallocating and using [sub2ind):
[~, m] = ismember(Titles, Names);
BinaryMatrix = full(sparse(nonzeros(m), find(m), true, numel(Names), numel(Titles)));

Related

Array intersection issue (Matlab)

I am trying to carry out the intersection of two arrays in Matlab but I cannot find the way.
The arrays that I want to intersect are:
and
I have tried:[dur, itimes, inewtimes ] = intersect(array2,char(array1));
but no luck.
However, if I try to intersect array1 with array3 (see array3 below), [dur, itimes, inewtimes ] = intersect(array3,char(array1));the intersection is performed without any error.
Why I cannot intersect array1 with array2?, how could I do it?. Thank you.
Just for ease of reading, your formats for Arrays are different, and you want to make them the same. There are many options for you, like #Visser suggested, you could convert the date/time into a long int which allows faster computation, or you can keep them as strings, or even convert them into characters (like what you have done with char(Array2)).
This is my example:
A = {'00:00:00';'00:01:01'} %//Type is Cell String
Z = ['00:00:00';'00:01:01'] %//Type is Cell Char
Q = {{'00:00:00'};{'00:01:01'}} %//Type is a Cell of Cells
A = cellstr(A) %//Convert CellStr to CellStr is essentially doing nothing
Z = cellstr(Z) %//Convert CellChar to CellStr
Q = vertcat(Q{:,:}) %// Convert Cell of Cells to Cell of Strings
I = intersect (A,Z)
>>'00:00:00'
'00:01:01'
II = intersect (A,Q)
>>'00:00:00'
'00:01:01'
This keeps your dates in the format of Strings in case you want to export them back into a txt/csv file.
Your first array would look something like this:
array1 = linspace(0,1,86400); % creates 86400 seconds in 1 day
Your second array should be converted using datenum, then use cell2mat to make it a matrix. Lastly, use ismember to find the intersection:
InterSect = ismember(array2,array1);

Multiply elements in second column according to labels in the first

I'm working in Matlab.
I have a two-dimensional matrix with two columns. Lets consider elements in the first column as labels. Labels may be repeated.
How to multiply all elements in the second column for every label?
Example:
matrix = [1,3,3,1,5; 2,3,7,8,3]'
I need to get:
a = [1,3,5; 16,21,3]'
Can you help me with doing it without for-while cycles?
I would use accumarray. The preprocessing with unique assigns integer indices 1:n to the values in the first row, which allow accumarray to work without creating unnecessary bins for 2 and 4. It also enables the support for negative numbers and floats.
[ulable,~,uindex]=unique(matrix(:,1))
r=accumarray(uindex,matrix(:,2),[],#prod)
r=[ulable,r]
/You can also use splitapply:
[ulable,~,uindex]=unique(matrix(:,1))
r=splitapply(#prod,matrix(:,2),uindex)
r=[ulable,r]
You can do it without loops using accumarray and the prod function:
clear
clc
matrix = [1,3,3,1,5; 2,3,7,8,3]';
A = unique(matrix,'rows');
group = A(:,1);
data = A(:,2);
indices = [group ones(size(group))];
prods = accumarray(indices, data,[],#prod); %// As mentionned by #Daniel. My previous answer had a function handle but there is no need for that here since prod is already defined in Matlab.
a = nonzeros(prods)
Out = [unique(group) a]
Out =
1 16
3 21
5 3
Check Lauren blog's post here, accumarray is quite interesting and powerful!
Try something like this, I'm sure it can be improved...
unValues = unique(matrix(:,1));
bb = ones(size(unValues));
for ii = 1:length(unValues)
bb(ii) = bb(ii)*prod(matrix(matrix(:, 1) == unValues(ii), 2));
end
a = [unValues bb];

How to name variables in a data array using a for loop

I have an array within an array and I am trying to name the variables using a for loop as there are a lot of variables. When I use the following simple code Time1 = dataCOMB{1,1}{1,1}(1:1024, 1); it opens the first cell in an array and proceeds to open the first cell in the following array and finally defines all the values in column 1 rows 1 to 1024 as Time1. However I have 38 of these different sets of data and when I apply the following code:
for t = 1:38
for aa = 1:38
Time(t) = dataCOMB{1,1}{1,aa}(1:1024, 1);
end
end
I get an error
In an assignment A(I) = B, the number of elements in B and I must be the same.
Error in Load_Files_working (line 39)
Time(t) = dataCOMB{1,1}{1,aa}(1:1024, 1);
Basically I am trying to get matlab to call the first column in each data set Time1, Time2, etc.
The problem:
1)You'd want to extract in a cell row...
2) ...the first 1024 numbers in the 1st column...
3) ...from each of the first 38 cells of a cell array.
The plan:
1) If one wants to get info from each element of a cell array (that is, an array accessed via {} indexing), one may use cellfun. Calling cellfun(some_function, a_cell_array) will aggregate the results of some_function(a_cell_array{k}) for all possible k subscripts. If the results are heterogeneous (i.e. not having the same type and size), one may use the cell_fun(..., 'UniformOutput', false) option to put them in an output cell array (cell arrays are good at grouping together heterogeneous data).
2) To extract the first 1024 numbers from the first column of an numeric array x one may use this anonymous function: #(x) x(1:1024,1). The x argument will com from each element of a cell array, and our anonymous function will play the role of some_function in the step above.
3) Now we need to specify a_cell_array, i.e. the cell array that contains the first 38 cells of the target. That would be, simply dataCOMB{1,1}(1,1:38).
The solution:
This one-liner implements the plan:
Time = cellfun(#(x) x(1:1024,1), dataCOMB{1,1}(1,1:38), 'UniformOutput', false);
Then you can access your data as in this example:
this_time = Time{3};
Your error is with Time(t). That's not how you create a new variable in matlab. To do exactly what you want (ie, create variables names Time1, Time2, etc...you'll need to use the eval function:
for aa = 1:38
eval(['Time' num2str(aa) '= dataCOMB{1,1}{1,aa}(1:1024,1);']);
end
Many people do not like recommending the eval function. Others wouldn't recommend moving all of your data out of a data structure and into their own independently-named variables. So, to address these two criticisms, a better alternative might be to pull your data out of your complicated data structure and to put it into a simpler array:
Time_Array = zeros(1024,38);
for aa = 1:38
Time_Array(:,aa) = dataCOMB{1,1}{1,aa}(1:1024,1);
end
Or, if you don't like that because you really like the names Time1, Time2, etc, you could create them as fields to a data structure:
Time_Data = [];
for aa = 1:38
fieldname = ['Time' num2str(aa)];
Time_Data.(fieldname) = dataCOMB{1,1}{1,aa}(1:1024,1);
end
And, in response to a comment below by the original post, this method can be extended to further unpack the data:
Time_Data = [];
count = 0;
for z = 1:2;
for aa = 1:38
count = count+1;
fieldname = ['Time' num2str(count)];
Time_Data.(fieldname) = dataCOMB{1,z}{1,aa}(1:1024,1);
end
end

Concatenating 1D matrices of different sizes

I perhaps am going about this wrong, but I have data{1}, data{2}...data{i}. Within each, I have .type1, .type2.... .typeN. The arrays are different lengths, so horizontal concatenation does not work.
For simplicity sake
>> data{1}.type1
ans =
1
2
3
>> data{2}.type1
ans =
2
4
5
6
Results should be [1;2;3;2;4;5;6]
I've been trying to loop it but not sure how? I will have a variable number of files (a,b..). How do I go about looping and concatenating? Ultimately I need a 1xN array of all of this..
My working code, thanks..figured it out..
for i = 1:Types
currentType = nTypes{i}
allData.(currentType)=[];
for j = 1:nData
allData.(currentType) = [allData.(currentType); data{j}.(currentType)(:,3)]; %3rd column
end
end
Look at cat, the first argument is the dimension. In your simple example it would be:
result = cat(1,a,b);
Which is equivalent to:
result = [a;b];
Or you can concatenate them as row vectors and transpose back to a column vector:
result = [a',b']';
For the case of a structure inside a cell array I don't think there will be any way around looping. Let's say you have a cell array with M elements and N "types" as the structure fields for each element. You could do:
M=length(data);
newData=struct;
for i=1:M
for j=1:N
field=sprintf('type%d',j); % //field name
if (M==1), newData.(field)=[]; end % //if this is a new field, create it
newData.(field)=[newData.(field);data{i}.(field)];
end
end

files comparison

I'm trying to compare 2 txt files to check files are equals otherwise, get the output and give difference (say that there are a diff line x)
I'm trying as follows:
fid1 = fopen(file_1, 'r');
fid2 = fopen(file_2, 'r');
lines1 = textscan(fid1,'%s','delimiter','\n');
lines2 = textscan(fid2,'%s','delimiter','\n');
lines1 = lines1{1};
lines2 = lines2{1};
fclose(fid1);
fclose(fid2);
tf = isequal(lines1,lines2); % this gives 0 or 1
I would like when the value is 0 (files are different) to localize the diff and give line where files are different or print content of difference.
You essentially want to compare each element of the two cell arrays you have, not the whole cell arrays. You could do that with a loop in most languages, but of course MATLAB has many ways to avoid loops. Here, it is cellfun:
cellfun(#isequal,lines1,lines2)
(I left out the part where, if the two cell arrays are of unequal size, you have to shorten the longer one.) Then, find is useful for finding the first (or all) occurrence(s) of a certain value in any vector.

Resources