Automatically assign number to each string in array in Matlab

I have a large cell array in Matlab (imported from Excel) containing numbers and strings.
Let's say the string part looks like this, just bigger with many columns and lines:
Table{1,1} = 'string A'
Table{2,1} = 'string B'
Table{3,1} = 'string B'
And the number part looks like this just bigger:
Table{1,2} = 5;
Table{2,2} = 10;
Table{3,2} = 15;
I am aware that there are disadvantages of working with arrays (right?), so I consider converting EVERYTHING to a numeric matrix by replacing the strings with numbers. (Possibly as a data set with headings - if you don't advise against that?)
My problem is that I have A LOT of different string entries, and I want to automatically assign a number to each entry, e.g. 1 for 'string A', 2 for 'string B' etc., such that:
Matrix(1,1) = 1
Matrix(2,1) = 2
Matrix(3,1) = 2
etc.
and for the numbers simply:
Matrix(1,2) = Table{1,2};
Matrix(2,2) = Table{2,2};
Matrix(3,2) = Table{3,2};
For the strings, I cannot assign the numbers by individual code for each string, because there are so many different string entries. Is there a way to "automate" it?
I am aware of this help site, https://ch.mathworks.com/help/matlab/matlab_prog/converting-from-string-to-numeric.html, but haven't found anything else helpful.
How would you do it?

Find which entries of your cell array are numeric and which are character, using cellfun with isnumeric (or ischar). Then use the third output argument of unique (or findgroups, which requires R2015b or later) to assign a number to each distinct character entry. Finally, put the numbers into your required matrix as shown below:
tmp = cellfun(@isnumeric,Table); %Indices of Numbers
Matrix = zeros(size(Table)); %Initialising the matrix
[~, ~, ic] = unique(Table(~tmp)); %Assigning numbers to characters
Matrix(~tmp) = ic; %Putting numbers for characters
%Above two lines can be replaced with Matrix(~tmp) = findgroups(Table(~tmp)); in R2015b
Matrix(tmp) = [Table{tmp}]; %Putting numbers as they are
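As a quick check, here is the same approach run on the three-row sample from the question. This is only a sketch: the names isNum and labels are illustrative, and keeping the first output of unique is an addition so that the numeric codes can be mapped back to the original strings.

```matlab
% Sample data from the question
Table = {'string A', 5; 'string B', 10; 'string B', 15};

isNum  = cellfun(@isnumeric, Table);       % logical mask of the numeric cells
Matrix = zeros(size(Table));               % preallocate the numeric result

% labels{k} is the string that receives code k; ic holds one code per string
[labels, ~, ic] = unique(Table(~isNum));
Matrix(~isNum) = ic;                       % codes replace the strings
Matrix(isNum)  = [Table{isNum}];           % numbers are copied unchanged

disp(Matrix)                               % Matrix is [1 5; 2 10; 2 15]

% Decoding a code back to its string
original = labels{Matrix(2,1)};            % 'string B'
```

Note that unique sorts the strings, so the codes follow alphabetical order rather than order of first appearance; passing 'stable' as an extra argument to unique keeps the order of appearance instead.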

Related

How can I change the data type of a field in a structure array?

I have a 2417-by-50 structure array in MATLAB and am trying to find a vectorized way to convert some of the field types:
I have a column of characters that I want to convert into a string type:
[DataS.Sector] = string([DataS.Sector]);
but it's not working. I don't want to use a loop since it takes so much time.
Same issue, but converting to numeric values. Right now I'm using a loop that takes a really long time:
for i = 1:length(DataS)
for j = 1:numel(Vectorpour)
DataS(i).(DataSfieldname{k}) = str2double(DataS(i).(DataSfieldname{k}))
end
end
How can I vectorize each of these approaches?

You can perform both of these conversions across all elements of your structure array by capturing the field values in a cell array, doing the conversion (using string or str2double), converting the result to a cell array using num2cell, then overwriting the original fields using a comma-separated list:
% For part A:
temp = num2cell(string({DataS.Sector}));
[DataS.Sector] = temp{:};
% For part B:
temp = num2cell(str2double({DataS.(DataSfieldname{k})}));
[DataS.(DataSfieldname{k})] = temp{:};
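A small self-contained sketch of the str2double variant may help. The struct array, the field names Sector and Price, and the list fieldsToConvert are all made up for illustration; only the capture, convert, num2cell, reassign pattern comes from the answer above.

```matlab
% Hypothetical 1-by-3 struct array standing in for DataS
DataS = struct('Sector', {'Energy', 'Tech', 'Energy'}, ...
               'Price',  {'1.5', '20', 'n/a'});

% Hypothetical list of fields whose text should become numbers
fieldsToConvert = {'Price'};

for k = 1:numel(fieldsToConvert)
    f = fieldsToConvert{k};
    % Gather the field from every element, convert all values in one call
    temp = num2cell(str2double({DataS.(f)}));
    % Write the converted values back, one per struct element
    [DataS.(f)] = temp{:};
end

prices = [DataS.Price];   % 1.5, 20 and NaN (text that is not a number becomes NaN)
```

The remaining loop runs once per field rather than once per struct element, so it stays short even for a large struct array.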

Array intersection issue (Matlab)

I am trying to carry out the intersection of two arrays in Matlab but I cannot find the way.
The arrays that I want to intersect are:
and
I have tried:
[dur, itimes, inewtimes] = intersect(array2, char(array1));
but no luck.
However, if I try to intersect array1 with array3 (see array3 below):
[dur, itimes, inewtimes] = intersect(array3, char(array1));
the intersection is performed without any error.
Why can't I intersect array1 with array2, and how could I do it? Thank you.

Your arrays are in different formats, and you need to make them the same. There are several options: as @Visser suggested, you could convert the date/time values into integers, which allows faster computation; you can keep them as strings; or you can convert them into character arrays (like what you have done with char(Array2)).
This is my example:
A = {'00:00:00';'00:01:01'} %// Cell array of strings (cellstr)
Z = ['00:00:00';'00:01:01'] %// Char array, one string per row (not a cell)
Q = {{'00:00:00'};{'00:01:01'}} %// Cell array of cells
A = cellstr(A) %// Already a cellstr, so this essentially does nothing
Z = cellstr(Z) %// Convert the char array to a cellstr
Q = vertcat(Q{:,:}) %// Convert the cell of cells to a cellstr
I = intersect (A,Z)
>>'00:00:00'
'00:01:01'
II = intersect (A,Q)
>>'00:00:00'
'00:01:01'
This keeps your dates in the format of Strings in case you want to export them back into a txt/csv file.
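The question also asks for the index outputs of intersect. A minimal sketch, assuming array1 is a char matrix and array2 a cell array of strings (the actual arrays are not shown in the question, so the sample values here are invented):

```matlab
% Hypothetical sample data in two different formats
array1 = ['00:00:00'; '00:00:01'; '00:01:01'];   % char matrix, one time per row
array2 = {'00:01:01'; '12:00:00'; '00:00:00'};   % cell array of strings

% Bring both to the same type (cellstr) before intersecting
[common, i1, i2] = intersect(cellstr(array1), array2);

% common holds the shared times in sorted order;
% i1 and i2 are the positions of those times in array1 and array2
```

With these inputs common is {'00:00:00'; '00:01:01'}, found at rows 1 and 3 of array1 and at positions 3 and 1 of array2.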
Your first array would look something like this:
array1 = linspace(0,1,86400); % 86400 evenly spaced values spanning one day (0 to 1, in datenum units)
Your second array should be converted using datenum, then use cell2mat to make it a matrix. Lastly, use ismember to find the intersection:
InterSect = ismember(array2,array1);
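Comparing fractional day values for exact equality may be fragile because of floating-point rounding. One alternative, closer to the integer conversion suggested earlier in this thread, is to compare whole seconds since midnight. This is a swapped-in technique rather than the datenum approach described above, and the sample data and variable names are assumptions:

```matlab
% Hypothetical sample data; the original arrays are not shown in the question
times = {'00:00:00'; '00:01:01'; '23:59:59'};

% Parse each 'HH:MM:SS' string into a row [h m s]
parts = cellfun(@(t) sscanf(t, '%d:%d:%d')', times, 'UniformOutput', false);
hms = vertcat(parts{:});

% Seconds since midnight, held as exact integer values
secs = hms * [3600; 60; 1];

% Every second of the day: 0, 1, ..., 86399
allSecs = (0:86399)';

% found(k) is true if times{k} occurs in allSecs; pos(k) is its position there
[found, pos] = ismember(secs, allSecs);
```

Because both sides are whole numbers, the equality test inside ismember is exact.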

Matlab: Delete the item in an N-dimensional array whose Nth dimension is 1, where N is unknown?

I have an N-dimensional array of items whose last dimension is the index of the array.
For example, if the array A contained images, then A(:,:,:,1) would be the first image, A(:,:,:,2) would be the second image, and so forth.
Similarly, if the array just contained integers, then A(:,1) would be the first integer, A(:,2) would be the second integer, and so forth.
-=-=-=-
What I'm trying to do is delete the first item from A when I do not know ahead of time what dimensionality it is.
If A contains images, I want to do this:
A(:,:,:,1) = [];
If A contains integers, I want to do this:
A(:,1) = [];
The problem is since I don't know what dimensionality it is, I don't know how many colons to put, and I don't know how to denote "N-1 colons here" in Matlab.
I'm hoping there is a programmatic way to do this, but I frankly have no idea what to search for if this is possible.

You can either use cell to comma-separated list expansion:
%// Build cell: {':', ':', ..., ':', [1]}
I = repmat({':'}, 1, ndims(A)-1); %// fresh cell, so an existing variable I cannot interfere
I{ndims(A)} = 1;
%// Expand cell to comma separated list and delete:
A(I{:}) = [];
Or convert to cell using num2cell and then convert back using cell2mat:
C = num2cell(A,1:ndims(A)-1);
A = cell2mat(C(2:end));
I guess that unless you really need n-dimensional arrays, doing this with a cell array of n-1 dimensional arrays instead (as is C in the above code) should be a smart move in terms of simplicity of notation.
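The comma-separated list approach can be tried on both cases from the question. This is a sketch with made-up array sizes; idx is just an illustrative name for the index cell.

```matlab
% Case 1: six hypothetical 4x5x3 "images" stacked along the 4th dimension
A = rand(4, 5, 3, 6);
idx = repmat({':'}, 1, ndims(A));   % {':', ':', ':', ':'}
idx{end} = 1;                       % {':', ':', ':', 1}
A(idx{:}) = [];                     % same as A(:,:,:,1) = []
sizeA = size(A);                    % 4 5 3 5

% Case 2: a row of integers, one item per column
B = [10 20 30];
idx = repmat({':'}, 1, ndims(B));   % {':', ':'}
idx{end} = 1;                       % {':', 1}
B(idx{:}) = [];                     % same as B(:,1) = []
```

One caveat: ndims ignores trailing singleton dimensions, so if only one item is left (for example a 4x5x3x1 array), ndims returns 3 and the deletion would act on the wrong dimension. If that can happen, the item dimension has to be tracked separately instead of being derived from ndims.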

Concatenating 1D matrices of different sizes

I perhaps am going about this wrong, but I have data{1}, data{2}...data{i}. Within each, I have .type1, .type2.... .typeN. The arrays are different lengths, so horizontal concatenation does not work.
For simplicity sake
>> data{1}.type1
ans =
1
2
3
>> data{2}.type1
ans =
2
4
5
6
Results should be [1;2;3;2;4;5;6]
I've been trying to loop it but not sure how? I will have a variable number of files (a,b..). How do I go about looping and concatenating? Ultimately I need a 1xN array of all of this..
My working code, thanks..figured it out..
for i = 1:Types
currentType = nTypes{i}
allData.(currentType)=[];
for j = 1:nData
allData.(currentType) = [allData.(currentType); data{j}.(currentType)(:,3)]; %3rd column
end
end

Look at cat, the first argument is the dimension. In your simple example it would be:
result = cat(1,a,b);
Which is equivalent to:
result = [a;b];
Or you can concatenate them as row vectors and transpose back to a column vector:
result = [a',b']';
For the case of a structure inside a cell array I don't think there will be any way around looping. Let's say you have a cell array with M elements and N "types" as the structure fields for each element. You could do:
M=length(data);
newData=struct;
for i=1:M
for j=1:N
field=sprintf('type%d',j); % //field name
if (i==1), newData.(field)=[]; end % //first element: create the field
newData.(field)=[newData.(field);data{i}.(field)];
end
end
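When only one field is needed, the explicit loop can be hidden behind cellfun, which still iterates internally but keeps the code short. A sketch using the sample values from the question (pieces and result are illustrative names):

```matlab
% Sample data from the question: column vectors of different lengths
data = {struct('type1', [1; 2; 3]), struct('type1', [2; 4; 5; 6])};

% Pull the field out of every cell; the results stay in a cell array
% because the vectors have different lengths
pieces = cellfun(@(s) s.type1, data, 'UniformOutput', false);

% Stack all pieces vertically into one column vector
result = vertcat(pieces{:});        % [1; 2; 3; 2; 4; 5; 6]
```

If a 1xN row is wanted in the end, transposing result gives it.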

Breaking up a string and then converting to a vector?

If I have three strings, such as 'Y20194', '219Y42', and '12345' how do I break these up into a vector like [Y 2 0 1 9 4], [1 2 3 4 5], and [2 1 9 Y 4 2]? I am using str2num, but I think I am missing a step (separating the individual numbers in the strings first) before I convert to numerical values. Also, the characters aren't reading correctly and using str2num gives me [].
I have a file with lines of strings such as the one above. I used fgetl to read each line of my file into strings but am kind of stuck beyond that.

You cannot have both characters and numbers in a numerical vector.
You can do the following:
s = 'Y20194';
c = cellstr(s')';
v = str2double(c);
Cell array c will have all the characters from s separated into individual cells. Notice that you have to transpose the string s first.
In vector v the first value will be NaN since it's a character.
The characters will be kept, and the numbers will be converted to double type.
If the input does not come from a file, the code is as follows. result1 is the cell containing the array you want:
If the input is a file, take this file as an example: demo1.txt, with the following content:
The code to convert each line into what you want and then display it is as follows:
If you want to replace 'Y' or other letters with zero, the code is as follows:
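STRSPLIT only splits at a delimiter (whitespace by default), so on its own it will not separate the individual characters; NUM2CELL will.
ts = num2cell('Y20194');
% ts <- {'Y', '2', '0', '1', '9', '4'}
And now you can try to convert each element individually to a number using str2num.
N = numel(ts);
str = cell(1, N);
for i = 1:N
str{i} = str2num(ts{i}); % typically [] for a non-numeric character such as 'Y'
end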
But since some of the characters in the string aren't numbers (e.g., 'Y'), I wouldn't expect this to work perfectly.
(It's been a while, so some of my indexes may be switched.)
If you want to change the Y into 0 a very simple solution is available:
str = 'Y20194';
str(str=='Y')='0';
str - '0'
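If the letters should be marked rather than turned into zeros, the same character arithmetic can be combined with a digit mask. A minimal sketch, assuming NaN is an acceptable placeholder for non-digit characters (isDigit and v are illustrative names):

```matlab
str = '219Y42';

% True where the character is one of '0'..'9'
isDigit = (str >= '0') & (str <= '9');

% Start with NaN everywhere, then fill in the digit positions
v = nan(size(str));
v(isDigit) = str(isDigit) - '0';    % character code minus code of '0' gives the digit

% v is [2 1 9 NaN 4 2]
```

Subtracting '0' works because the digit characters have consecutive character codes, so no call to str2num or str2double is needed.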
