Can characters be used as indices? - arrays

Let's define, for example,
x = 10:10:2000;
As is well known, integer values can be used as indices:
>> x(9)
ans =
90
In Matlab, characters can often be used where a number would be expected, with Matlab doing the conversion automatically. For example, since the ASCII code of 'a' is 97,
>> 'a'+1
ans =
98
Can characters also be used as indices? Does Matlab convert them into integers?

They can be used... but be careful if the index is a single colon!
Let's define
>> x = 10:10:2000;
Indexing with 'a' produces the 97-th element of x, as expected:
>> x('a')
ans =
970
However, indexing with ':' is a special case. The string ':' acts as a : index, thus producing a column vector of all values of x. That is, x(':') is the same as x(:):
>> x(':')
ans =
10
20
30
...
1990
2000
This means that the index ':' is being evaluated (x(':') acts like x(:)), whereas other character arrays used as indices are not evaluated (x('a') doesn't act like x(a)):
>> a = 1;
>> x('a')
ans =
970
This also implies that with ':', converting to a numeric type before indexing does matter, unlike with other characters used as indices:
>> x(double('abc'))
ans =
970 980 990
>> x('abc')
ans =
970 980 990
>> x(double(':'))
ans =
580
>> x(':')
ans =
10
20
30
...
1990
2000
The "evaluated" behaviour of ':' used as index was already known. What's surprising is the contrast with other characters or character arrays used as indices (which are not evaluated).
The examples have used a single dimension for simplicity, but the described behaviour also applies to multidimensional indexing. The same behaviour is observed in Octave too.
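As a quick check of the behaviour described above, the following sketch compares character and numeric indices directly. The matrix M is a made-up example (not from the question), used only to illustrate the multidimensional case.

```matlab
x = 10:10:2000;

% A character index is converted to its character code: 'a' -> 97
assert(isequal(x('a'), x(97)));
assert(isequal(x('abc'), x([97 98 99])));

% The character ':' is the exception: it behaves like a bare colon
assert(isequal(x(':'), x(:)));      % column vector with all 200 elements
assert(x(double(':')) == 580);      % double(':') is 58, so this is x(58)

% Assumed example matrix to illustrate multidimensional indexing
M = magic(4);
assert(isequal(M(':', 2), M(:, 2)));   % ':' selects every row of column 2
```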

Related

Loading arrays of different sizes into a single array

I have 100 arrays with the dimension of nx1. n varies from one array to the next (e.g, n1 = 50, n2 = 52, n3 = 48 etc.). I would like to combine all these arrays into a single one with the dimension of 100 x m with m being the max of n's.
The issue I am running into is that as n varies, Matlab throws an error saying that the dimensions mismatch. Is there a way to get around this so I can pad the "missing" cells with N/A? For instance, if the first array contains 50 elements (i.e., n1 = 50) like this:
23
31
6
...
22
the second array contains 52 elements (i.e., n2 = 52) like this:
25
85
41
...
8
12
66
The result should be:
23 25
31 85
6 41
... ...
22 8
N/A 12
N/A 66
Thanks to the community in advance!
Here is another approach without eval.
num_arrays = numel(arrays);
array_lengths = cellfun(@numel, arrays);
max_length = max(array_lengths);
result = nan(max_length, num_arrays);
for r=1:num_arrays
result(1:array_lengths(r),r) = arrays{r}(1:array_lengths(r));
end
Some explanation: I'm assuming your arrays are stored in a cell to begin with. Here is some code to generate fictitious data with the dimensions you gave:
% some dummy data for your arrays.
num_arrays = 100;
primerArrayCell = num2cell(ones(1,num_arrays));
arrays = cellfun(@(c) rand(randi(50, 1),1), primerArrayCell, 'uniformoutput',false);
You can use cellfun with a function handle to get the lengths of each individual array:
% Assume your arrays are in a cell of arrays with the variable name arrays
array_lengths = cellfun(@numel, arrays);
max_length = max(array_lengths);
Allocate an array of nan values to store your result
% initialize your data to nan's.
result = nan(max_length, num_arrays);
Then fill in the non-nan values based on the length of the arrays calculated previously.
for r=1:num_arrays
result(1:array_lengths(r),r) = arrays{r}(1:array_lengths(r));
end
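Putting the steps together on two short arrays gives a small, self-contained sketch. The cell array name arrays follows the answer above; the sample values are shortened, illustrative data loosely based on the question, not the real arrays.

```matlab
% Illustrative data: two column vectors of different lengths
arrays = {[23; 31; 6], [25; 85; 41; 8; 12]};

num_arrays    = numel(arrays);
array_lengths = cellfun(@numel, arrays);
max_length    = max(array_lengths);

% Start from NaN so that unfilled positions stay marked as missing
result = nan(max_length, num_arrays);
for r = 1:num_arrays
    result(1:array_lengths(r), r) = arrays{r};
end

disp(result)
```

Transpose the result (result.') if you need one row per array, i.e. the 100 x m layout mentioned in the question.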
You may want to consider using structure arrays for storing such datasets as it makes everything easier when merging them into a single array.
But to answer your question, if you have arrays like this:
a1 = 1:20; % array of size 1 x 20
n1 = numel(a1); % 20
a2 = 50:60; % array of size 1 x 11
n2 = numel(a2); % 11
% ... say you have nArrs arrays
Given nArrs arrays for example, you can create the desired matrix res like this:
m = max([n1, n2]); % list every length n1, n2, ... here
res = nan(m,nArrs); % initialize the result matrix w/ nan
% Manually
res(1:n1,1) = a1.';
res(1:n2,2) = a2.';
% ... so on
% Or use eval instead like this
for i = 1:nArrs
eval(['res(1:n' int2str(i) ', i) = a' int2str(i) '.'';'])
end
Now bear in mind that using eval is NOT recommended but hopefully that just gives you an idea as to what to do. If you did use structures, you can replace eval with something more efficient and robust like arrayfun for instance.
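As a rough sketch of the eval-free alternative mentioned above, the arrays can be kept in a struct array and filled in a loop. The struct name s and its field data are hypothetical names chosen for this example.

```matlab
% Hypothetical struct array holding one vector per element
s(1).data = 1:20;     % 1 x 20
s(2).data = 50:60;    % 1 x 11

nArrs = numel(s);
lens  = arrayfun(@(e) numel(e.data), s);   % length of each stored vector
m     = max(lens);

res = nan(m, nArrs);                        % NaN marks the padded entries
for i = 1:nArrs
    res(1:lens(i), i) = s(i).data(:);       % (:) forces a column vector
end
```

Because the vectors are addressed by index rather than by variable name, no code has to be built from strings.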

Does MATLAB support truly 1d arrays?

It would really help me reason about my MATLAB code if I didn't have to worry about accidental 2d operations. For instance, if I want to do element-wise multiplication of 1d arrays, but one is a row and another is a column, I end up with a 2d result.
>> a = 1:8;
>> a = a(:);
>> a .* cumsum(ones(8))
ans =
1 1 1 1 1 1 1 1
4 4 4 4 4 4 4 4
9 9 9 9 9 9 9 9
16 16 16 16 16 16 16 16
25 25 25 25 25 25 25 25
36 36 36 36 36 36 36 36
49 49 49 49 49 49 49 49
64 64 64 64 64 64 64 64
I'd like to prevent this type of thing, and likely other problems that I can't foresee, by keeping all my arrays 1d wherever I can. But every time I check the size() of a vector, I get at least 2 elements back:
>> size(1:1:6)
ans =
1 6
>> size(linspace(0, 5, 10))
ans =
1 10
I've tried the suggestions at How to create single dimensional array in matlab? and some of the options here (PDF download), and I can't get a "truly" 1d array. How would you deal with this type of issue?
There is no such thing as a 1D array in MATLAB. The documentation says (emphasis mine):
All MATLAB variables are multidimensional arrays, no matter what type of data. A matrix is a two-dimensional array often used for linear algebra.
You may use isvector, isrow and iscolumn to identify vectors, row vectors and column vectors respectively.
@Sardar has already said the last word. Another clue is ndims:
N = ndims(A) returns the number of dimensions in the array A. The
number of dimensions is always greater than or equal to 2. ...
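A short check along these lines shows that rows, columns and even a scalar all report two dimensions. The variable names are only for illustration.

```matlab
v = 1:6;           % row vector
c = v(:);          % column vector
s = 5;             % scalar

assert(ndims(v) == 2 && isequal(size(v), [1 6]));
assert(ndims(c) == 2 && isequal(size(c), [6 1]));
assert(ndims(s) == 2 && isequal(size(s), [1 1]));

% isvector is true for both orientations; isrow and iscolumn tell them apart
assert(isvector(v) && isrow(v) && ~iscolumn(v));
assert(isvector(c) && iscolumn(c) && ~isrow(c));
```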
But about your other question:
How would you deal with this type of issue?
There's not much you can do. Debug, find the mistake and fix it. If it's some one-time script, you are done. But if you are writing functions that may be used later, it's better to protect them from accepting arguments with unequal dimensions:
function myFunc(A, B)
if ndims(A)~=ndims(B) || any(size(A)~=size(B))
error('Matrix dimensions must agree.');
end
% ...
end
Or, if your function really needs them to be vectors:
function myFunc(A, B)
if ~isvector(A) || ~isvector(B) || any(size(A)~=size(B))
error('A and B must be vectors with same dimensions.');
end
% ...
end
You can also validate different attributes of arguments using validateattributes:
function myFunc(A, B)
validateattributes(A, {'numeric'},{'vector'}, 'myFunc', 'A')
validateattributes(B, {'numeric'},{'size', size(A)}, 'myFunc', 'B')
% ...
end
Edit:
Also, if the function only needs the inputs to be vectors and their orientation does not matter, you can modify them inside the function (thanks to @CrisLuengo for commenting).
function myFunc(A, B)
if ~isvector(A) || ~isvector(B) || length(A)~=length(B)
error('A and B must be vectors with the same length.');
end
A = A(:);
B = B(:);
% ...
end
However, this is not recommended when the output of the function is also a vector with the same size as the inputs. This is because the caller expects the output to be in the same orientation as the inputs, and if this is not the case, problems may arise.
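One way to address the orientation concern, sketched here under the assumption that the output should follow the orientation of the first input, is to record the orientation before reshaping and restore it at the end. The function name elementwiseProduct is hypothetical.

```matlab
function C = elementwiseProduct(A, B)
% Hypothetical example: multiply two vectors element by element,
% regardless of whether each one is a row or a column.
if ~isvector(A) || ~isvector(B) || length(A) ~= length(B)
    error('A and B must be vectors with the same length.');
end
wasRow = isrow(A);      % remember the orientation of the first input
C = A(:) .* B(:);       % both operands are columns here, so no expansion
if wasRow
    C = C.';            % return a row if A was a row
end
end
```

For example, elementwiseProduct(1:3, [4; 5; 6]) returns the row vector [4 10 18], while swapping the orientations of the two inputs returns the same values as a column.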

Counting rows without any NaNs in a struct

I need to count how many structs do not have any NaNs across all fields in an array of structures. The sample struct looks like this:
a(1).b = 11;
a(2).b = NaN;
a(3).b = 22;
a(4).b = 33;
a(1).c = 44;
a(2).c = 55;
a(3).c = 66;
a(4).c = NaN;
The output looks like this
Fields b c
1 11 44
2 NaN 55
3 22 66
4 33 NaN
The structs without NaNs are 1 and 3, so there should be 2 in total here.
I tried using size(a, 2), but it just tells me the total number of structs in the array. I need it to calculate N (the number of observations in the sample). NaNs don't count as observations as they are omitted in the analysis.
What is the simplest way to count structs without any NaNs in a struct array?
I would suggest using the following one-line command:
nnz(~any(cellfun(@isnan,struct2cell(a))))
struct2cell(a) converts your struct array into a 3D cell array
cellfun(@isnan,___) applies isnan to each element of the cell array
~any(__) works along the first dimension (the fields) and is true for each struct that has no NaNs
nnz(__) counts how many structs have no NaNs
The result is just a number, 2 in this case.
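The one-liner can be unpacked into intermediate steps, which makes the array shapes easier to follow. This sketch uses the struct array from the question; the intermediate variable names are made up for illustration.

```matlab
a(1).b = 11;  a(1).c = 44;
a(2).b = NaN; a(2).c = 55;
a(3).b = 22;  a(3).c = 66;
a(4).b = 33;  a(4).c = NaN;

cells    = struct2cell(a);           % 2 x 1 x 4 cell: fields x 1 x structs
nanMask  = cellfun(@isnan, cells);   % logical array of the same size
complete = ~any(nanMask, 1);         % 1 x 1 x 4: true where no field is NaN
N        = nnz(complete);            % number of structs without NaNs
idx      = find(complete(:)).';      % which structs they are
```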
The following:
find(~any(cellfun(@isnan,struct2cell(a))))
would tell you which structs are without NaNs.
This will tell you which ones have no NaNs
for ii=1:size(a,2)
hasNoNaNs(ii)=~any(structfun(@isnan,a(ii)));
end
This works by iterating through each of the structures and using structfun to call isnan on each of its fields. It then checks whether any of them is a NaN and negates the result, giving 1 for the structures that have no NaNs.
Because bsxfun is never the wrong approach!
sum(all(bsxfun(@le,cell2mat(struct2cell(a)),inf)))
How this works:
This converts the struct to a cell, and then to a matrix:
cell2mat(struct2cell(a))
ans(:,:,1) =
11
44
ans(:,:,2) =
NaN
55
ans(:,:,3) =
22
66
ans(:,:,4) =
33
NaN
Then it uses bsxfun to check which of those elements are less than or equal to infinity. The only value that doesn't satisfy this condition is NaN.
bsxfun(@le,cell2mat(struct2cell(a)),inf)
ans(:,:,1) =
1
1
ans(:,:,2) =
0
1
ans(:,:,3) =
1
1
ans(:,:,4) =
1
0
Then, we check if all the values in each of those slices are true:
all(bsxfun(@le,cell2mat(struct2cell(a)),inf))
ans(:,:,1) =
1
ans(:,:,2) =
0
ans(:,:,3) =
1
ans(:,:,4) =
0
And finally, we sum it up:
sum(all(bsxfun(@le,cell2mat(struct2cell(a)),inf)))
ans =
2
(By the way: It's possible to just skip the bsxfun, but where's the fun in that)
sum(all(cell2mat(struct2cell(a))<=inf))
Use arrayfun to iterate over a and structfun to iterate over fields and you get a logical array marking the elements that do not have NaNs:
>> arrayfun(@(x) ~any(structfun(@isnan, x)), a)
ans =
1 0 1 0
Now you can just sum it
>> sum(arrayfun(@(x) ~any(structfun(@isnan, x)), a))
ans =
2
Taking the idea of using comma-separated lists from another (non-working) answer:
s=sum(~any(isnan([[a.b];[a.c]])));
It may look very dumb to hard-code the field names, but it leads to fast code because it avoids both iterating and cell arrays.
Generalizing this approach to arbitrary field names, you end up with this solution:
n=true(size(a));
for f = fieldnames(a).'
n(isnan([a.(f{1})]))=false;
end
n=sum(n(:));
Assuming that you have a large struct with only few fieldnames this is very efficient, because it is only iterating the fieldnames.
A third solution, which may not be elegant depending on your data:
A = [[a.b];[a.c]]; %//EDIT -- Fixed based on @Daniel's correct solution
IndNotNaN = find (~isnan(A));
If you have lots of fields, you will have to concatenate a.b, a.c ... a.n

count elements falling within certain thresholds in array in matlab?

I have a huge vector and I have to count the values falling within certain ranges, such as 0-10, 10-20, etc.
I did something like this:
for i=1:numel(m1)
if (0<m1(i)<=10)==1
k=k+1;
end
end
Also:
if not(isnan(m1))==1
x=(0<m1<=10);
end
But both times it gives an array that contains all 1s. What am I doing wrong?
You can do something like this (also works for non integers)
k = sum(m1>0 & m1<=10)
You can use logical indexing. Observe:
>> x = randi(40, 1, 10) - 20
x =
-2 17 -12 -9 -14 -14 15 4 2 -14
>> x2 = x(0 < x & x < 10)
x2 =
4 2
>> length(x2)
ans =
2
and the same done in one step:
>> length(x(0 < x & x < 10))
ans =
2
To count the values in a specific range you can use ismember. Note that this only matches integer values, and that 0:10 also counts zeros.
If m1 is a vector, use
k = sum(ismember(m1,0:10));
If m1 is a matrix, use k = sum(sum(ismember(m1,0:10)));
for example,
m1=randi(20,[5 5])
9 10 6 10 16
8 9 14 20 6
16 13 14 7 11
16 15 4 12 14
4 16 3 5 18
sum(sum(ismember(m1,1:10)))
12
Why not simply do something like this?
% Random data
m1 = 100*rand(1000,1);
%Count elements between 10 and 20
m2 = m1(m1>10 & m1<=20);
length(m2) %number of elements of m1 between 10 and 20
You can then put things in a loop
% Random data
m1 = 100*rand(1000,1);
nb_elements = zeros(10,1);
for k=1:length(nb_elements)
temp = m1(m1>(10*k-10) & m1<=(10*k));
nb_elements(k) = length(temp);
end
Then nb_elements contains your data with nb_elements(1) for the 0-10 range, nb_elements(2) for the 10-20 range, etc...
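If the ranges all have the same width, the loop can also be replaced by a single call. The sketch below is an alternative to the answer above, not part of it: it assumes every value lies in (0, 100] and uses ceil to map a value in (10(k-1), 10k] to bin k, then counts with accumarray. The sample data are made up.

```matlab
% Made-up sample data, all within (0, 100]
m1 = [0.1; 5; 10; 10.5; 20; 99.9];

binIdx = ceil(m1 / 10);                       % value in (10(k-1), 10k] -> k
nb_elements = accumarray(binIdx, 1, [10 1]);  % count how many fall in each bin

% Cross-check against the logical-indexing approach for the first range
assert(nb_elements(1) == sum(m1 > 0 & m1 <= 10));
```

Values outside (0, 100] would produce an invalid or out-of-range bin index, so they would need to be filtered out first.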
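Matlab does not evaluate the chained comparison the way you expect. It works from left to right: 0<m1(i) yields a logical 0 or 1, which is then compared with 10, so the result is always true:
(0<m1(i)<=10)
Instead you should use: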
for i=1:numel(m1)
if (0<m1(i)) && (m1(i)<=10)
k=k+1;
end
end
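To make the pitfall concrete, the following sketch evaluates the chained form and the corrected form on a value that is clearly outside the range. The variable names are illustrative.

```matlab
v = 50;                          % clearly not in the range (0, 10]

chained = (0 < v <= 10);         % parsed as (0 < v) <= 10, i.e. 1 <= 10
correct = (0 < v) && (v <= 10);  % two separate comparisons

assert(chained == true);         % always true, whatever the value of v
assert(correct == false);

% A vectorized count over a whole array uses the element-wise & operator
m1 = [-3 2 7 10 15 50];
k = sum(m1 > 0 & m1 <= 10);      % counts 2, 7 and 10
assert(k == 3);
```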
And to speed it up, probably something like this:
sum((0<m1) .* (m1<=10))
Or you can create logical arrays and then use element-wise multiplication. Don't know how fast this is though and it might use a lot of memory for large arrays.
Something like this
A(find(((A>0.2) .* (A<0.8)) ==1))
Generate values
A= rand(5)
A =
0.414906 0.350930 0.057642 0.650775 0.525488
0.573207 0.763477 0.120935 0.041357 0.900946
0.333857 0.241653 0.421551 0.737704 0.162307
0.517501 0.491623 0.016663 0.016396 0.254099
0.158867 0.098630 0.198298 0.223716 0.136054
Find the elements whose values are > 0.2 and < 0.8. Each comparison gives a logical array, and after element-wise multiplication the result is 1 only where both A>0.2 and A<0.8 hold. Note the parentheses around A>0.2: .* binds more tightly than >, so without them the expression would be evaluated as A > (0.2 .* (A<0.8)).
find(((A>0.2) .* (A<0.8)) ==1)
Then apply those indices to A
A(find(((A>0.2) .* (A<0.8)) ==1))
ans =
0.41491
0.57321
0.33386
0.51750
0.35093
0.76348
0.24165
0.49162
0.42155
0.65077
0.73770
0.22372
0.52549
0.25410

Logical vs Numerical array in MATLAB

I am comparing two binary arrays. I have an array where values can either be one or zero, one if the values are the same and zero if they are not. Please note I am doing other stuff beyond checking, so we don't need to get into vectorization or the nature of the code.
What is more efficient, using a numerical array or a logical array in MATLAB?
Logical values take up fewer bytes than most numeric values, which is a plus if you're dealing with very large arrays. You can also use logical arrays to do logical indexing. For example:
>> valArray = 1:5; %# Array of values
>> numIndex = [0 1 1 0 1]; %# Numeric array of ones and zeroes
>> binIndex = logical([0 1 1 0 1]); %# Logical array of ones and zeroes
>> whos
Name Size Bytes Class Attributes
binIndex 1x5 5 logical %# 1/8 the number of bytes
numIndex 1x5 40 double %# as a double array
valArray 1x5 40 double
>> b = valArray(binIndex) %# Logical indexing
b =
2 3 5
>> b = valArray(find(numIndex)) %# You have to use the FIND function to
%# find the indices of the non-zero
b = %# values in numIndex
2 3 5
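If the mask starts out as a numeric array of ones and zeros, it has to be converted before it can be used for logical indexing, since a numeric 0 is not a valid subscript. A minimal sketch using the variables from the example above:

```matlab
valArray = 1:5;
numIndex = [0 1 1 0 1];            % numeric (double) ones and zeros

binIndex = logical(numIndex);      % convert to a logical mask
b1 = valArray(binIndex);           % logical indexing
b2 = valArray(find(numIndex));     % equivalent, via explicit indices

assert(isequal(b1, [2 3 5]));
assert(isequal(b1, b2));
assert(islogical(binIndex) && ~islogical(numIndex));
```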
One note: If you will be dealing with arrays of zeroes and ones that are very sparse (i.e. very few ones), it may be best to use an array of numeric indices such as you would get from the FIND function. Take the following example:
>> binIndex = false(1,10000); %# A 1-by-10000 logical array
>> binIndex([2 100 1003]) = true; %# Set 3 values to true
>> numIndex = find(binIndex) %# Find the indices of the non-zero values
numIndex =
2 100 1003
>> whos
Name Size Bytes Class Attributes
binIndex 1x10000 10000 logical %# 10000 bytes versus
numIndex 1x3 24 double %# many fewer bytes
%# for a shorter array
Logical of course! Matlab has the option of squeezing 8 items into 1 byte. (Whether it does or not is another matter).
a=ones(1000); b=(a==1);
tic;for(k=1:100)for(i=1:1000);for(j=1:1000);a(i,j)=a(i,j);end;end;end;toc
tic;for(k=1:100)for(i=1:1000);for(j=1:1000);b(i,j)=b(i,j);end;end;end;toc
result
4.561173 seconds
3.454697 seconds
but the benefit will be much greater if you're doing more logical operations rather than just looping!
