MATLAB excluding data outside 1 standard deviation

I'm inexperienced with MATLAB, so sorry for the newbie question:
I've got a large vector (905350 elements) storing a whole bunch of data in it.
I have the standard deviation and mean, and now I want to cut out all the data points that lie more than one standard deviation above or below the mean.
I just have no clue how. From what I gather I have to make a double loop of some sort?
In other words: mean - std < data I want < mean + std

If the data is in variable A, with the mean stored in meanA and the standard deviation stored in stdA, then the following will extract the data you want while maintaining the original order of the data values:
B = A((A > meanA-stdA) & (A < meanA+stdA));
Here are some helpful documentation links that touch on the concepts used above: logical operators, matrix indexing.

You can simply use the Element-wise logical AND:
m = mean(A);
sd = std(A);
B = A( A>m-sd & A<m+sd );
Also, knowing that: |x|<c iff -c<x<c, you can combine both into one as:
B = A( abs(A-m)<sd );
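As a quick sanity check, here is a small sketch showing that the two-sided comparison and the abs form select the same elements. The sample vector is made up for illustration; note that std uses the sample (N-1) normalization by default.

```matlab
A  = [2 4 4 4 5 5 7 9];        % hypothetical sample data
m  = mean(A);                  % mean is 5
sd = std(A);                   % sample standard deviation, roughly 2.14

B1 = A(A > m-sd & A < m+sd);   % two-sided comparison
B2 = A(abs(A-m) < sd);         % equivalent single comparison

sameSelection = isequal(B1, B2);
```

Both B1 and B2 keep the elements in their original order, because a logical mask only drops entries and never rearranges them.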

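Taking A as your original vector, and B as the final one (note that B comes out sorted, so the original order of the data is lost):
B = sort(A);
B = B(find(B > meanA-stdA,1,'first'):find(B < meanA+stdA,1,'last'));
Here meanA = mean(A) and stdA = std(A). Avoid naming variables mean and std, since that shadows the built-in functions.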

m = mean(x); sd = std(x);
y = x(x > m-sd);
y = y(y < m+sd);
should work. This relies on logical indexing, so no explicit call to FIND is needed.
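For comparison, a minimal sketch of the same kind of selection written once with a logical mask and once with an explicit find. The data and the bounds lo and hi are hypothetical stand-ins for the mean minus and plus one standard deviation.

```matlab
x  = [3 8 1 9 4 6];        % hypothetical data
lo = 2;  hi = 7;           % hypothetical bounds

mask = x > lo & x < hi;    % logical vector, true where the element is kept
y1 = x(mask);              % logical indexing
y2 = x(find(mask));        % the same selection via linear indices
```

The two results are identical; the mask version simply skips the conversion to linear indices.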

Related

How do I split results into separate variables in matlab?

I'm pretty new to MATLAB, so I'm guessing there is some shortcut way to do this, but I can't seem to find it:
results = eqs\soltns;
A = results(1);
B = results(2);
C = results(3);
D = results(4);
E = results(5);
F = results(6);
soltns is a 6x1 vector and eqs is a 6x6 matrix, and I want the results of the operation in their own separate variables. It didn't let me save it like
[A, B, C, D, E, F] = eqs\soltns;
Which I feel like would make sense, but it doesn't work.

Up to now, I have never come across a MATLAB function doing this directly (but maybe I'm missing something?). So, my solution would be to write a function distribute on my own.
E.g. as follows:
result = [ 1 2 3 4 5 6 ];
[A,B,C,D,E,F] = distribute( result );
function varargout = distribute( vals )
assert( nargout <= numel( vals ), 'Too many output arguments' )
varargout = arrayfun( @(X) {X}, vals(:) );
end
Explanation:
nargout is a special variable in MATLAB function calls. Its value is equal to the number of output parameters that distribute is called with. So, the check nargout <= numel( vals ) tests whether vals holds enough elements to distribute to the output variables, and raises an error otherwise.
arrayfun( @(X) {X}, vals(:) ) converts vals to a cell array. The conversion is necessary because varargout is also a special variable in MATLAB's function calls, and it must be a cell array.
The special thing about varargout is that MATLAB assigns the individual cells of varargout to the individual output parameters, i.e. in the above call to [A,B,C,D,E,F] as desired.
Note:
In general, I think such expanding of variables is seldom useful. MATLAB is optimized for processing arrays; separating them into individual variables often only complicates things.
Note 2:
If result is a cell array, i.e. result = {1,2,3,4,5,6}, MATLAB actually lets you split its cells with [A,B,C,D,E,F] = result{:};

One way as long as you know the size of results in advance:
results = num2cell(eqs\soltns);
[A,B,C,D,E,F] = results{:};
This has to be done in two steps because MATLAB does not allow indexing directly into the result of a function call.
But note that this method is hard to generalize for arbitrary sizes. If the size of results is unknown in advance, it would probably be best to leave results as a vector in your downstream code.
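A small self-contained sketch of the num2cell approach, using a made-up 2x2 system in place of the 6x6 one from the question (the matrix and right-hand side are assumptions chosen so the solution is easy to read off):

```matlab
eqs    = [2 0; 0 4];               % hypothetical coefficient matrix
soltns = [2; 8];                   % hypothetical right-hand side

results = num2cell(eqs \ soltns);  % 2x1 cell, one solution component per cell
[A, B]  = results{:};              % the comma-separated list fans out into A and B
```

The number of names on the left-hand side must not exceed the number of cells, which is why the size has to be known when the line is written.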

Array of sets in Matlab

Is there a way to create an array of sets in MATLAB?
Eg: I have:
a = ones(10,1);
b = zeros(10,1);
I need c such that c = [(1,0); (1,0); ...], i.e. each set in c has first element from a and 2nd element from b with the corresponding index.
Also is there some way I can check if an unknown set (x,y) is in c.
Can you all please help me out? I am a Matlab noob. Thanks!

MATLAB has no sets in the sense you describe (I assume you are thinking of tuples in Python). But there are cells in MATLAB: a data type that can store pretty much anything (you may think of pointers if you are familiar with the concept). A cell is written using { }.
Knowing this, you could build a cell of arrays and check them using cellfun:
% create a cell of numeric arrays
C = {[1,0],[0,2],[99,-1]}
% check which input is equal to the array [1,0]
lg = cellfun(@(x)isequal(x,[1,0]),C)
Note that you access the address of a cell with () and the content of a cell with {}. [] always indicate arrays of something. We come to this in a moment.
OK, this was the part that you asked for; now there is a bonus:
That you use the term set makes me feel that they always have the same size. So why not create an array of arrays (or better an array of vectors, which is a matrix) and check this matrix column-wise?
% array of vectors (there is a way with less brackets but this way is clearer):
M = [[1;0],[0;2],[99;-1]]
% check at which column (direction ",1") all rows are equal to the proposed vector:
lg = all(M == [0;2],1)
This way is a bit clearer, better in terms of memory and faster.
Note that both variables lg are arrays of logicals. You can use them directly to index the original variable, i.e. M(:,lg) and C{lg} returns the set that you are looking for.
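To make the last step concrete, a short sketch of turning the logical vector into a yes/no answer, a position, and the matching column. The variable names found, col and match are introduced here for illustration; the comparison relies on implicit expansion, which assumes R2016b or later.

```matlab
M  = [[1;0],[0;2],[99;-1]];    % each column is one pair
lg = all(M == [0;2], 1);       % true for columns equal to [0;2]

found = any(lg);               % is the pair present at all?
col   = find(lg);              % column position(s) of the match
match = M(:, lg);              % the matching column(s) themselves
```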

If you want a logical value telling you whether p is in C, you can try the code below:
any(sum((C-p).^2,2)==0)
or
any(all(C-p==0,2))
Example
C = [1,2;
3,-1;
1,1;
-2,5];
p1 = [1,2];
p2 = [1,-2];
>> any(sum((C-p1).^2,2)==0) % indicating p1 is in C
ans = 1
>> any(sum((C-p2).^2,2)==0) % indicating p2 is not in C
ans = 0
>> any(all(C-p1==0,2))
ans = 1
>> any(all(C-p2==0,2))
ans = 0
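Applied to the a and b from the question, the same membership test can also be sketched with the built-in ismember and its 'rows' option. This is a different technique from the two expressions above, shown as an alternative; the names inC and notInC are made up for the example.

```matlab
a = ones(10,1);
b = zeros(10,1);
c = [a, b];                             % 10-by-2, row k is the pair (a(k), b(k))

inC    = ismember([1 0], c, 'rows');    % is the pair (1,0) a row of c?
notInC = ismember([0 1], c, 'rows');    % the pair (0,1) is not
```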

In MATLAB: How should nested fields of a struct be converted to a cell array?

In MATLAB, I would like to extract a nested field for each index of a 1 x n struct (a nonscalar struct) and receive the output as a 1 x n cell array. As a simple example, suppose I start with the following struct s:
s(1).f1.fa = 'foo';
s(2).f1.fa = 'yedd';
s(1).f1.fb = 'raf';
s(2).f1.fb = 'da';
s(1).f2 = 'bok';
s(2).f2 = 'kemb';
I can produce my desired 1 x 2 cell array c using a for-loop:
n = length(s);
c = cell(1,n);
for k = 1:n
c{k} = s(k).f1.fa;
end
If I wanted to do analogously for a non-nested field, for example f2, then I could "vectorize" the operation (see this question), writing simply:
c = {s.f2};
However the same approach does not appear to work for nested fields. What then are possible ways to vectorize the above for-loop?

You cannot really vectorize it. The problem is that MATLAB does not allow most forms of nested indexing, including through [].
The most concise / readable option would be to concatenate the s.f1 results into a structure array using [...], and then index into the new structure array with a separate call:
x = [s.f1]; c = {x.fa};
If you have the Mapping Toolbox, you could use extractfield to perform the second indexing in one expression:
c = extractfield([s.f1], 'fa');
Alternatively you could write a one-liner using arrayfun - here's a couple of options:
c = arrayfun(@(x) x.f1.fa, s, 'uni', false);
c = arrayfun(@(x) x.fa, [s.f1], 'uni', false);
Note that arrayfun and similar functions are generally slower than explicit for loops. So if the performance is critical, time / profile your code, before making a decision to get rid of the loop.
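A sketch that rebuilds the struct from the question and checks that the loop, the two-step indexing and the arrayfun variant all produce the same cell array. The names cLoop, cTwoStep, cArrayfun and sameResult are introduced here for the comparison only.

```matlab
% Rebuild the example struct from the question
s(1).f1.fa = 'foo';  s(2).f1.fa = 'yedd';
s(1).f1.fb = 'raf';  s(2).f1.fb = 'da';
s(1).f2 = 'bok';     s(2).f2 = 'kemb';

% Reference: explicit for loop
n = numel(s);
cLoop = cell(1, n);
for k = 1:n
    cLoop{k} = s(k).f1.fa;
end

% Two-step indexing through a temporary struct array
x = [s.f1];
cTwoStep = {x.fa};

% arrayfun with non-uniform (cell) output
cArrayfun = arrayfun(@(e) e.f1.fa, s, 'UniformOutput', false);

sameResult = isequal(cLoop, cTwoStep, cArrayfun);
```

The two-step form only works when every s(k).f1 has the same field names, since [s.f1] has to concatenate them into one struct array.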

I am looking for an elegant means of extracting nested data from a MATLAB data structure

Using MATLAB, other than the brute force technique of using nested FOR loops, I am curious if there is a more elegant means of extracting the X & Y data from the sample data structure that I have shown below. I haven't been able to devise an elegant way of doing this in MATLAB using bsxfun, arrayfun, or structfun.
% Create an example of the input structure that I need to parse
for i =1:100
setName = ['n' num2str(i)];
for j = 1:randi(10,1)
repName = ['n' num2str(j)];
data.sets.(setName).replicates.(repName).X = i + randn();
data.sets.(setName).replicates.(repName).Y = i + randn();
end
end
clearvars -except data
% Brute force technique using nested FOR Loops to extract X & Y from this
% nested structure for easy plotting. Is there a better way to extract the
% X & Y values created above without using FOR loops?
n = 1;
setNames = fieldnames(data.sets);
for i =1:length(setNames)
replicateNames = fieldnames(data.sets.(setNames{i}).replicates);
for j = 1:length(replicateNames)
X(n) = data.sets.(setNames{i}).replicates.(replicateNames{j}).X;
Y(n) = data.sets.(setNames{i}).replicates.(replicateNames{j}).Y;
n = n+1;
end
end
scatter(X,Y);

MATLAB works best with arrays/matrices (be it numeric arrays, struct arrays, cell arrays, object arrays, etc..). The language offers constructs to slice and index into arrays easily.
So the idiomatic way in MATLAB would have been to create a non-scalar structure array, as opposed to a deeply nested structure.
For example, let's first convert the nested structure into a 2D array of structures, where the first dimension denotes the "replicates", and the second dimension denotes the "sets":
ds = struct('X',[], 'Y',[]);
sets = fieldnames(data.sets);
for i=1:numel(sets)
reps = fieldnames(data.sets.(sets{i}).replicates);
for j=1:numel(reps)
ds(j,i) = data.sets.(sets{i}).replicates.(reps{j});
end
end
The result is a 10-by-100 structure array, each with two fields X and Y:
>> ds
ds =
10x100 struct array with fields:
X
Y
Accessing data.sets.n99.replicates.n9 in the original structure would be equivalent to ds(9,99) in the new structure.
>> data.sets.n99.replicates.n9
ans =
X: 100.3616
Y: 98.8023
>> ds(9,99)
ans =
X: 100.3616
Y: 98.8023
This new struct has the benefit that it can easily be accessed using array-indexing notation and comma-separated lists. So we can extract the X and Y vectors like you did simply as:
XX = [ds.X]; % or XX = cat(2, ds.X)
YY = [ds.Y];
scatter(XX, YY, 1)
So if you had control over building the struct, I would design it as described above to begin with. Otherwise the double for-loop in your code with the dynamic field names is the best way to extract the values from it.
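One detail worth seeing on a tiny case: when the sets have different numbers of replicates, the unfilled slots of the struct array hold empty fields, and those simply vanish in the [ds.X] concatenation. A minimal sketch with made-up values:

```matlab
ds = struct('X', [], 'Y', []);     % start with one empty element
ds(1,1).X = 1;  ds(1,1).Y = 10;    % set 1, replicate 1
ds(2,1).X = 2;  ds(2,1).Y = 20;    % set 1, replicate 2
ds(1,2).X = 3;  ds(1,2).Y = 30;    % set 2, replicate 1
                                   % ds(2,2) is created implicitly with empty fields

XX = [ds.X];                       % elements are visited in column-major order
YY = [ds.Y];
```

Because the empties drop out of X and Y in the same positions, XX and YY stay aligned and can go straight into scatter.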
You could probably write a bunch of structfun called on each other, but that won't be the most readable code. Here is what I came up with to flatten the nested structure:
D = structfun(@(n) ...
structfun(@(nn) [nn.X nn.Y], n.replicates, 'UniformOutput',false), ...
data.sets, 'UniformOutput',false);
The resulting structure can be accessed with less nested fields:
>> D.n99.n9
ans =
100.3616
98.8023
Slightly better than the original one, but still not easily traversed without some for-loops.
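If the goal is only the flat X and Y vectors, another loop-free route can be sketched with struct2cell and cellfun. This is an alternative to the structfun version above rather than a restatement of it, and it assumes every replicate has exactly the fields X and Y in the same order. The three-replicate data below is made up to keep the example short.

```matlab
% Hypothetical miniature of the nested structure from the question
data.sets.n1.replicates.n1.X = 1;  data.sets.n1.replicates.n1.Y = 2;
data.sets.n1.replicates.n2.X = 3;  data.sets.n1.replicates.n2.Y = 4;
data.sets.n2.replicates.n1.X = 5;  data.sets.n2.replicates.n1.Y = 6;

setCells = struct2cell(data.sets);                 % one cell per set
repCells = cellfun(@(s) struct2cell(s.replicates), ...
                   setCells, 'UniformOutput', false);
repCells = vertcat(repCells{:});                   % one cell per replicate
reps     = [repCells{:}];                          % 1-by-K struct array

X = [reps.X];
Y = [reps.Y];
```

The hidden loops inside cellfun mean this is unlikely to be faster than the explicit double loop; it only trades the loop bookkeeping for a few intermediate variables.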

Since we often are "given" deeply nested structures from sources we can't control (other business units, customers, etc.), sometimes a baby's gotta do what a baby's gotta do. Here's a hack that seems to work to completely flatten a nested structure. Copyright Carl Witthoft under usual GPL-3 rules.
% struct2sims converter
function simout = struct2sims(structin)
fnam = fieldnames(structin);
for jf = 1:numel(fnam)
subnam = [inputname(1),'_',fnam{jf}];
if isstruct(structin.(fnam{jf}) ) ,
% need to dive; build a new variable that's not a substruct
eval(sprintf('%s = structin.(fnam{jf});', fnam{jf}));
eval(sprintf('simtmp = struct2sims(%s);',fnam{jf}) );
% try removing the struct before getting any farther...
simout.(subnam) = simtmp;
else
% at bottom, ok
simout.(subnam) = structin.(fnam{jf});
end
end
% need to unpack structs here, after each level of recursion
% returns...
subfnam = fieldnames(simout);
for kf = 1:numel(subfnam)
if isstruct(simout.(subfnam{kf}) ),
subsubnam = fieldnames(simout.(subfnam{kf}));
for fk = 1:numel(subsubnam)
simout.([inputname(1),'_',subsubnam{fk}])...
= simout.(subfnam{kf}).(subsubnam{fk}) ;
end
simout = rmfield(simout,subfnam{kf});
end
end
% if desired write to file with:
% save('flattened','-struct','simout');
end

MATLAB: vectorize filling of 3D-array

I would like to save a number of grayscale images (2D arrays) as layers in a 3D array.
Because it has to be very fast for a real-time application, I would like to vectorize the following code, where m is the number of shifts:
for i=1:m
array(:,:,i)=imabsdiff(circshift(img1,[0 i-1]), img2);
end
nispio showed me a very advanced version, which you can see here:
I = speye(size(img1,2)); E = -1*I;
ii = toeplitz(1:m,[1,size(img1,2):-1:2]);
D = vertcat(repmat(I,1,m),E(:,ii));
data_c = reshape(abs([double(img1),double(img2)]*D),size(data_r,1),size(data_r,2),m);
At the moment the results of both operations are not the same; maybe it shifts the image in the wrong direction. My knowledge is very limited, so I don't understand the code completely.

You could do this:
M = 16; N = 20; img1 = randi(255,M,N); % Create a random M x N image
ii = toeplitz(1:N,circshift(fliplr(1:N)',1)); % Create an indexing variable
% Create layers that are shifted copies of the image
array = reshape(img1(:,ii),M,N,N);
As long as your image dimensions don't change, you only ever need to create the ii variable once. After that, you can call the last line each time your image changes. I don't know for sure that this will give you a speed advantage over a for loop, but it is vectorized like you requested. :)
UPDATE
In light of the new information shared about the problem, this solution should give you a speedup of an order of magnitude or more:
clear all;
% Set image sizes
M = 360; N = 500;
% Number of column shifts to test
ncols = 200;
% Create comparison matrix (see NOTE)
I = speye(N); E = -1*I;
ii = toeplitz([1:N],[1,N:-1:(N-ncols+2)]);
D = vertcat(repmat(I,1,ncols),E(:,ii));
% Generate some test images
img1 = randi(255,M,N);
img2 = randi(255,M,N);
% Compare images (vectorized)
data_c = reshape(abs([img2,img1]*D),M,N,ncols);
% Compare images (for loop)
array = zeros(M,N,ncols); % <-- Pre-allocate this array!
for i=1:ncols
array(:,:,i)=imabsdiff(circshift(img1,[0 i-1]),img2);
end
This uses matrix multiplication to do the comparisons instead of generating a whole bunch of shifted copies of the image.
NOTE: The matrix D should only be generated one time if your image size is not changing. Notice that the D matrix is completely independent of the images, so it would be wasteful to regenerate it every time. However, if the image size does change, you will need to update D.
Edit: I have updated the code to more closely match what you seem to be looking for. I also included the "original" for-loop implementation to show that they give the same result. One thing worth noting about the vectorized version is that it has the potential to be very memory intensive. If ncols = N then the D matrix has on the order of N^3 elements. Even though D is sparse, things fall apart fast when you multiply D by the non-sparse images.
Also, notice that I pre-allocate array before the for loop. This is always good practice in Matlab, where practical, and it will almost invariably give you a large performance boost over the dynamic sizing.
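To check the claim that both versions agree without needing the Image Processing Toolbox, here is a small sketch on a 3-by-5 test case. It replaces imabsdiff with abs of a plain difference (equivalent for double inputs), and the names ref and agree as well as the tiny test images are assumptions made for the check.

```matlab
M = 3;  N = 5;  ncols = 4;              % tiny hypothetical sizes
img1 = magic(5);  img1 = img1(1:M, :);  % deterministic test image
img2 = ones(M, N);

% Comparison matrix, built as in the answer above
I  = speye(N);  E = -I;
ii = toeplitz(1:N, [1, N:-1:(N-ncols+2)]);
D  = vertcat(repmat(I, 1, ncols), E(:, ii));

% Vectorized version; full() guards against a sparse product before reshape
P      = full(abs([img2, img1] * D));
data_c = reshape(P, M, N, ncols);

% Loop version, with abs(a - b) standing in for imabsdiff
ref = zeros(M, N, ncols);
for k = 1:ncols
    ref(:, :, k) = abs(circshift(img1, [0 k-1]) - img2);
end

agree = isequal(data_c, ref);
```

With integer-valued test images the products are exact, so isequal is a fair comparison here; with real image data a tolerance-based check would be safer.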

If I understand the question correctly, I think you need a for loop:
for v=1:1:20
array(:,:,v)=circshift(image,[0 v]);
end
