I have a matrix A, which is m x n. I want to count the number of NaN elements in each row. If that count is greater than or equal to some arbitrary threshold, then all the values in that row should be set to NaN.
num_obs = sum(isnan(rets), 2);
index = num_obs >= min_obs;
As I said, I am struggling to get my brain to work. I have been trying different variations of the line below, but with no luck.
rets(index==0, :) = rets(index==0, :) .* NaN;
The Example data for threshold >= 1 is:
A = [-7 -8 1.6 11.9;
NaN NaN NaN NaN;
5.5 6.3 2.1 NaN;
5.5 4.2 2.2 5.6;
NaN NaN NaN NaN];
and the result I want is:
A = [-7 -8 1.6 11.9;
NaN NaN NaN NaN;
NaN NaN NaN NaN;
5.5 4.2 2.2 5.6;
NaN NaN NaN NaN];
Use
A = magic(4);A(3,3)=nan;
threshold=1;
for ii = 1:size(A,1) % loop over rows
if sum(isnan(A(ii,:)))>=threshold % get the nans, sum the occurrences
A(ii,:)=nan(1,size(A,2)); % fill the row with column width amount of nans
end
end
Results in
A =
16 2 3 13
5 11 10 8
NaN NaN NaN NaN
4 14 15 1
Or, as @Obchardon mentioned in his comment, you can vectorise:
A(sum(isnan(A),2)>=threshold,:) = NaN
A =
16 2 3 13
5 11 10 8
NaN NaN NaN NaN
4 14 15 1
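As a quick check, the vectorised form can be applied to the example data from the question. This is only a sketch: the names nanCount and rowMask are introduced here for readability and are not part of the original answer.

```matlab
% Example data from the question
A = [-7   -8   1.6  11.9;
     NaN  NaN  NaN  NaN;
     5.5  6.3  2.1  NaN;
     5.5  4.2  2.2  5.6;
     NaN  NaN  NaN  NaN];
threshold = 1;

nanCount = sum(isnan(A), 2);       % number of NaNs in each row (5x1)
rowMask  = nanCount >= threshold;  % true for the rows to blank out
A(rowMask, :) = NaN;               % the scalar NaN expands over the selected rows
```

Note that the attempt in the question indexes with index==0, which selects the rows below the threshold rather than the rows at or above it.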
As a side-note you can easily change this to columns, simply do all indexing for the other dimension:
A(:,sum(isnan(A),1)>=threshold) = NaN;
Instead of the isnan function, you can use A ~= A to find the NaN elements, because NaN is the only value that does not compare equal to itself.
A(sum((A ~= A),2) >= t,:) = NaN
where t is the threshold for the minimum number of existing NaN elements.
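A small sketch of why the comparison works, using a made-up 3x3 matrix (the data and the names selfUnequal and sameMask are illustrative assumptions):

```matlab
A = [1   NaN 3;
     4   5   6;
     NaN NaN 9];
t = 2;                                      % threshold on the NaN count per row

selfUnequal = (A ~= A);                     % true exactly where A is NaN
sameMask = isequal(selfUnequal, isnan(A));  % both masks should agree

A(sum(selfUnequal, 2) >= t, :) = NaN;       % only row 3 has two or more NaNs
```

isnan tends to state the intent more clearly, so the A ~= A form is mostly a curiosity.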
My RHSvec is a 51X21 matrix. kdpolind is 11X51X21. Doing the following:
[RHSval,kprimeind] = max(RHSvec,[],2);
gives me a 51X1 RHSval and a 51X1 kprimeind.
if kprimeind is as follows:
16
20
20
16
20
16
16
then I want to store kprimeind in kdpolind as
kdpolind(act,1,16)
kdpolind(act,2,20)
kdpolind(act,3,20)
kdpolind(act,4,16)
...
I am unable to do this due to dimensions mismatch. Is there a simple way of doing this?
Thanks!
If I understand you correctly, you want something like this:
An example of how to insert a matrix of a different size into another matrix
sub = randn(2,3); % Will give a random matrix of 2 rows and 3 columns
M = nan(3,4,5); % Creates a nan matrix of 3 by 4 by 5
M(2,2+(1:size(sub,1)),2+(1:size(sub,2))) = sub % Inserts the sub matrix into M with an offset of 2 (can be set to 0 for no offset)
will give:
M(:,:,1) =
NaN NaN NaN NaN
NaN NaN NaN NaN
NaN NaN NaN NaN
M(:,:,2) =
NaN NaN NaN NaN
NaN NaN NaN NaN
NaN NaN NaN NaN
M(:,:,3) =
NaN NaN NaN NaN
NaN NaN 0.3252 -0.7549
NaN NaN NaN NaN
M(:,:,4) =
NaN NaN NaN NaN
NaN NaN 1.3703 -1.7115
NaN NaN NaN NaN
M(:,:,5) =
NaN NaN NaN NaN
NaN NaN -0.1022 -0.2414
NaN NaN NaN NaN
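If the goal is instead to write one element per row at the positions (act, i, kprimeind(i)), linear indexing via sub2ind is one possible route. The sketch below makes several assumptions that the question does not confirm: act is a scalar index, the random RHSvec is only stand-in data, and the value stored is kprimeind itself (it could equally be RHSval).

```matlab
RHSvec   = rand(51, 21);       % stand-in data (assumption)
kdpolind = zeros(11, 51, 21);
act      = 3;                  % hypothetical value for the first index

[RHSval, kprimeind] = max(RHSvec, [], 2);   % both are 51x1

rows = (1:numel(kprimeind))';               % second subscript: 1..51
% sub2ind wants subscript arrays of equal size, so act is expanded to 51x1
lin = sub2ind(size(kdpolind), act*ones(size(rows)), rows, kprimeind);
kdpolind(lin) = kprimeind;                  % or RHSval, depending on what should be stored
```

Indexing directly with kdpolind(act, 1:51, kprimeind) would address a 51x51 block rather than 51 individual elements, which is the usual source of the dimension mismatch.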
I have two dataset arrays, A and B. They are two different, independent measurements (e.g. smell and color of some object).
For each data entry in A and B, I have a time, t, and a location, p of the measurement. The majority of the smell and color measurements were taken at the same time and location. However, there are some times where data is missing (i.e. at some time there was no color measurement and only a smell measurement). Similarly, there are some locations where some data is missing (i.e. at some location there was only color measurement and no smell measurement).
I want to build arrays of A and B which have the same size where each row corresponds to a full set of all times and each column corresponds to a full set of all locations. If there is data missing, I want that entry to be NaN.
Below is an example of what I want to do:
%Inputs
A = [0 0 1 2 4; 1 1 3 3 2; 4 4 1 0 3];
t_A = [0.03 1.6 3.9]; %Times when A was measured (rows of A)
L_A = [1.0 2.9 2.98 4.2 6.33]; %Locations where A was measured (columns of A)
B = [10 13 10 10; 15 13 13 12; 14 14 13 12; 15 19 11 13];
t_B = [0.03 1.6 1.9 3.9]; %Times when B was measured (rows of B)
L_B = [2.1 2.9 2.98 5.0]; %Locations where B was measured (columns of B)
What I want is some code to transform these datasets into the following:
t = [0.03 1.6 1.9 3.9];
L = [1.0 2.1 2.9 2.98 4.2 5.0 6.33];
A_new = [0 NaN 0 1 2 NaN 4; 1 NaN 1 3 3 NaN 2; NaN NaN NaN NaN NaN NaN NaN; 4 NaN 4 1 0 NaN 3];
B_new = [NaN 10 13 10 NaN 10 NaN; NaN 15 13 13 NaN 12 NaN; NaN 14 14 13 NaN 12 NaN; NaN 15 19 11 NaN 13 NaN];
The new arrays, A_new and B_new, are the same size, and the vectors t and L (corresponding to the rows and columns) are sorted. The original A had no data at t = 1.9, so the 3rd row of A_new is all NaN. The same applies to columns 2 and 6 in A_new and columns 1, 5 and 7 in B_new.
How can I do this in MATLAB quickly for a large dataset?
Create a matrix of NaNs, use the third output of the unique function to convert the floating-point values to integer indices, and use matrix indexing to fill the matrices:
[t,~,it] = unique([t_A t_B]);
[L,~,iL] = unique([L_A L_B]);
A_new = NaN(numel(t),numel(L));
A_new(it(1:numel(t_A)),iL(1:numel(L_A))) = A;
B_new = NaN(numel(t),numel(L));
B_new(it(numel(t_A)+1:end),iL(numel(L_A)+1:end)) = B;
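The same idea can be checked against the data in the question. This sketch uses union and ismember in place of the third output of unique, which is a substitution and not the answer's method; rA, cA, rB and cB are names introduced here. Both approaches match values by exact equality, so times or locations that differ by rounding error would be treated as distinct.

```matlab
% Inputs from the question
A   = [0 0 1 2 4; 1 1 3 3 2; 4 4 1 0 3];
t_A = [0.03 1.6 3.9];
L_A = [1.0 2.9 2.98 4.2 6.33];
B   = [10 13 10 10; 15 13 13 12; 14 14 13 12; 15 19 11 13];
t_B = [0.03 1.6 1.9 3.9];
L_B = [2.1 2.9 2.98 5.0];

t = union(t_A, t_B);          % sorted, duplicate-free times
L = union(L_A, L_B);          % sorted, duplicate-free locations

[~, rA] = ismember(t_A, t);   % row positions of A within t
[~, cA] = ismember(L_A, L);   % column positions of A within L
[~, rB] = ismember(t_B, t);
[~, cB] = ismember(L_B, L);

A_new = NaN(numel(t), numel(L));
A_new(rA, cA) = A;
B_new = NaN(numel(t), numel(L));
B_new(rB, cB) = B;
```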
I have a NaN (155*135) matrix, and another matrix showing a specific value with row and column numbers. Is there a way that I can assign these values back to the NaN matrix eventually having the same location and everything else remaining as NaN?
R C Value
19 4 -1133.803
20 4 -295.6810
32 4 -1906.021
20 5 -1027.048
21 5 -293.0065
32 5 236.0525
33 5 -425.1248
Use sub2ind:
data = [
% R C Value
19 4 -1133.803
20 4 -295.6810
32 4 -1906.021
20 5 -1027.048
21 5 -293.0065
32 5 236.0525
33 5 -425.1248];
N = nan(155,135);
N(sub2ind(size(N),data(:,1),data(:,2))) = data(:,3);
So you get for N(min(data(:,1)):max(data(:,1)),min(data(:,2)):max(data(:,2))) (i.e. N(19:33,4:5)):
ans =
-1133.8 NaN
-295.68 -1027
NaN -293.01
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
-1906 236.05
NaN -425.12
You can use accumarray:
result = accumarray([R C] , Value,[155,135],[],NaN)
Note: R and C assumed to be column vectors
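A short sketch comparing the accumarray route with indexed assignment on the table from the question (the names viaAccum, viaIndex and sameResult are introduced here for illustration):

```matlab
data = [19 4 -1133.803
        20 4 -295.6810
        32 4 -1906.021
        20 5 -1027.048
        21 5 -293.0065
        32 5  236.0525
        33 5 -425.1248];
R = data(:,1);  C = data(:,2);  Value = data(:,3);   % column vectors

% accumarray: the default function is sum, and unfilled cells get the fill value NaN
viaAccum = accumarray([R C], Value, [155 135], [], NaN);

% indexed assignment through linear indices, for comparison
viaIndex = NaN(155, 135);
viaIndex(sub2ind([155 135], R, C)) = Value;

sameResult = isequaln(viaAccum, viaIndex);
```

The two agree here because every (R, C) pair is unique. With repeated pairs, accumarray would sum the values, while indexed assignment would keep only the last one.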
I have two vectors 1x5000. They consist of numbers like this:
vec1 = [NaN NaN 2 NaN NaN NaN 5 NaN 8 NaN NaN 7 NaN 5 NaN 3 NaN 4]
vec2 = [NaN 2 NaN NaN 5 NaN NaN NaN 8 NaN 1 NaN NaN NaN 5 NaN NaN NaN]
I would like to check whether the numbers appear in the same order, independent of the NaNs. But I do not want to remove the NaNs (Not-a-Number) since I will use them later. So I create a new vector and call it results. Where the numbers come in the same order, the comparison is correct and we fill results with 1. If the next numbers are not equal, we add 0 to results.
An example results would look like this for vec1 and vec2:
[1 1 1 0 1 0 0]
The first 3 numbers are the same, then 7 is compared to 1 which gives 0, then 5 compared to 5 is true which gives 1. Then the last two numbers are missing which gives 0.
One reason that I don't want to remove the NaNs is that I have a time vector 1x500 and somehow I want to get the time for each 1 and 0 (in a new vector). Is that possible too?
Help is super appreciated!
This is how I would do it:
temp1 = vec1(~isnan(vec1));
temp2 = vec2(~isnan(vec2));
m = min(numel(temp1), numel(temp2));
M = max(numel(temp1), numel(temp2));
results = [(temp1(1:m) == temp2(1:m)), false(1,M-m)];
Note that here results is a logical array. If you need it to be numeric, you can convert it, for instance with double.
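Run on the two vectors from the question, the steps above can be sketched as follows; resultsNumeric is a name introduced here for the converted copy.

```matlab
vec1 = [NaN NaN 2 NaN NaN NaN 5 NaN 8 NaN NaN 7 NaN 5 NaN 3 NaN 4];
vec2 = [NaN 2 NaN NaN 5 NaN NaN NaN 8 NaN 1 NaN NaN NaN 5 NaN NaN NaN];

temp1 = vec1(~isnan(vec1));            % 2 5 8 7 5 3 4
temp2 = vec2(~isnan(vec2));            % 2 5 8 1 5
m = min(numel(temp1), numel(temp2));   % length of the overlapping part
M = max(numel(temp1), numel(temp2));   % length of the longer sequence

% compare the overlap, then pad with false for the unmatched tail
results = [(temp1(1:m) == temp2(1:m)), false(1, M-m)];
resultsNumeric = double(results);      % numeric copy of the logical result
```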
Regarding your concern about the NaNs, it depends on what you want to do with your arrays. If you are going to process them, it is more convenient to remove the NaNs. To keep track of things, you can store the indices of the kept elements:
id1 = find(~isnan(vec1));
vec1 = vec1(id1);
vec1 =
2 5 8 7 5 3 4
id1 =
3 7 9 12 14 16 18
% and same for vec2
If you decide to remove the NaNs, the solution will be the same, with all temps replaced with vec.
This would be my solution, using a mix of logical indexing and the find function. Returning the timestamps for the 1's and 0's is actually more tedious than finding the 1's and 0's.
vec1 = [NaN NaN 2 NaN NaN NaN 5 NaN 8 NaN NaN 7 NaN 5 NaN 3 NaN 4];
vec2 = [NaN 2 NaN NaN 5 NaN NaN NaN 8 NaN 1 NaN NaN NaN 5 NaN NaN NaN];
t=1:numel(vec1);
ind1=find(~isnan(vec1));
ind2=find(~isnan(vec2));
v1=vec1(ind1);
v2=vec2(ind2);
if length(v1)>length(v2)
ibig=1;
else
ibig=2;
end
n=min(length(v1),length(v2));
N=max(length(v1),length(v2));
v=false(1,N);
v(1:n)=v1(1:n)==v2(1:n);
t_ones1=t(ind1(v));
t_ones2=t(ind2(v));
if ibig==1
t_zeros1=t(ind1(~v));
t_zeros2=t(ind2(~v(1:n)));
else
t_zeros1=t(ind1(~v(1:n)));
t_zeros2=t(ind2(~v));
end
I have this data:
minval = NaN 7 8 9 9 9 10 10 10 10
NaN NaN 10 10 10 10 10 10 10 10
NaN NaN NaN 10 10 9 10 10 10 9
NaN NaN NaN NaN 9 9 10 9 10 10
NaN NaN NaN NaN NaN 9 10 10 10 10
NaN NaN NaN NaN NaN NaN 10 11 10 10
NaN NaN NaN NaN NaN NaN NaN 10 10 10
NaN NaN NaN NaN NaN NaN NaN NaN 10 10
NaN NaN NaN NaN NaN NaN NaN NaN NaN 10
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
and I do the following:
C=size(minval,2);
for e=2:C
D1(1,e)=minval(1,e);
end
D1(D1 == 0) = nan;
for e=3:C
for b=2:e-1
D2(b,e)= minval(b,e)+D1(1,b-1);
D2(D2 == 0) = nan;
[D1(2,e), idx_bt(1,e)]=min(nonzeros(D2(:,e)));
end
end
D1(D1 == 0) = nan;
for e=4:C
for b=3:e-1
D3(b,e)= minval(b,e)+D1(2,b-1);
D3(D3 == 0) = nan;
[D1(3,e), idx_bt(2,e)]=min(nonzeros(D3(:,e)));
end
end
D1(D1 == 0) = nan;
It works well and gives me the right answer:
D1 = NaN 7 8 9 9 9 10 10 10 10
NaN NaN NaN 17 17 16 17 17 17 16
NaN NaN NaN NaN NaN 26 27 26 26 26
and
idx_bt = 0 2 3 4 5 6 7 8 9 10
0 0 1 3 3 3 3 3 3 3
I guess there's a trick to make this code more simple and faster. Is there any help? Thank you.
The crux of the following code is bsxfun, which is one of the ways to vectorize code.
Code
%%// Get C
C=size(minval,2);
%%// Declare variables to store required outputs
D1 = NaN(3,C);
idx_bt = zeros(2,C);
%%// --------- STAGE 0 -------------------------
D1(1,2:end) = minval(1,2:C);
%%// --------- STAGE 1 -------------------------
ft1 = bsxfun(@plus,minval(2:C-1,3:C),D1(1,1:C-2)');%%//'
ft1 = [zeros(1,size(ft1,2)) ;ft1];
ft1(ft1==0) = NaN;
D2 = ft1;
[D1(2,3:end) ,idx_bt(1,3:end)] = nanmin(D2);
%%// Probably do not need this given your data, but if you have zeros
%%// alongwith the NaNs and if you are looking to replace
%%// those zeros with NaNs you might. So, it all depends on your data.
%%// This could be looked after later on in the code as well.%%//'
D1(D1 == 0) = NaN;
%%// --------- STAGE 2 -------------------------
ft11 = bsxfun(@plus,minval(3:C-1,4:C),D1(2,2:C-2)');%%//'
ft11 = [zeros(2,size(ft11,2)) ;ft11];
ft11(ft11==0) = NaN;
D3 = ft11;
[D1(3,4:end) ,idx_bt(2,4:end)] = nanmin(D3);
D1(D1 == 0) = NaN;
Output
D1 =
NaN 7 8 9 9 9 10 10 10 10
NaN NaN NaN 17 17 16 17 17 17 16
NaN NaN NaN NaN NaN 26 27 26 26 26
idx_bt =
0 0 1 3 3 3 3 3 3 3
0 0 0 1 1 5 5 7 7 7
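nanmin belongs to the Statistics Toolbox. As a hedged alternative, the built-in min also skips NaN values by default, so stage 1 can be sketched without the toolbox. This is a variation on the answer and not its original code: the names cand and idx are introduced here, and the padding row is written as NaN directly instead of zeros that are converted afterwards.

```matlab
minval = [NaN 7   8   9   9   9   10  10  10  10;
          NaN NaN 10  10  10  10  10  10  10  10;
          NaN NaN NaN 10  10  9   10  10  10  9;
          NaN NaN NaN NaN 9   9   10  9   10  10;
          NaN NaN NaN NaN NaN 9   10  10  10  10;
          NaN NaN NaN NaN NaN NaN 10  11  10  10;
          NaN NaN NaN NaN NaN NaN NaN 10  10  10;
          NaN NaN NaN NaN NaN NaN NaN NaN 10  10;
          NaN NaN NaN NaN NaN NaN NaN NaN NaN 10;
          NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN];
C = size(minval, 2);

D1 = NaN(2, C);
D1(1, 2:end) = minval(1, 2:end);                 % stage 0

% stage 1: entry (b,e) holds minval(b,e) + D1(1,b-1) for b = 2..C-1, e = 3..C
cand = bsxfun(@plus, minval(2:C-1, 3:C), D1(1, 1:C-2)');
cand = [NaN(1, size(cand, 2)); cand];            % pad so row numbers line up with minval
[D1(2, 3:end), idx] = min(cand, [], 1);          % min ignores NaN unless a column is all NaN
```

For this data, the second row of D1 matches the stage 1 output shown above.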