MATLAB: how to interpolate non-monotonic data - arrays

I have a question about interpolating vectors that are not monotonic.
The data vectors look as follows:
x = x1 = y =
20.0000 21.6000 32
21.8000 19.8000 132
22.2000 18.0000 193
21.4000 16.6000 351
20.2000 17.0000 576
20.6000 16.0000 649
20.3000 13.4000 686
19.4000 12.2000 806
16.9000 11.4000 1117
15.8000 11.2000 1252
15.6000 10.9000 1281
15.3000 9.7000 1379
14.8000 9.2000 1527
14.5000 8.7000 1577
12.4000 7.2000 1943
11.8000 5.0000 2047
10.4000 3.0000 2282
5.3000 2.1000 2840
3.5000 2.0000 3047
2.6000 1.8000 3140
(small part)
I would like to obtain 'y1' by interpolating these using:
y1 = interp1(x,y,x1);
but x and x1 are not monotonic.
y1 should be as long as y.
Do you have an idea of how to perform the interpolation?

Sort both y and x so that x is monotonic, then sort x1 and use it as shown.
See if the code below helps.
[new_x,indx]=sort(x);
new_y=y(indx);
new_x1=sort(x1);
% resolve duplicate x entries with either (1) unique or (2) averaging; use only one of the two
[temp_new_x,indx]=unique(new_x);
% (1) keep a single y for each repeated x value (first or last occurrence, depending on the MATLAB version)
new_y=new_y(indx);
new_x=temp_new_x;
% (2) alternative to (1): average the y values of repeated x entries
% new_y = arrayfun(@(C) mean(new_y(C==new_x)),temp_new_x);
% new_x=temp_new_x;
y1=interp1(new_x,new_y,new_x1);
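The averaging option for repeated x values can also be sketched outside MATLAB. The following is a minimal plain-Python illustration, not the answer's code; the function name average_duplicates and the sample values are hypothetical.

```python
from collections import defaultdict

def average_duplicates(x, y):
    """Return the sorted unique x values and the mean y for each of them."""
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    unique_x = sorted(groups)
    mean_y = [sum(groups[xi]) / len(groups[xi]) for xi in unique_x]
    return unique_x, mean_y

# Hypothetical sample in which x = 2.0 appears twice.
ux, my = average_duplicates([3.0, 2.0, 1.0, 2.0], [30.0, 10.0, 5.0, 20.0])
print(ux, my)
```

The result is strictly increasing in x, which is what an interpolation routine such as interp1 needs.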

Sort the x data and re-index the y data using the results, then interpolate:
[sortedX, IX] = sort(x);
y1 = interp1(sortedX, y(IX), x1);
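For readers who want to check the idea without MATLAB, here is a rough plain-Python sketch of the same steps: sort x, reorder y with the same permutation, then interpolate linearly at the query points in their original order. The function interp_linear and the sample values are hypothetical, and the sketch assumes at least two distinct x values with no duplicates.

```python
from bisect import bisect_right

def interp_linear(x, y, xq):
    """Linear interpolation of (x, y) at the points xq; x need not be sorted.

    Queries outside [min(x), max(x)] give NaN.
    Assumes x holds at least two values and no duplicates.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])  # sorting permutation
    xs = [x[i] for i in order]
    ys = [y[i] for i in order]
    out = []
    for q in xq:
        if q < xs[0] or q > xs[-1]:
            out.append(float("nan"))
            continue
        j = min(bisect_right(xs, q), len(xs) - 1)  # xs[j-1] <= q <= xs[j]
        t = (q - xs[j - 1]) / (xs[j] - xs[j - 1])
        out.append(ys[j - 1] + t * (ys[j] - ys[j - 1]))
    return out

# Hypothetical unsorted sample; the last query lies outside the x range.
y1 = interp_linear([3.0, 1.0, 2.0], [30.0, 10.0, 20.0], [1.5, 2.5, 0.0])
print(y1)
```

The query points are not sorted here, so the output keeps the order and length of the query vector.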

Related

Python: ValueError when imputing with sklearn.impute.IterativeImputer and BayesianRidge()

PROBLEM
Use IterativeImputer from sklearn.impute to fit a BayesianRidge() regression model for imputing missing data in the variable 'Frontage'.
The call interative_imputer_fit = interative_imputer.fit(data_array) runs, but inside the function imputer_bay_ridge(data) the subsequent call interative_imputer_fit.transform(X) raises a ValueError. Two variables, Frontage and Area, were passed in, but only Frontage was inside the numpy array.
Python CODE using sklearn
from sklearn.experimental import enable_iterative_imputer
from sklearn.impute import IterativeImputer
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.linear_model import BayesianRidge
def imputer_bay_ridge(data):
    data_array = data.to_numpy()
    data_array.reshape(1, -1)
    interative_imputer = IterativeImputer(BayesianRidge())
    interative_imputer_fit = interative_imputer.fit(data_array)
    X = data['LotFrontage']
    data_imputed = interative_imputer_fit.transform(X)
train_data[['Frontage', 'Area']]
INVOKE FUNCTION
fit_tranformed_imputed = imputer_bay_ridge(train_data[['Frontage', 'Area']])
DATA EXAMPLE
train_data[['Frontage', 'Area']]
Frontage Area
0 65.0 8450
1 80.0 9600
2 68.0 11250
3 60.0 9550
4 84.0 14260
... ... ...
1455 62.0 7917
1456 85.0 13175
1457 66.0 9042
1458 68.0 9717
1459 75.0 9937
1460 rows × 2 columns
ERROR
ValueError Traceback (most recent call last)
Cell In[243], line 1
----> 1 fit_tranformed_imputed = imputer_bay_ridge(train_data[['LotFrontage', 'LotArea']])
Cell In[242], line 12, in imputer_bay_ridge(data)
10 interative_imputer_fit = interative_imputer.fit(data_array)
11 X = data['LotFrontage']
---> 12 data_imputed = interative_imputer_fit.transform(X)
File ~/opt/anaconda3/lib/python3.9/site-packages/sklearn/impute/_iterative.py:724, in IterativeImputer.transform(self, X)
707 """Impute all missing values in `X`.
708
709 Note that this is stochastic, and that if `random_state` is not fixed,
(...)
720 The imputed input data.
721 """
722 check_is_fitted(self)
--> 724 X, Xt, mask_missing_values, complete_mask = self._initial_imputation(X)
726 X_indicator = super()._transform_indicator(complete_mask)
728 if self.n_iter_ == 0 or np.all(mask_missing_values):
File ~/opt/anaconda3/lib/python3.9/site-packages/sklearn/impute/_iterative.py:514, in IterativeImputer._initial_imputation(self, X, in_fit)
511 else:
512 force_all_finite = True
--> 514 X = self._validate_data(
515 X,
516 dtype=FLOAT_DTYPES,
517 order="F",
518 reset=in_fit,
519 force_all_finite=force_all_finite,
520 )
521 _check_inputs_dtype(X, self.missing_values)
523 X_missing_mask = _get_mask(X, self.missing_values)
File ~/opt/anaconda3/lib/python3.9/site-packages/sklearn/base.py:566, in BaseEstimator._validate_data(self, X, y, reset, validate_separately, **check_params)
564 raise ValueError("Validation should be done on X, y or both.")
565 elif not no_val_X and no_val_y:
--> 566 X = check_array(X, **check_params)
567 out = X
568 elif no_val_X and not no_val_y:
File ~/opt/anaconda3/lib/python3.9/site-packages/sklearn/utils/validation.py:769, in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
767 # If input is 1D raise error
768 if array.ndim == 1:
--> 769 raise ValueError(
770 "Expected 2D array, got 1D array instead:\narray={}.\n"
771 "Reshape your data either using array.reshape(-1, 1) if "
772 "your data has a single feature or array.reshape(1, -1) "
773 "if it contains a single sample.".format(array)
774 )
776 # make sure we actually converted to numeric:
777 if dtype_numeric and array.dtype.kind in "OUSV":
ValueError: Expected 2D array, got 1D array instead:
array=[65. 80. 68. ... 66. 68. 75.].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

In MATLAB, put data into bins and calculate the mean

In MATLAB, say I have the following data:
data = [4 0.1; 6 0.5; 3 0.8; 2 1.4; 7 1.6; 12 1.8; 9 1.9; 1 2.3; 5 2.5; 5 2.6];
I want to place the 1st column into bins according to the elements in the 2nd column (i.e. 0-1, 1-2, 2-3...), and calculate the mean and 95% confidence interval of the elements in column 1 within each bin. So I'd have a matrix something like this:
mean    lower_95%    upper_95%    bin
4.33                              0
7.5                               1
3.67                              2
You can use accumarray with the appropriate function for the mean (mean) or the quantiles (quantile):
m = accumarray(floor(data(:,2))+1, data(:,1), [], @mean);
l = accumarray(floor(data(:,2))+1, data(:,1), [], @(x) quantile(x,.05));
u = accumarray(floor(data(:,2))+1, data(:,1), [], @(x) quantile(x,.95));
result = [m l u (0:numel(m)-1).'];
This can also be done calling accumarray once with cell array output:
result = accumarray(floor(data(:,2))+1, data(:,1), [],...
@(x) {[mean(x) quantile(x,.05) quantile(x,.95)]});
result = cell2mat(result);
For your example data,
result =
4.3333 3.0000 6.0000 0
7.5000 2.0000 12.0000 1.0000
3.6667 1.0000 5.0000 2.0000
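The grouping step that accumarray performs here (bin index = floor of column 2, then a reduction over column 1) can be cross-checked with a short plain-Python sketch. This is only an illustration of the per-bin means, not a port of accumarray; the names bins and bin_means are hypothetical, and the quantile columns are left out because quantile conventions differ between libraries.

```python
import math
from collections import defaultdict

# Same sample as in the question: (value, binning variable) pairs.
data = [(4, 0.1), (6, 0.5), (3, 0.8), (2, 1.4), (7, 1.6),
        (12, 1.8), (9, 1.9), (1, 2.3), (5, 2.5), (5, 2.6)]

# Group the first column by the integer part of the second column.
bins = defaultdict(list)
for value, key in data:
    bins[math.floor(key)].append(value)

# Mean of the grouped values, keyed by bin index.
bin_means = {b: sum(v) / len(v) for b, v in sorted(bins.items())}
print(bin_means)
```

The three means agree with the first column of the result shown above.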
This outputs a matrix with the labelled columns. Note that for your example data, 2 standard deviations from the mean (for the 95% confidence interval) gives values outside of the bands. With a larger (normally distributed) data set, you wouldn't see this.
Your data:
data = [4 0.1; 6 0.5; 3 0.8; 2 1.4; 7 1.6; 12 1.8; 9 1.9; 1 2.3; 5 2.5; 5 2.6];
Binning for output table:
% Initialise output matrix. Columns:
% Mean, lower 95%, upper 95%, bin left, bin right
bins = [0 1; 1 2; 2 3];
out = zeros(size(bins,1),5);
% Cycle through bins
for ii = 1:size(bins,1)
% Store logical array of which elements fit in given bin
% You may want to include edge case for "greater than or equal to" leftmost bin.
% Alternatively you could make the left bin equal to "left bin - eps" = -eps
bin = data(:,2) > bins(ii,1) & data(:,2) <= bins(ii,2);
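% Calculate mean, and mean +- 2*std deviation for confidence intervals
% Note: this uses column 2 (the binning variable); use data(bin,1) instead
% to get the statistics of column 1 that the question asks for
out(ii,1) = mean(data(bin,2));
out(ii,2) = out(ii,1) - 2*std(data(bin,2));
out(ii,3) = out(ii,1) + 2*std(data(bin,2));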
end
% Append bins to the matrix
out(:,4:5) = bins;
Output:
out =
0.4667 -0.2357 1.1690 0 1.0000
1.6750 1.2315 2.1185 1.0000 2.0000
2.4667 2.1612 2.7722 2.0000 3.0000
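As a cross-check of the loop above, here is a small plain-Python sketch of the same "mean plus or minus two sample standard deviations per bin" calculation. The function bin_stats and its col parameter are hypothetical: col=1 summarises the second column, which is what produces the numbers listed above, while col=0 summarises the first column.

```python
from statistics import mean, stdev

data = [(4, 0.1), (6, 0.5), (3, 0.8), (2, 1.4), (7, 1.6),
        (12, 1.8), (9, 1.9), (1, 2.3), (5, 2.5), (5, 2.6)]
edges = [(0, 1), (1, 2), (2, 3)]

def bin_stats(data, edges, col):
    """Mean and mean -/+ 2*std of column `col` for rows whose
    second column lies in the half-open interval (left, right]."""
    out = []
    for left, right in edges:
        vals = [row[col] for row in data if left < row[1] <= right]
        m, s = mean(vals), stdev(vals)  # stdev is the sample (n-1) deviation
        out.append((m, m - 2 * s, m + 2 * s, left, right))
    return out

stats_col2 = bin_stats(data, edges, col=1)  # statistics of the binning variable
stats_col1 = bin_stats(data, edges, col=0)  # statistics of the first column
print(stats_col2)
print(stats_col1)
```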

How to create arrays from rows using MATLAB

Hi guys, I need your help. I have an array
a b c n
1 1 2 4
1 3 2 6
1 6 0 7
and I want to create another array from each row of my array, see picture below.
I tried using this code,
assuming that my data is located in the array M:
for x=1:10
d = M(:,4)/(M(:,1) + M(:,2) + M(:,3) + x)
end
but it doesn't give my desired output.
In Excel you just write the equation and drag it down to get the answer, but I don't know how to do it in MATLAB. I think we could use a for loop. Thanks.
PLEASE SEE THE RED BOX, THAT'S MY DESIRED OUTPUT.
The equivalent in Matlab would be:
data = [...
1 1 2 4;
1 3 2 6;
1 6 0 7]
x = (1:10).';
f = @(t) data(t,4)./(data(t,1) + data(t,2) + data(t,3) + x )
y = [ x f(1) x f(2) x f(3) ]
or even simpler:
N = 10;
f = @(t) [(1:N).' data(t,4)./(data(t,1) + data(t,2) + data(t,3) + (1:N).' )]
y = [ f(1) f(2) f(3) ]
The number in f(...) always indicates which row of the input, and hence which y (e.g. y1, y2, etc.), you are calculating for each column of the output. The brackets [...] concatenate the results.
Be aware that you need to use the element-wise division operator ./
Generalized for an n x m sized input array, assuming that the n column is always the last one of your input matrix:
N = 10;
f = @(t) [(1:N).' data(t,end)./(sum( data(t,(1:end-1))) + (1:N).' )]
y = cell2mat(arrayfun(f, 1:size(data,1),'uni',0))
But in this case you should consider whether a more vectorized approach like Divakar's answer might be more appropriate.
result:
y =
1 0.8 1 0.85714 1 0.875
2 0.66667 2 0.75 2 0.77778
3 0.57143 3 0.66667 3 0.7
4 0.5 4 0.6 4 0.63636
5 0.44444 5 0.54545 5 0.58333
6 0.4 6 0.5 6 0.53846
7 0.36364 7 0.46154 7 0.5
8 0.33333 8 0.42857 8 0.46667
9 0.30769 9 0.4 9 0.4375
10 0.28571 10 0.375 10 0.41176
A vectorized approach, and another good case for bsxfun, that produces the desired output for a generic m x n sized input array -
N = 10; % Number of rows in the output
[m,n] = size(M) % Get size
sum_cols = sum(M(:,1:n-1),2) % Sum along dim-2 up to the second-last column
sum_firstN = bsxfun(@plus,sum_cols,1:N) % For each row sum, add 1:N
out1 = bsxfun(@ldivide,sum_firstN,M(:,n)).' % Elementwise divide the last column by the sums
out = [repmat((1:N)',1,m); out1] % Stack m copies of 1:N on top of the results
out = reshape(out,N,[]) % Reshape into desired shape
Code run for given 3 x 4 sized input array -
out =
1.0000 0.8000 1.0000 0.8571 1.0000 0.8750
2.0000 0.6667 2.0000 0.7500 2.0000 0.7778
3.0000 0.5714 3.0000 0.6667 3.0000 0.7000
4.0000 0.5000 4.0000 0.6000 4.0000 0.6364
5.0000 0.4444 5.0000 0.5455 5.0000 0.5833
6.0000 0.4000 6.0000 0.5000 6.0000 0.5385
7.0000 0.3636 7.0000 0.4615 7.0000 0.5000
8.0000 0.3333 8.0000 0.4286 8.0000 0.4667
9.0000 0.3077 9.0000 0.4000 9.0000 0.4375
10.0000 0.2857 10.0000 0.3750 10.0000 0.4118
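The formula behind this table, n / (a + b + c + x) for x = 1..N and each input row, can be checked with a few lines of plain Python. This is an illustrative sketch only; the name rows is hypothetical, and the last column of each input row is assumed to be n, as in the answers above.

```python
M = [[1, 1, 2, 4],
     [1, 3, 2, 6],
     [1, 6, 0, 7]]
N = 10  # number of output rows

rows = []
for x in range(1, N + 1):
    row = []
    for r in M:
        # Pair x with n / (sum of the other columns + x) for this input row.
        row += [x, r[-1] / (sum(r[:-1]) + x)]
    rows.append(row)

for row in rows:
    print(row)
```

Each output row holds one (x, y) pair per input row, giving the same six-column layout as the MATLAB results.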

Multiply one part of a cell array by a scalar in MATLAB

I have a cell array that consists of a set of tracks like this:
<TL1x3> double
<TL1x3> double
<TL3x3> double
...
where TL stands for the track length. This value is different for each element, but there are always three columns: time, x coord, y coord.
From the tracking algorithm I get the x and y coords in pixels. However, I need them in nm, so I have to multiply them by a scale factor, but only the second and third columns of each element, not the first, e.g.:
0 5 6 x2 0 10 12
0.5 7 2 ---> 0.5 14 4
1 8 1 1 16 2
... ...
and this for every element of the array.
With cellfun, I have managed to change every cell of the array, but I don't know how to change only one part. Do you have any idea how to do this...?
You can do this by creating an anonymous function that calls bsxfun() and passing that to cellfun(). Assuming your input data is in the cell array inputData and the scale factor to apply is in the scalar variable scaleFactor:
scaledData = cellfun(@(X) bsxfun(@times, X, [1 scaleFactor scaleFactor]), inputData, 'UniformOutput', false);
I think this gives the results you want
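The same idea, multiplying each row by [1 s s] so that the time column is left alone, can be sketched in plain Python with nested lists standing in for the cell array. The names tracks, scale_factor and scaled are hypothetical, and the sample values reuse the numbers from the question.

```python
# Each track is a list of [time, x, y] rows; tracks may differ in length.
tracks = [
    [[0.0, 5.0, 6.0]],
    [[0.5, 7.0, 2.0]],
    [[0.0, 5.0, 6.0], [0.5, 7.0, 2.0], [1.0, 8.0, 1.0]],
]
scale_factor = 2.0  # e.g. nm per pixel

# Scale only the x and y columns; build new lists so the input is unchanged.
scaled = [[[t, x * scale_factor, y * scale_factor] for t, x, y in track]
          for track in tracks]
print(scaled[2])
```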
Given sample input:
c={[1 2 3]; [4 5 6]; [7 8 9; 10 11 12; 13 14 15]};
Then:
xf = sparse([1 0 0; 0 2 0; 0 0 2]);
d=cellfun(@(x) x * xf, c, 'uniformoutput', false);
It might not be the most elegant nor efficient way, but converting your cell array to a matrix would simplify things for you:
A = {[0 5 6] ;
[0.5 7 2];
[1 8 1 ]}
B = cell2mat(A)
B(:,2:end) = 2*B(:,2:end)
Gives this in the command window:
A =
[1x3 double]
[1x3 double]
[1x3 double]
Before:
B =
0 5.0000 6.0000
0.5000 7.0000 2.0000
1.0000 8.0000 1.0000
After:
B =
0 10.0000 12.0000
0.5000 14.0000 4.0000
1.0000 16.0000 2.0000
You could also create a temporary cell array containing the last 2 columns of your original cell array, apply cellfun to it, and then put it back in the original. Is speed/performance an issue for you?

Insert the mean of each neighbouring value, into the original vector

I have a vector that I want to 'pad' out in MATLAB so that the resultant vector is twice the length, with the extra data being the mean values of the original neighboring values.
eg.
a = [1:10]
b = function of a, where b is now size 20
b = 0.5 1 1.5 2 2.5 3 3.5....... 9.5 10
You could do this in a single line using interpolation (notice that the first element is NaN because it really isn't defined):
interp1(2:2:length(a)*2, a, 1:length(a)*2)
The idea is to have evenly spaced x values (i.e. 2,4,6...) so that you can have single spaced xi values (i.e. 1,2,3,4...) which are thus exactly half way between each x value. Then the linear interpolation of the y points will be their means. If you don't like that NaN in the front which I left in mostly to illustrate the point that it's undefined, you can use the 'extrap' flag in interp1, or (better imo) start your xi from 2:
interp1(2:2:length(a)*2, a, 1:length(a)*2, 'linear', 'extrap')
or
interp1(2:2:length(a)*2, a, 2:length(a)*2)
Otherwise here is a simple vectorized approach:
a = 1:10;
t = [a(1:end-1);a(2:end)];
t(2,:) = mean(t);
b = [t(:); a(end)]
The simplest approach is to use linspace to specify the locations at which you would like to interpolate (and extrapolate) with interp1:
>> a = 1:10;
>> b = interp1(a,linspace(0.5,numel(a),2*numel(a)),'linear','extrap')
b =
Columns 1 through 8
0.5000 1.0000 1.5000 2.0000 2.5000 3.0000 3.5000 4.0000
Columns 9 through 16
4.5000 5.0000 5.5000 6.0000 6.5000 7.0000 7.5000 8.0000
Columns 17 through 20
8.5000 9.0000 9.5000 10.0000
Using 'linear' as the method gives the average of the neighboring values, and 'extrap' says to perform extrapolation (so b(1) does not come out as NaN, but rather 0.5).
It looks like you are assuming the "zeroth" entry is zero so that you get the same number of means as the length of the original vector. You can use
a2 = filter([0.5,0.5],1,a);
to get the vector of means, where the first entry will be the mean of 0 and the first entry in a. Then, you can do whatever you like to interleave the two vectors, e.g.,
b = zeros(2*max(size(a)),1);
b(1:2:end) = a2;
b(2:2:end) = a;
filter is a nifty command, especially for computing discrete convolutions on your original data vector (your neighboring means are a very simple example of a convolution). It also works on matrices either row-by-row or column-by-column.
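The filter-and-interleave steps can be illustrated with a short plain-Python sketch. It assumes, as this answer does, that the value before the first element is zero; the names prev, means and b are hypothetical.

```python
a = list(range(1, 11))  # 1, 2, ..., 10

# Mean of each element with its predecessor, taking the "zeroth" entry as 0.
prev = [0] + a[:-1]
means = [(p + c) / 2 for p, c in zip(prev, a)]

# Interleave: mean, original, mean, original, ...
b = [v for pair in zip(means, a) for v in pair]
print(b)
```

The result has twice the length of a and starts 0.5, 1, 1.5, 2, matching the vector requested in the question.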
