Fill gaps of matrix that has irregular steps - arrays

I am trying to perform a calculation using two different matrices, but they have come in slightly different forms.
One matrix (for interest's sake) is filled with reflectance values of a material for wavelengths from 200nm to 2600nm, so each individual wavelength, in increments of 1nm, has a reflectance value.
The second matrix is a solar energy matrix which stores the amount of energy that is present at each wavelength. This one, however, has irregular steps and ranges from 280nm to 4000nm: from 280nm-400nm it is in steps of 0.5nm, from 400nm-1705nm it is in steps of 1nm, and from 1750nm-4000nm it is in steps of 5nm.
What I have been trying to do, unsuccessfully thus far, is to edit this solar energy matrix so that it gives the entire range in steps of 1nm.
filename='H:\I_sol data.csv';
Dataisol = csvread(filename,1,0);
for j=1:1:count
    if Dataisol(j,:)~=Dataisol(j+1,:)-1 %compare the wavelength to the value of the next wavelength
        newx=(Dataisol(j,:)+[1,0]) %if the next wavelength is not 1 larger than the previous, add a new row
        newx(1,2)=NaN %make the new row to add blank
        Dataisol=insertrows(Dataisol, newrow, j+1) %insert the new blank row
    end
end
Above is what I have started with. At the moment I am just trying to fill the gaps by adding new rows where there is a 5nm jump between wavelengths. Once I am able to create the missing elements, I will turn my attention to populating them with the correct values (probably the midpoint between the 2 given values).
My end goal is to trim both matrices so that they have the same starting and ending wavelength and both have increments of 1nm throughout (mentioned for interest's sake, or for advice if this is trivial for someone). If anyone knows how to fill these gaps or make the necessary changes to the matrix, it would be a great help!
Example of the csv file:
Wvlgth nm Etr W*m-2*nm-1
280.0 8.2000E-02
280.5 9.9000E-02
281.0 1.5000E-01
281.5 2.1200E-01
282.0 2.6700E-01
282.5 3.0300E-01
283.0 3.2500E-01
283.5 3.2300E-01
284.0 2.9900E-01
284.5 2.5024E-01
285.0 1.7589E-01
285.5 1.5500E-01
286.0 2.4200E-01
... .....
428.0 1.6510E+00
429.0 1.5230E+00
430.0 1.2120E+00
431.0 1.0990E+00
432.0 1.8220E+00
433.0 1.6913E+00
434.0 1.5600E+00
435.0 1.7090E+00
436.0 1.8680E+00
437.0 1.9000E+00
438.0 1.6630E+00
439.0 1.6010E+00
440.0 1.8300E+00
.... .....
2205.0 8.0900E-02
2210.0 8.0810E-02
2215.0 8.0410E-02
2220.0 7.9990E-02
2225.0 7.8840E-02
2230.0 7.8400E-02
2235.0 7.7930E-02
2240.0 7.6510E-02
2245.0 7.6250E-02
2250.0 7.5370E-02
... .....
Here is the code I use for assigning the variables to be used in the interp1 function, which is called as follows:
solx=Dataisol(:,1);
soly=Dataisol(:,2);
xi=280:1:2600;
newsol = [xi interp1(solx,soly,xi,'linear','extrap')];
The values that are stored in these variables as well as the error I am receiving are given below:

The function you need here is interp1. Set xi to be a vector of all the wavelengths you want to consider, say xi=280:1:2600;.
If wavelength is a vector of all your irregular values from the file, and sol is the corresponding vector of all the solar energies (you can use column references for your single matrix here as well), then
newsol = [xi' interp1(wavelength,sol,xi','linear','extrap')];
will give you a new matrix with wavelengths increasing by 1 in column 1, while column 2 will contain values directly from your file where they exist and linearly interpolated values where they do not. Note the transposes: xi is a row vector, so it has to be turned into a column for the two results to sit side by side as columns.
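To make it clearer what the linear interpolation is doing, here is a rough Python sketch of the same idea. It is not MATLAB's interp1, just a small hand-written stand-in: the helper name interp_linear is made up for illustration, and it assumes the wavelengths are strictly increasing. The three data rows are taken from the 5nm part of the sample file in the question.

```python
from bisect import bisect_left

def interp_linear(x, y, xi):
    """Linearly interpolate the samples (x, y) at every point in xi.

    x must be strictly increasing. Points outside [x[0], x[-1]] are
    extrapolated along the first or last segment.
    """
    result = []
    for v in xi:
        # Index of the right-hand neighbour of v, clamped to a valid segment.
        k = min(max(bisect_left(x, v), 1), len(x) - 1)
        x0, x1 = x[k - 1], x[k]
        y0, y1 = y[k - 1], y[k]
        result.append(y0 + (y1 - y0) * (v - x0) / (x1 - x0))
    return result

# Three rows from the 5 nm part of the sample file.
wavelength = [2205.0, 2210.0, 2215.0]
sol = [8.0900e-02, 8.0810e-02, 8.0410e-02]

xi = list(range(2205, 2216))  # 2205, 2206, ..., 2215 in steps of 1 nm
# One (wavelength, energy) pair per row, like the two-column matrix above.
newsol = list(zip(xi, interp_linear(wavelength, sol, xi)))
```

Wavelengths that already exist in the file keep their original value (up to floating-point rounding), and the ones in between lie on the straight line joining their two neighbours.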

Related

How to compute an additional variable data array from the values of other variables

I have an oceanic weather dataset over three dimensions (time, x, y) which includes two data arrays with two different variables (Hs, Te).
I want to compute a third data array (power generated) based on the values of the other two data arrays, but I cannot do it arithmetically. I have a 2D pandas dataframe (power matrix) which gives the power output depending on the combination of Hs and Te.
[enter image description here][1]
I have tried to pass the Hs and Te data arrays to loc and iloc, but they do not take multidimensional indexes.
Is there a way I can use Hs, and Te, for every given timestep and coordinates, to calculate the power output from the matrix and assign it to the third data array?
I would really appreciate it if someone could help me!
Here is a bit of what I have tried.
def get_gen(ds, power_matrix):
    Hs = ds['wave_height']
    Te = ds['wave_period']
    power_mat = power_matrix
    power = power_mat.loc[[Hs], [Te]]
    return power

test = get_gen(ds, power_matrix)
[1]: https://i.stack.imgur.com/sp5FU.png

Randomize matrix elements between two values while keeping row and column sums fixed (MATLAB)

I have a bit of a technical issue, but I feel like it should be possible with MATLAB's powerful toolset.
What I have is a random n by n matrix of 0's and w's, say generated with
A=w*(rand(n,n)<p);
A typical value of w would be 3000, but that should not matter too much.
Now, this matrix has two important quantities, the vectors
c = sum(A,1);
r = sum(A,2)';
These are two row vectors, the first denotes the sum of each column and the second the sum of each row.
What I want to do next is randomize each value of w, for example between 0.5 and 2. This I would do as
rand_M = (2-0.5).*rand(n,n) + 0.5;
A_rand = rand_M.*A;
However, I don't want to just pick these random numbers: I want them to be such that for every column and row, the sums are still equal to the elements of c and r. So to clean up the notation a bit, say we define
A_rand_c = sum(A_rand,1);
A_rand_r = sum(A_rand,2)';
I want that for all j = 1:n, A_rand_c(j) = c(j) and A_rand_r(j) = r(j).
What I'm looking for is a way to redraw the elements of rand_M in a sort of algorithmic fashion I suppose, so that these demands are finally satisfied.
Now of course, unless I have infinite amounts of time this might not really happen. I therefore accept these quantities to fall into a specific range: A_rand_c(j) has to be an element of [(1-e)*c(j),(1+e)*c(j)] and A_rand_r(j) of [(1-e)*r(j),(1+e)*r(j)]. This e I define beforehand, say like 0.001 or something.
Would anyone be able to help me in the process of finding a way to do this? I've tried an approach where I just randomly repick the numbers, but this really isn't getting me anywhere. It does not have to be crazy efficient either, I just need it to work in finite time for networks of size, say, n = 50.
To be clear, the final output is the matrix A_rand that satisfies these constraints.
Edit:
Alright, so after thinking a bit I suppose it might be doable with some while statement that goes through every element of the matrix. The difficult part is that there are four possibilities: if you are in a specific element A_rand(i,j), it could be that A_rand_c(j) and A_rand_r(i) are both too small, both too large, or opposite. The first two cases are good, because then you can just redraw the random number until it moves in the right direction and improves the situation. But the other two cases are problematic, as you will improve one situation but not the other. I guess it would have to look at which criterion is less satisfied, so that it tries to fix the one that is worse. But this is not trivial, I would say.
You can take advantage of the fact that rows/columns with a single non-zero entry in A automatically give you results for that same entry in A_rand. If A(2,5) = w and it is the only non-zero entry in its column, then A_rand(2,5) = w as well. What else could it be?
You can alternate between finding these single-entry rows/cols, and assigning random numbers to entries where the value doesn't matter.
Here's a skeleton for the process:
A_rand = zeros(size(A));   % the matrix you are going to fill
entries_left = A>0;        % logical matrix showing which entries in A_rand you still need to fill
col_totals = sum(A,1);     % the amount you still need to add in every column of A_rand
row_totals = sum(A,2);     % the amount you still need to add in every row of A_rand
while sum( entries_left(:) ) > 0
    % STEP 1:
    % function to fill entries in A_rand if entries_left has rows/cols with one nonzero entry
    % you will need to keep looping over this function until nothing changes
    % update() A_rand, entries_left, row_totals, col_totals every time you loop
    % STEP 2:
    % let (i,j) be the indices of the next non-zero entry in entries_left
    % assign a random number to A_rand(i,j) <= col_totals(j) and <= row_totals(i)
    % update() A_rand, entries_left, row_totals, col_totals
end
update()
    A_rand(i,j) = random_value;
    entries_left(i,j) = 0;
    col_totals(j) = col_totals(j) - random_value;
    row_totals(i) = row_totals(i) - random_value;
end
Picking the range for random_value might be a little tricky. The best I can think of is to draw it from a relatively narrow distribution centered around n*w*p, where p is the probability of an entry in A being nonzero (this would be the average value of row/column totals).
This doesn't scale well to large matrices as it will grow with n^2 complexity. I tested it for a 200 by 200 matrix and it worked in about 20 seconds.
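As an illustration of the bookkeeping only, here is a small Python sketch of update() and of STEP 1 (filling entries that are the only unfilled one left in their row or column). It is a sketch under assumptions, not the tested code mentioned above: the dictionary layout and the helper names make_state and fill_forced_entries are invented for this example, and the random STEP 2 is left out on purpose.

```python
def make_state(A):
    """Set up the four pieces of state used by the skeleton."""
    return {
        "A_rand": [[0] * len(row) for row in A],
        "entries_left": [[v > 0 for v in row] for row in A],
        "row_totals": [sum(row) for row in A],
        "col_totals": [sum(col) for col in zip(*A)],
    }

def update(state, i, j, value):
    """Record value at (i, j) and subtract it from the remaining totals."""
    state["A_rand"][i][j] = value
    state["entries_left"][i][j] = False
    state["row_totals"][i] -= value
    state["col_totals"][j] -= value

def fill_forced_entries(state):
    """STEP 1: fill entries that are the only unfilled one in their row or column."""
    left = state["entries_left"]
    n_rows, n_cols = len(left), len(left[0])
    changed = True
    while changed:  # keep looping until nothing changes
        changed = False
        for i in range(n_rows):
            cols = [j for j in range(n_cols) if left[i][j]]
            if len(cols) == 1:  # the row's remaining total has to go here
                update(state, i, cols[0], state["row_totals"][i])
                changed = True
        for j in range(n_cols):
            rows = [i for i in range(n_rows) if left[i][j]]
            if len(rows) == 1:  # the column's remaining total has to go here
                update(state, rows[0], j, state["col_totals"][j])
                changed = True

w = 3000
A = [[w, 0],
     [w, w]]
state = make_state(A)
fill_forced_entries(state)
```

In this tiny example every entry ends up forced: row 1 has a single nonzero entry, which in turn leaves a single unfilled entry in each column, so A_rand comes out equal to A. In a denser matrix STEP 1 would stop early and the random STEP 2 would have to take over.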

Split array into smaller unequal-sized arrays dependent on array-column values

I'm quite new to MATLAB and this problem is really driving me insane:
I have a huge array of 2 columns and about 31,000 rows. One of the two columns holds a spatial coordinate on a grid, the other one a dependent parameter. What I want to do is the following:
I. I need to split the array into smaller parts defined by the spatial column; let's say the spatial coordinates range from 0 to 500 - I now want arrays that give me the two column values for spatial coordinate 0-10, then 10-20 and so on. This would result in 50 arrays of unequal size that cover a spatial range from 0 to 500.
II. Secondly, I would need to calculate the average values of the resulting columns of every single array so that I obtain per array one 2-dimensional point.
III. Thirdly, I could plot these points and I would be super happy.
Sadly, I'm super confused since I miserably fail at step I. - Maybe there is even an easier way than to split the giant array in so many small arrays - who knows..
I would be really really happy for any suggestion.
Thank you,
Arne
First of all, since you want a data structure holding arrays of different sizes, you will need to place them in a cell array. You could try something like this:
res = arrayfun(@(x)arr(arr(:,1)==x,:), unique(arr(:,1)), 'UniformOutput', 0);
This returns a cell array with the array split according to the values in its first column. Here @(x)arr(arr(:,1)==x,:) defines an anonymous function of x, and arrayfun(function, ..., 'UniformOutput', 0) applies that function to each element of the following arguments (taking a single value of each argument to evaluate the function). Note that arr must be numeric; if it is not, you should map your values to numeric values or use another way to select them.
In the same way you could do
uo = 'UniformOutput';
res = arrayfun(@(x){arr(arr(:,1)==x,:), mean(arr(arr(:,1)==x,2))}, unique(arr(:,1)), uo, 0);
You will probably want to flatten the returned value; check the function cat. You could do:
res = cat(1,res{:})
How to plot your data depends on its format, so I can't help without knowing what the data look like, but you could try plotting inside a loop over your 'res' variable or something similar.
Step I indeed comes with some difficulties. Once these are solved, I guess steps II and III can easily be solved. Let me make some suggestions for step I:
You first define the maximum value (maxValue = 500;) and the step size (stepSize = 10;). Now it is possible to iterate through all steps and create your new vectors.
for k=1:maxValue/stepSize
...
end
As every resulting array will have different dimensions, I suggest you save the vectors in a cell array:
Y = cell(maxValue/stepSize,1);
Use the find function to find the rows of the entries for each matrix. At each step k, the range of values of interest will be (k-1)*stepSize to k*stepSize.
row = find( (k-1)*stepSize <= X(:,1) & X(:,1) < k*stepSize );
You can now create the matrix for step k by
Y{k,1} = X(row,:);
Putting everything together you should be able to create the cell array Y containing your matrices and continue with the other tasks. You could also save the average of each value range in a second column of the cell array Y:
Y{k,2} = mean( Y{k,1}(:,2) );
I hope this helps you with your task. Note that these are only suggestions and there may be different (maybe more appropriate) ways to handle this.
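For comparison, the same split-then-average idea can be sketched in Python. This is only an illustration under assumptions: the function name bin_means and the toy data are made up, the bins are half-open like the find condition above (so a coordinate of exactly 500 would fall outside the last bin), and an empty bin is reported as None rather than a mean.

```python
def bin_means(X, max_value=500, step_size=10):
    """Group the rows of X by spatial range and average both columns per group.

    X is a list of (coordinate, parameter) pairs. Bin k (counting from 0)
    holds the rows with k*step_size <= coordinate < (k+1)*step_size.
    """
    n_bins = max_value // step_size
    Y = [[] for _ in range(n_bins)]
    for coord, param in X:
        k = int(coord // step_size)
        if 0 <= k < n_bins:  # rows outside [0, max_value) are skipped
            Y[k].append((coord, param))
    means = []
    for rows in Y:
        if rows:
            mean_coord = sum(r[0] for r in rows) / len(rows)
            mean_param = sum(r[1] for r in rows) / len(rows)
            means.append((mean_coord, mean_param))
        else:
            means.append(None)  # an empty bin has no mean
    return Y, means

# Toy data: (spatial coordinate, dependent parameter)
X = [(1.0, 2.0), (4.0, 4.0), (12.0, 10.0), (19.0, 20.0), (15.0, 30.0), (499.0, 7.0)]
Y, means = bin_means(X)
```

Each non-empty entry of means is one 2-dimensional point per bin, which is what steps II and III of the question need for plotting.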

R fill matrix or array with conditional lagged calculation in for loop

I've dug through the list archive, and either I don't know the right words to ask this question or this hasn't come up before--
I have a simulation function where I track a list of points over time, and want to introduce an extra lagged calculation based on an assignment. I've created a very simple bit of code to understand how R fills in a matrix:
t<-21 #time step
N<-10 #points to track
#creating a matrix where it's easy for me to see how the calculation is done
NEE<-rep(NA, (t+1)*N);dim(NEE)<-c(N,(t+1))
for(i in 1:t){
NEE[,1]<-1
NEE[,i+1]<-NEE[,i]+5
}
#the thing to calculate
gt<-rep(0, (t+1)*N);dim(gt)<-c(N,(t+1))
#assigned states
veg<-c(rep(0,5), rep(1,5))
veg.com<-rep(veg, t);dim(veg.com)<-c(N,t)
for (i in 1:t){
gt[,i+1]<-ifelse(veg.com[,i]==0, NEE[,i]/5, NEE[,i-3]/5)
}
#to have a view of what happens
veg1<-gt[1,]*5 #assignment for veg.com==0
veg2<-gt[10,]*5 #assignment for veg.com==1
what<-cbind(NEE[1,], veg1,veg2)
what
Of course it works, except for how it fills in the first bit (shown here as the first 4 values in veg2 of what) before the lag is in effect when veg.com==1. I'm sure there are workarounds, but I first simply want to understand what R is doing in those initial few loops.
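The first two times through that second for-loop you will be using negative indexing with the expression
NEE[ , i-3]
On the first pass i-3 is -2, so the expression returns NEE with its 2nd column removed: still 10 rows, but 21 columns instead of the single column you intended. On the second pass i-3 is -1, which returns NEE with its first column removed. Negative indices remove portions of a matrix or data frame in R. On the third pass i-3 is 0, and a zero index selects no columns at all, so it is only from i = 4 onwards that NEE[ , i-3] picks out a single lagged column.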

Splitting an array into n parts and then joining them again forming a histogram

I am new to Matlab.
Let's say I have an array a = [1:1:1000]
I have to divide this into 50 parts 1-20; 21-40 .... 981-1000.
I am trying to do it this way.
E=1000X
a=[1:E]
n=50
d=E/n
b=[]
for i=0:n
b(i)=a[i:d]
end
But I am unable to get the result.
The second part I am working on is this: depending on another result, the matching split array should have its counter incremented. Say my answer is 3, then the first split array's counter should be +1; if the answer is 45, the 3rd split array's counter should be +1, and so on. In the end I have to make a histogram of all the counters.
You can do all of this with one function: histc. In your situation:
X = (1:1:1000)';
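Edges = (1:20:1001)';
Count = histc(X, Edges);
Essentially, Count contains the number of elements in X that fall into the categories defined in Edges, where Edges is a monotonically increasing vector whose elements define the boundaries of sequential categories: bin k counts the values with Edges(k) <= x < Edges(k+1), and the final element of Count counts only values exactly equal to the last edge. That is why Edges runs to 1001 here: it gives 50 bins of 20 values each, plus a last element that is 0. A more common example might be to construct X using a probability density, say, the uniform distribution, e.g.: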
X = 1000 * rand(1000, 1);
Play around with specifications for X and Edges and you should get the idea. If you want the actual histogram plot, look into the hist function.
As for the second part of your question, I'm not really sure what you're asking.
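In case it helps, here is how the counting could look as a plain Python sketch. It reads the second part of the question as "each result increments the counter of the range it falls into", which is only a guess at what was meant. The function name bin_counts and its parameters are made up for this example.

```python
def bin_counts(values, first=1, width=20, n_bins=50):
    """Count how many values fall into each of n_bins consecutive ranges.

    Bin 0 covers first .. first+width-1, bin 1 the next width values, and so on.
    """
    counts = [0] * n_bins
    for v in values:
        k = (v - first) // width  # 0-based index of the range containing v
        if 0 <= k < n_bins:       # values outside the covered range are ignored
            counts[k] += 1
    return counts

# Splitting 1..1000 into 50 parts: every part receives 20 values.
counts_all = bin_counts(range(1, 1001))

# Counters driven by other results: 3 lands in the 1st part, 45 in the 3rd.
counts_some = bin_counts([3, 45, 45, 1000])
```

The list of counters is exactly the data a histogram plot needs: one bar per part, with the counter as its height.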

Resources