I have a set of data: say, daily temperature on an n x m grid (n: latitude, m: longitude) covering the whole world for one month. However, the temperature at my location of interest is not correct, so I need to update it. In other words, I have to change the data at certain grid points for every (daily) time step. I attach here a simple example. Let's say each 1x2 matrix on the left is the correct data, while each 6x4 matrix contains some incorrect data (6: latitude, 4: longitude). What I need is to copy the correct data from the left into the matrices on the right, at the positions marked in the same color, for every time step.
Could anyone help me?
Many thanks
For example this data:
A=rand(4,2)
B=rand(6,4,4)
You would want these values to be replaced by A:
B(3,2:3,:)
Just make sure the size is the same
size(B(3,2:3,:))
> 1 2 4
A=reshape(A',[1 2 4])
And you can put it there
B(3,2:3,:)=A
[edit] Sorry, I probably just don't see the problem.
T = randi(255,[1E3,1E3,31],'uint8'); %1000 longitude, 1000 latitude, 31 days
C = repmat([50,100],[31,1,1]); %correction for 31 days and two locations. must become 50 and 100.
%location 20,10 and 20,11 must change.
T(20,10:11,:)=reshape(C',[1 2 31]);
T(20,10,3) %test for third day.
>> 50
T(20,11,10) %test for tenth day.
>> 100
The replacement takes 0.000365 seconds on my PC.
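If the points to fix are not adjacent, the same idea can be applied one grid point at a time. A minimal sketch, assuming made-up index vectors rows and cols and a made-up correction matrix A with one row per day and one column per point (none of these names come from the question):

```matlab
% Hypothetical small case: 6x4 grid, 3 days, two scattered grid points.
B = zeros(6, 4, 3);
rows = [3 5];                 % latitude indices of the points to fix (assumed)
cols = [2 4];                 % longitude indices of the points to fix (assumed)
A = [10 20; 11 21; 12 22];    % corrections: one row per day, one column per point
nDays = size(B, 3);
for k = 1:numel(rows)
    % Replace the whole time series at one grid point.
    B(rows(k), cols(k), :) = reshape(A(:, k), 1, 1, nDays);
end
```

After the loop, B(3,2,:) holds the first column of A and B(5,4,:) the second; all other entries are untouched.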
Related
I have an NxMxT array where each element is one grid cell of the Earth. If the cell is over the ocean, the value is 999. If the cell is over land, it contains an observed value. N is longitude, M is latitude, and T is months.
In particular, I have an array called tmp60 for the ten years 1960 through 1969, so 120 months for each grid.
To test what the global mean in January 1960 was, I write:
tmpJan60=tmp60(:,:,1);
tmpJan60(tmpJan60(:,:)>200)=NaN;
nanmean(nanmean(tmpJan60))
which gives me 5.855.
I am confused about the reshape function. I thought the following code should yield the same average, namely 5.855, but it does not:
load tmp60
N1=size(tmp60,1)
N2=size(tmp60,2)
N3=size(tmp60,3)
reshtmp60 = reshape(tmp60, N1*N2,N3);
reshtmp60( reshtmp60(:,1)>200,: )=[];
mean(reshtmp60(:,1))
this gives me -1.6265, which is not correct.
I have checked the result in Excel (!) and 5.855 is correct, so I assume I am making a mistake with the reshape function.
Ideally, I want a matrix that takes each grid cell, going first down the N dimension, and builds 720 rows with 120 columns (each column is a month). These first 720 rows represent one longitude band around the Earth at a single latitude. Next, I want to increase the latitude by one step, giving another 720 rows with 120 columns. Ultimately I want to do this for all 360 latitudes.
If longitude and latitude were inputs, say column 1 and 2, then the matrix should look like this:
temp = [-179.75 -89.75 -1 2 ...
-179.25 -89.75 2 4 ...
...
179.75 -89.75 5 9 ...
-179.75 -89.25 2 5 ...
-179.25 -89.25 3 4 ...
...
-179.75 89.75 2 3 ...
...
179.75 89.75 6 9 ...]
So temp(:,3) should be all January 1960 observations.
One way to do this is:
grid1 = tmp60(1,1,:);
g1 = reshape(grid1, [1,120]);
grid2 = tmp60(2,1,:);
g2 = reshape(grid2,[1,120]);
g = [g1;g2];
But obviously very cumbersome.
I am not able to automate this procedure for the N*M elements, so comments are appreciated!
A link to the file tmp60.mat
The main problem in your code is how the NaNs are treated. Observe the following example:
a = randi(10,6);
a(a>7)=nan
m = [mean(a(:),'omitnan') mean(mean(a,'omitnan'),'omitnan')]
m =
3.8421 3.6806
Both elements in m are supposed to be the mean of all elements in a, yet they are different! The reason is that taking the mean of all values together, with mean(a(:),'omitnan'), is like summing all non-NaN values and dividing by the number of values we summed:
sum(a(:),'omitnan')/sum(~isnan(a(:)))==mean(a(:),'omitnan') % this is true
but taking the mean of the first dimension, we get 6 mean values:
sum(a,'omitnan')./sum(~isnan(a))==mean(a,'omitnan') % this is also true
and when we take the mean of those column means, each column gets the same weight no matter how many NaNs were omitted from it, so the result generally differs from the overall mean:
mean(sum(a,'omitnan')./sum(~isnan(a)))==mean(a(:),'omitnan') % this is false
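A small deterministic case may make the difference easier to see. This is only an illustration with a made-up 2x2 matrix, not your data:

```matlab
a = [1 NaN; 3 5];
overall     = mean(a(:), 'omitnan');   % (1 + 3 + 5) / 3
colMeans    = mean(a, 'omitnan');      % first column: (1+3)/2, second column: 5/1
meanOfMeans = mean(colMeans);          % both columns weighted equally
```

The second column contributes one value to the overall mean but half of the weight in the mean of means, which is why the two numbers disagree.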
Here is what I think you want in your code:
% this is exactly as your first test:
tmpJan60=tmn60(:,:,1);
tmpJan60(tmpJan60>200) = nan;
m1 = mean(mean(tmpJan60,'omitnan'),'omitnan')
% this creates the matrix as you want it:
result = reshape(permute(tmn60,[3 1 2]),120,[]).';
result(result>200) = nan;
r = reshape(result(:,1),720,360);
m2 = mean(mean(r,'omitnan'),'omitnan')
isequal(m1,m2)
To create the matrix you first permute the dimensions so the one you want to keep as is (time) will be the first. Then reshape the array to Tx(lon*lat), so you get 120 rows for all time steps and 259200 columns for all combinations of the coordinates. All that's left is to transpose it.
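The row order produced by permute and reshape can be checked on a tiny array. A sketch, assuming a made-up 3 (longitude) x 2 (latitude) x 4 (months) array X in place of the real data:

```matlab
% Hypothetical toy array: X(i,j,t) = i + 3*(j-1) + 6*(t-1)
X = reshape(1:24, 3, 2, 4);
result = reshape(permute(X, [3 1 2]), size(X, 3), []).';

% Row 5 should be the time series of grid cell (lon = 2, lat = 2),
% because longitude varies fastest along the rows.
[lon, lat] = ind2sub([3 2], 5);
series = squeeze(X(lon, lat, :)).';
sameSeries = isequal(result(5, :), series);

% Compare against a plain reshape of the original array.
samePlain = isequal(result, reshape(X, [], size(X, 3)));
```

On this toy array both checks come out true, so the plain reshape from the question already gives the same layout; that suggests the differing averages come from the NaN handling rather than from the reshape itself.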
m1 is your first calculation, and m2 is what you try to do in the second one. They are equal here, but their value is not 5.855, even if I use your code.
However, I think the right solution will be to take the mean of all values together:
mean(result(:,1),'omitnan')
I have a matrix and I am trying to find where a given value occurs, so I am using find(x==y) to build index vectors, for example:
n11=find(x==11)
n4=find(x==4)
n8=find(x==8)
And n11, n4, n8 are not of the same length.
Sometimes I have to do this 20 or 30 times for 20 or 30 different values, so if, for example, I want every value in the interval x∈[1991,2015], from find(x==1991) to find(x==2015), how can I get those results faster without writing out
find(x==1991)
.
.
.
find(x==2015)
thank you
You can use logical indexing:
n= find(x>=1991 & x<=2015)
EDIT
meshgrid can be used to obtain a vector for each year:
x= [1991 1992 1991 2015 2016 1992 1988 1994]; % example data
[m,n]= meshgrid(x,1991:2015); % the second argument contains the years we need
n= (m==n);
Now n(1,:) is equal to x==1991, n(2,:) is equal to x==1992 etc; find(n(1,:)) equals find(x==1991) etc.
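You can store the results in a matrix and use a for loop to do this automatically.
start=1991;
endi=2015;
mat=zeros(endi-start+1,0); % one row per value; columns grow as needed
for i=start:endi
idx=find(x==i); % indices where x equals the current value
mat(i-start+1,1:numel(idx))=idx; % numel works for row and column results
end
Each row holds the indices for one value; the trailing zeros are padding and should be ignored.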
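Because the index vectors have different lengths, a cell array avoids the zero padding. A minimal sketch using arrayfun on the example data from the earlier answer (the names years and idx are just for illustration):

```matlab
x = [1991 1992 1991 2015 2016 1992 1988 1994];   % example data
years = 1991:2015;
% One cell per year; each cell holds the indices where x equals that year.
idx = arrayfun(@(y) find(x == y), years, 'UniformOutput', false);
n1991 = idx{years == 1991};   % same as find(x == 1991)
```

Years that never occur simply get an empty cell.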
I'm pretty new at coding with Matlab and I'm struggling with an issue I can't fix.
Basically I have half-hourly data (48 values per day) covering 17 days (17x48=816 elements).
I have all my data in a big matrix (816 x 31) and I need to separate some "day time data" from "night time data".
The elements of the column array (816 elements) I need to process are the following (for the first day):
night_data= bigmatrix([1:8,46:48],27);
day_data= bigmatrix([22:32],27)
but I have to make the same "selection" for each day, i.e. the next day would be
night_data_2 = bigmatrix([49:56,94:96],27)
day_data_2 = bigmatrix([70:80],27)
and so on...
How can I do this? Should I use a loop? Is there an indexing function I don't know about that could help me?
Thank you in advance.
You can reshape your data so that each column represents one day. That would give you a 48 x 17 x 31 matrix:
dailymatrix = reshape(bigmatrix, 48, 17, 31);
Now, to access the data you've got one new subscript. Your first night/day data would change to
night_data = dailymatrix([1:8, 46:48], 1, 27); % second subscript 1 = 1st day
day_data = dailymatrix([22:32], 1, 27);
The second day's data would be:
night_data = dailymatrix([1:8, 46:48], 2, 27); % second subscript 2 = 2nd day
day_data = dailymatrix([22:32], 2, 27);
To get all 17 days' worth of data,
night_data = dailymatrix([1:8, 46:48], :, 27);
day_data = dailymatrix([22:32], :, 27);
Since the data is in the same timeslots each day, you never have to change the first subscript.
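To convince yourself that the reshape lines the days up correctly, you can compare it against direct indexing into the original matrix. A sketch with placeholder numbers standing in for the real bigmatrix:

```matlab
bigmatrix = reshape(1:816*31, 816, 31);        % placeholder data, 816 x 31
dailymatrix = reshape(bigmatrix, 48, 17, 31);  % half-hour slot x day x variable

night_rows = [1:8, 46:48];
% Second day via the original layout: shift the row indices by 48.
day2_direct = bigmatrix(night_rows + 48, 27);
% Second day via the reshaped layout: just change the day subscript.
day2_reshaped = dailymatrix(night_rows, 2, 27);
same = isequal(day2_direct, day2_reshaped);

all_nights = dailymatrix(night_rows, :, 27);   % 11 slots x 17 days
```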
You can use variables in the indices of your matrix and wrap this in a loop with dynamic field names, where n is the zero-based day number (n = 0 for the first day):
night_data.(strcat('night',int2str(n)))=bigmatrix([1+n*48:8+n*48, 46+n*48:48+n*48],27);
This creates a structure with fields named night0, night1, and so on, up to the last day you need. The same can be done for the day data.
However, you should be using date indexing with table variables in matlab. Once you convert your date column to a datetime object,
bigmatrix.Date=datetime(bigmatrix.Date)
you can basically do something like the following.
night_data_1=bigmatrix(hour(bigmatrix.Date)>=22 | hour(bigmatrix.Date)<8 ,27)
which will be able to index all data points between 10 PM and 8 AM (or whatever your day-night cycle cutoff is).
There is an integer array, for eg.
{3,1,2,7,5,6}
One can move forward through the array either one element at a time or by jumping ahead by the value stored at the current index. For example, from 3 one can go to 1 or to 7; from 1 one can only go to 2 (stepping and jumping coincide here); from 2 one can go to 7 or to 5; and from 7 one can only go to 5, because the index of 7 is 3, 3 + 7 = 10, and there is no tenth element.
I have to only count the number of possible paths to reach the end of the array from start index.
I could only do it recursively and naively, which runs in exponential time.
Could somebody please help?
My recommendation: use dynamic programming.
If this keyword is sufficient and you want the challenge of finding a solution on your own, don't read any further!
Here is a possible DP algorithm for the example input {3,1,2,7,5,6}. It will be your job to adapt it to the general problem.
Create an array sol of length 6 filled with zeros; sol[i] will hold the number of ways to reach the end from index i.
sol[5] = 1; // one way to be at the end: stay there
for (i = 4; i >= 0; i--) {
sol[i] = sol[i+1]; // step to the next element
if (i+input[i] < 6 && input[i] != 1) // jump, if it stays in bounds and differs from the step
sol[i] += sol[i+input[i]];
}
return sol[0];
Runtime: O(n).
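One way to adapt this to an array of any length, sketched in MATLAB to match the other code on this page. It assumes all values are positive integers, and note that MATLAB indices start at 1, so the bounds check shifts accordingly:

```matlab
input = [3 1 2 7 5 6];   % example input; any vector of positive integers
n = numel(input);
sol = zeros(1, n);       % sol(i): number of ways to reach the end from index i
sol(n) = 1;
for i = n-1:-1:1
    sol(i) = sol(i+1);                          % single step forward
    if input(i) ~= 1 && i + input(i) <= n       % jump, if distinct and in bounds
        sol(i) = sol(i) + sol(i + input(i));
    end
end
numPaths = sol(1);
```

For the example input the three paths are 3-1-2-7-5-6, 3-1-2-5-6, and 3-7-5-6.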
As for the directed graph solution hinted at in the comments:
Each cell in the array represents a node. Make a directed edge from each node to every node reachable from it. Because the graph has no directed cycle, you can then count the paths by processing the nodes in reverse order and summing over each node's outgoing edges; however, it takes a lot of boilerplate to actually program it.
Adjusting the recursive solution
Another solution is pruning (memoization), which is basically equivalent to the DP algorithm. The exponential time comes from the fact that you calculate the same values several times. Say the function is recFunc(index). The initial call recFunc(0) calls recFunc(1) and recFunc(3), and so on. However, recFunc(3) is bound to be called again later, which leads to a repeated recursive calculation. To prune this, add a map that holds all already calculated values. When you call recFunc(x), look up in the map whether x was already calculated. If yes, return the stored value. If not, calculate, store, and return it. This way you get O(n) too.
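A possible sketch of the memoized recursion, written in MATLAB as a function file (countPaths.m) and using 1-based indices. The wrapper name countPaths is made up for this example, and positive integer values are assumed. containers.Map is a handle object, so every recursive call shares the same cache:

```matlab
function n = countPaths(input)
    % Cache: index -> number of ways to reach the end from that index.
    memo = containers.Map('KeyType', 'double', 'ValueType', 'double');
    n = recFunc(1, input, memo);
end

function n = recFunc(i, input, memo)
    if i == numel(input)      % already at the end: exactly one way
        n = 1;
        return;
    end
    if isKey(memo, i)         % value was computed before: reuse it
        n = memo(i);
        return;
    end
    n = recFunc(i + 1, input, memo);            % single step forward
    j = i + input(i);
    if input(i) ~= 1 && j <= numel(input)       % jump, if distinct and in bounds
        n = n + recFunc(j, input, memo);
    end
    memo(i) = n;              % store before returning
end
```

Calling countPaths([3 1 2 7 5 6]) computes each index only once instead of re-exploring shared subpaths.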
I'm trying to create a script that reads data from a text file, and plots the data onto a scatter plot.
For example, say the file name is prices.txt and contains:
Pens 2 4
Pencils 1.5 3
Rulers 3 3.5
Sharpeners 1 3
Highlighters 3 4
Where columns 2 and 3 are prices of the items for two different stores.
What my script should do is read the prices, calculate (using another function) future prices for the stores, and plot these prices on a scatter plot where x is one store and y is another. This is a silly example, I know, but it fits the description.
Don't worry too much about the other function that does the calculation; just assume it does what it's supposed to.
Basically, I've come up with the following:
pricesfile = fopen('Prices.txt');
prices = textscan(pricesfile, '%s %d d');
fclose(pricesfile);
count = 1;
while count <= length(prices{1})
for item = constants{1}
name = constants{1}{count};
store_A = prices{2}{count};
store_B = prices{3}{count};
(...other function goes here...)
end
end
After doing this I'm completely stuck. My thought process behind this was to go through each item name, and create a vector that's assigned to this name with its two corresponding prices as items in the vector eg:
pens = [2 4]
pencils = [1.5 3]
etc. Then, I would somehow plot those items in the vector on a scatter plot and use the name of the vector as a label.
I'm not too sure how to carry out the rest of my code or even if what I've written will get me to the solution.
Please help and thanks in advance.
pricesfile = fopen('Prices.txt');
data = textscan(pricesfile, '%s %f %f'); % %f because prices such as 1.5 are not integers
fclose(pricesfile);
You were on the right track but after this (through a bit of hackery) you don't actually need a loop:
plot(repmat(data{2},1,2)', repmat(data{3},1,2)', '.')
legend(data{1})
What you DO NOT want to do is create variables named after strings. Rather store them in an array with an array of the names (which is basically what your textscan code gives you). Matlab is very good at handling matrices/arrays.
You could also split your price array up for example:
names = data{1};
prices = [data{2:3}];
now you can perform calculations on prices quite easily like
prices_cents = prices*100;
plot(prices_cents(:,[1,1])', prices_cents(:,[2,2])', '.')
legend(names)
Note that the [1,1] etc above is just using indexing as a short hand to achieve what repmat does...
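If you would rather label the points directly than use a legend, something along these lines might work. This is only a sketch: it parses the sample data from a character vector instead of Prices.txt so it can run without the file, and it uses scatter and text, which are not part of the code above:

```matlab
% Sample rows from the question, held in a char vector instead of a file.
raw = sprintf('Pens 2 4\nPencils 1.5 3\nRulers 3 3.5');
data = textscan(raw, '%s %f %f');   % name, store A price, store B price
names = data{1};                    % cell array of item names
prices = [data{2:3}];               % N x 2 numeric matrix

scatter(prices(:,1), prices(:,2));        % x: store A, y: store B
text(prices(:,1), prices(:,2), names);    % put each name next to its point
xlabel('Store A'); ylabel('Store B');
```

Any calculation on the prices (such as the future-price function) can be applied to the prices matrix before plotting.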