how to get more than one number inside of matrices - arrays

Ok. I have a simple question although I'm still fairly new to Matlab (taught myself). So I was wanting a 1x6 matrix to look like this below:
0
0
1
0
321, 12 <--- needs to be in one box in 1x6 matrices
4,30,17,19 <--- needs to be in one box in 1x6 matrices
Is there a possible way to do this or am I going to just have to write them all in separate boxes thus making it a 1x10 matrix?
My code:
event_marker = 0;
event_count = 0;
block_number = 1;
date = [321,12] % (its corresponding variables = 321 and 12)
time = [4,30,17,19] % (its corresponding variable = 4 and 30 and 17 and 19)

So if I understand you correctly, you want an array that contains 6 elements, of which one element equals 1, another element is the array [321,12] and the last element is the array [4,30,17,19].
I'll suggest two things to accomplish this: matrices, and cell-arrays.
Cell arrays
In Matlab, a cell array is a container for arbitrary types of data. You define it using curly braces (as opposed to square brackets for matrices). So, for example,
C = {'test', rand(4), {@cos,@sin}}
is something that contains a string (C{1}), a normal matrix (C{2}), and another cell which contains function handles (C{3}).
For your case, you can do this:
C = {0,0,1,0, [321,12], [4,30,17,19]};
or of course,
C = {0, event_marker, event_count, block_number, date, time};
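To read the values back out, here is a minimal sketch. The names dateVals, timeVals, d, firstTime and sub are only illustrative; the vectors are renamed here because a variable called date would shadow MATLAB's built-in date function.

```matlab
% Build the 1x6 cell array from the question's values.
event_marker = 0;
event_count  = 0;
block_number = 1;
dateVals = [321, 12];          % illustrative name, avoids shadowing the built-in "date"
timeVals = [4, 30, 17, 19];    % illustrative name, renamed for symmetry

C = {0, event_marker, event_count, block_number, dateVals, timeVals};

d         = C{5};      % curly braces return the contents: the 1x2 vector [321 12]
firstTime = C{6}(1);   % index straight into the contents of cell 6
sub       = C(5:6);    % parentheses return a 1x2 cell, not the numbers themselves
```

Curly braces give you the stored data; parentheses give you a smaller cell array.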
Matrices
Depending on where you use it, a normal matrix might suffice as well:
M = [0 0 0 0
event_marker 0 0 0
event_count 0 0 0
block_number 0 0 0
321 12 0 0
4 30 17 19];
Note that you'll need some padding (meaning, you'll have to add those zeros in the top-right somehow). There's tonnes of ways to do that, but I'll "leave that as an exercise" :)
Again, it all depends on the context which one will be easier.
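One possible way to do the padding, as a rough sketch rather than the only approach: collect the rows in a cell array first (the names rows and width are made up for this example), then copy each row into a preallocated matrix of zeros.

```matlab
% Each cell holds one row of the final matrix; lengths may differ.
rows = {0, 0, 0, 1, [321 12], [4 30 17 19]};

width = max(cellfun(@numel, rows));   % length of the longest row
M = zeros(numel(rows), width);        % preallocate, so padding is already 0

for k = 1:numel(rows)
    M(k, 1:numel(rows{k})) = rows{k}; % fill the leading entries of row k
end
```

The zeros used for padding cannot be told apart from real zero values, so this only suits cases where that ambiguity does not matter.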

Consider using cell arrays rather than matrices for your task.
data = cell(6,1); % allocate cell
data{1} = event_marker; % note the curly braces here!
...
data{6} = date; % all elements of date fit into a single cell.

If your date and time variables actually represent a date (days, months, years) and a time (hours, minutes, seconds), they can be packed into one or two numbers.
Look into the DATENUM function. If you have a vector, for example [2013, 4, 10], representing April 10th, 2013, you can convert it into a serial date:
daten = datenum([2013, 4, 10]);
It's also fine if you have a day-of-year count instead of a month and day: datenum([2013, 0, 300]) will work too.
The time can be packed together with date or separately:
timen = datenum([0, 0, 0, 4, 30, 17.19]);
or
datetimen = datenum([2013, 4, 10, 4, 30, 17.19]);
Once you have this serial date you can just keep it in one vector with other numbers.
You can convert this number back into either a date vector or a date string with the DATEVEC and DATESTR functions.
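A short round-trip sketch of the idea, using whole seconds for simplicity (the variable names dv, dn, back and str are just for illustration):

```matlab
dv = [2013, 4, 10, 4, 30, 17];   % year, month, day, hour, minute, second

dn   = datenum(dv);              % one serial date number (days, with a fractional part for the time)
back = datevec(dn);              % back to a 1x6 date vector
str  = datestr(dn, 'yyyy-mm-dd HH:MM:SS');  % or to a formatted string

% dn is a plain double, so it can sit in an ordinary numeric vector:
record = [0, 0, 1, 0, dn];
```

Because the serial date is a floating-point number, the seconds recovered by datevec may differ from the input by a tiny rounding error.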

Related

Reshape a 3D array and remove missing values

I have an NxMxT array where each element of the array is a grid of Earth. If the grid is over the ocean, then the value is 999. If the grid is over land, it contains an observed value. N is longitude, M is latitude, and T is months.
In particular, I have an array called tmp60 for the ten years 1960 through 1969, so 120 months for each grid.
To test what the global mean in January 1960 was, I write:
tmpJan60=tmp60(:,:,1);
tmpJan60(tmpJan60(:,:)>200)=NaN;
nanmean(nanmean(tmpJan60))
which gives me 5.855.
I am confused about the reshape function. I thought the following code should yield the same average, namely 5.855, but it does not:
load tmp60
N1=size(tmp60,1)
N2=size(tmp60,2)
N3=size(tmp60,3)
reshtmp60 = reshape(tmp60, N1*N2,N3);
reshtmp60( reshtmp60(:,1)>200,: )=[];
mean(reshtmp60(:,1))
this gives me -1.6265, which is not correct.
I have checked the result in Excel (!) and 5.855 is correct, so I assume I make a mistake in the reshape function.
Ideally, I want a matrix that takes each grid, going first down the N-dimension, and make the 720 rows with 120 columns (each column is a month). These first 720 rows will represent one longitude band around Earth for the same latitude. Next, I want to increase the latitude by 1, thus another 720 rows with 120 columns. Ultimately I want to do this for all 360 latitudes.
If longitude and latitude were inputs, say column 1 and 2, then the matrix should look like this:
temp = [-179.75 -89.75 -1 2 ...
-179.25 -89.75 2 4 ...
...
179.75 -89.75 5 9 ...
-179.75 -89.25 2 5 ...
-179.25 -89.25 3 4 ...
...
-179.75 89.75 2 3 ...
...
179.75 89.75 6 9 ...]
So temp(:,3) should be all January 1960 observations.
One way to do this is:
grid1 = tmp60(1,1,:);
g1 = reshape(grid1, [1,120]);
grid2 = tmp60(2,1,:);
g2 = reshape(grid2,[1,120]);
g = [g1;g2];
But obviously very cumbersome.
I am not able to automate this procedure for the N*M elements, so comments are appreciated!
A link to the file tmp60.mat
The main problem in your code is treating the nans. Observe the following example:
a = randi(10,6);
a(a>7)=nan
m = [mean(a(:),'omitnan') mean(mean(a,'omitnan'),'omitnan')]
m =
3.8421 3.6806
Both elements of m are meant to be the mean of all elements of a, yet they differ. The reason is that taking the mean of all values together, with mean(a(:),'omitnan'), sums all non-NaN values and divides by the number of values summed:
sum(a(:),'omitnan')/sum(~isnan(a(:)))==mean(a(:),'omitnan') % this is true
but taking the mean of the first dimension, we get 6 mean values:
sum(a,'omitnan')./sum(~isnan(a))==mean(a,'omitnan') % this is also true
and when we take the mean of those six column means, every column gets equal weight no matter how many non-NaN values it contains, so the result generally differs from the overall mean:
mean(sum(a,'omitnan')./sum(~isnan(a)))==mean(a(:),'omitnan') % this is false
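A small deterministic case makes the difference easy to see. This is only an illustrative sketch with made-up numbers:

```matlab
a = [1 NaN;
     3 10];

overall     = mean(a(:), 'omitnan');   % (1 + 3 + 10) / 3
colMeans    = mean(a, 'omitnan');      % column 1: (1+3)/2, column 2: 10/1
meanOfMeans = mean(colMeans);          % (2 + 10) / 2
```

The second column contributes a single value but carries half the weight in meanOfMeans, which is why the two results disagree.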
Here is what I think you want in your code:
% this is exactly as your first test:
tmpJan60=tmn60(:,:,1);
tmpJan60(tmpJan60>200) = nan;
m1 = mean(mean(tmpJan60,'omitnan'),'omitnan')
% this creates the matrix as you want it:
result = reshape(permute(tmn60,[3 1 2]),120,[]).';
result(result>200) = nan;
r = reshape(result(:,1),720,360);
m2 = mean(mean(r,'omitnan'),'omitnan')
isequal(m1,m2)
To create the matrix you first permute the dimensions so the one you want to keep as is (time) will be the first. Then reshape the array to Tx(lon*lat), so you get 120 rows for all time steps and 259200 columns for all combinations of the coordinates. All that's left is to transpose it.
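To convince yourself of the row ordering, you can try the same permute and reshape on a tiny array whose values encode their own indices. The sizes and the encoding below are assumptions made purely for illustration:

```matlab
N = 2; M = 3; T = 4;                 % tiny stand-ins for lon, lat, time
A = zeros(N, M, T);
for t = 1:T
    for j = 1:M
        for i = 1:N
            A(i,j,t) = 100*i + 10*j + t;   % value encodes (i, j, t)
        end
    end
end

R = reshape(permute(A, [3 1 2]), T, []).';  % (N*M)-by-T, first index varies fastest

% Row k of R holds the time series of grid cell (i, j), where k = i + (j-1)*N.
[i, j] = ind2sub([N M], 3);          % row 3 corresponds to i = 1, j = 2
series = squeeze(A(i, j, :)).';      % same time series taken directly from A
```

Here R(3,:) and series hold the same four values, confirming that the first dimension cycles fastest down the rows.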
m1 is your first calculation, and m2 is what you try to do in the second one. They are equal here, but their value is not 5.855, even if I use your code.
However, I think the right solution will be to take the mean of all values together:
mean(result(:,1),'omitnan')

Max of an Array (SAS)

I have an array of auc values, cv_auc0-cv_auc39, numbered 0-39. The maximum auc value is .7778, and it appears in several places in the array (33, 35, 38, 39). When I create the variable
auc_max = max(of cv_auc0-cv_auc39);
It seems to identify place 39 as the maximum, even though this maximum appears elsewhere in the array.
These numbers 0-39 reflect the number of covariates in a model, and I want to keep this number as low as possible while maintaining max auc, thus I would like for the auc_max variable to identify place 33 instead of 39. How to do this?
I extract this covariate number, p, in the following code:
array a (*) cv_auc0-cv_auc&maxp;
do k = &maxp to 0 by -1;
if (a(k+1) = auc_max) then p = k;
end;
cross_val_auc = a(p+1);
keep p cross_val_auc;
And the p it returns is 39 instead of 33.
Why not just use the WHICHN() function? You might want to subtract one since your variable name suffixes start from zero instead of one.
auc_max = max(of cv_auc0-cv_auc&maxp);
p = whichn(auc_max,of cv_auc0-cv_auc&maxp)-1;
I don't see anything in here that could be incorrect. My best guess is that the max value differs slightly between the places. If the value in place 39 is, say, 1e-6 greater than the value in place 33, then you will get place 39.
Here is how I would do it. I would iterate up from the bottom and use the leave; statement to stop the loop.
data test;
array a[10] (1 2 3 4 4 3 2 4 1 4);
m = max(of a1-a10);
do p=1 to 10 ;
if a[p] = m then leave;
end;
put m= p=;
run;
returns:
m=4 p=4

matlab: how to speed up the count of consecutive values in a cell array

I have the 137x19 cell array Location(1,4).loc and I want to find the number of times that horizontal consecutive values are present in Location(1,4).loc. I have used this code:
x=Location(1,4).loc;
y={x(:,1),x(:,2)};
for ii=1:137
cnt(ii,1)=sum(strcmp(x(:,1),y{1,1}{ii,1})&strcmp(x(:,2),y{1,2}{ii,1}));
end
y={x(:,1),x(:,2),x(:,3)};
for ii=1:137
cnt(ii,2)=sum(strcmp(x(:,1),y{1,1}{ii,1})&strcmp(x(:,2),y{1,2}{ii,1})&strcmp(x(:,3),y{1,3}{ii,1}));
end
y={x(:,1),x(:,2),x(:,3),x(:,4)};
for ii=1:137
cnt(ii,3)=sum(strcmp(x(:,1),y{1,1}{ii,1})&strcmp(x(:,2),y{1,2}{ii,1})&strcmp(x(:,3),y{1,3}{ii,1})&strcmp(x(:,4),y{1,4}{ii,1}));
end
y={x(:,1),x(:,2),x(:,3),x(:,4),x(:,5)};
for ii=1:137
cnt(ii,4)=sum(strcmp(x(:,1),y{1,1}{ii,1})&strcmp(x(:,2),y{1,2}{ii,1})&strcmp(x(:,3),y{1,3}{ii,1})&strcmp(x(:,4),y{1,4}{ii,1})&strcmp(x(:,5),y{1,5}{ii,1}));
end
... continue for all the columns. This code run and gives me the correct result but it's not automated and it's slow. Can you give me ideas to automate and speed up the code?
I think I will write an answer to this since I've not done so for a while.
First convert your cell array to a matrix; this will make the following steps a lot easier. Then diff is the way to go:
A = randi(5,[137,19]);
DiffA = diff(A')'; %// Diff along each row: creates a matrix that is 137 by 18, where each value has its previous value subtracted from it.
So a 0 in DiffA would represent 2 consecutive numbers in A are equal, 2 consecutive 0s would mean 3 consecutive numbers in A are equal.
idx = DiffA==0;
cnt(:,1) = sum(idx,2);
To do 3 consecutive number counts, you could do something like:
idx2 = abs(DiffA(:,1:end-1))+abs(DiffA(:,2:end)) == 0;
cnt(:,2) = sum(idx2,2);
Or use another diff. The abs is there so that a negative number plus a positive number cannot happen to give 0; with absolute values, only 0 + 0 gives 0. You can now continue this pattern by doing:
idx3 = abs(DiffA(:,1:end-2))+abs(DiffA(:,2:end-1))+abs(DiffA(:,3:end)) == 0
cnt(:,3) = sum(idx3,2);
In loop format:
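absDiffA = abs(DiffA);
W = size(absDiffA,2); %// assumed here: test every run length the row width allows
for ii = 1:W
idx = (absDiffA == 0); %// marks runs of ii+1 equal values
cnt(:,ii) = sum(idx,2);
absDiffA = absDiffA(:,1:end-1) + absDiffA(:,2:end); %// widen the window by one
end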
NOTE: with this method a run of three equal values in A (for example [0,0,0]) is counted twice when evaluating 2 consecutives, and once when evaluating 3 consecutives.
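The answer does not show how to turn the cell array into a matrix. One possible sketch, assuming the cells hold character vectors, is to replace each distinct string with an integer code via unique. The sample data and variable names below are invented for illustration.

```matlab
% Small stand-in for Location(1,4).loc (assumed to be a cell array of char vectors).
x = {'a','a','b','b';
     'c','c','c','d';
     'e','f','f','f'};

[~, ~, ic] = unique(x);        % ic: integer code for every cell, in column-major order
A = reshape(ic, size(x));      % numeric matrix with the same layout as x

D = diff(A, 1, 2);             % differences along each row (same as diff(A')')
pairs   = sum(D == 0, 2);      % per row: positions where 2 neighbours are equal
triples = sum(D(:,1:end-1) == 0 & D(:,2:end) == 0, 2);  % 3 equal in a row
```

Equal strings get equal codes, so a zero difference between neighbours means the two strings match.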

Filling an array where one portion is linearly increasing and the rest is truncated

I'm trying to fill an array of size 1 x 200 with values. I want the array to be filled with values ranging from 0 to 216 in steps of 6 and then keep the value constant (216) for the remaining part of the array.
How can I do that?
One way is to initially create an array from 0 to 216 in steps of 6, then concatenate the array of 216s until you reach 200 values.
Something like:
out = 0:6:216;
N = 200;
out(end+1:end+N-numel(out)) = 216;
Another way is to create 200 values of 216, then replace the first 216/6 + 1 = 37 entries (the + 1 accounts for including 0) with the desired ramp:
N = 200; stop = (216/6) + 1;
out = 216*ones(1,N);
out(1:stop) = 0:6:216;
Finally, another way is to create an array from 0 up to 199, truncate all values that are greater than 36 to be 36, then multiply the result by 6:
N = 200;
out = 0:N-1;
out(out > 36) = 36;
out = 6*out;
... and for completeness, you can do this with min [1]:
out = min(0:199,36)*6;
The two-argument min call returns the element-wise minimum of two arrays of compatible sizes. If one of the inputs is a scalar, it is compared against every element of the other array. This code generates an array from 0 to 199, keeps any values below 36, and caps everything else at 36. We then multiply by 6 to obtain the result.
[1]: Credit for this answer goes to user Stewie Griffin before he deleted his answer. I decided to put this in for completeness.
arr = min(0:6:(6*199),216);
should work
or:
arr = min((0:199)*6,216);
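As a quick sanity check, the concatenation idea and the min one-liner can be compared directly. This is only a sketch; the names ramp, out1 and out2 are illustrative.

```matlab
N = 200;

% Ramp from 0 to 216 in steps of 6, followed by a constant tail of 216.
ramp = 0:6:216;
out1 = [ramp, 216*ones(1, N - numel(ramp))];

% Same result by capping a longer ramp at 216.
out2 = min((0:N-1)*6, 216);

same = isequal(out1, out2);    % true if both constructions agree
```

Both vectors have 200 elements, with the first 37 forming the ramp and the rest held at 216.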

Matlab: Help in implementing quantized time series

I am having trouble implementing this code due to the variable s_k being logical 0/1. In what way can I implement this statement?
s_k is a random sequence of 0/1 generated using a rand() and quantizing the output of rand() by its mean given below. After this, I don't know how to implement. Please help.
N =1000;
input = rand(1,N); %uniform on [0,1], so the mean is 0.5
s = (input>=0.5); %converting into logical 0/1;
UPDATE
N = 3;
tmax = 5;
y(1) = 0.1;
for i =1 : tmax+N-1 %// Change here
y(i+1) = 4*y(i)*(1-y(i)); %nonlinear model for generating the input to Autoregressive model
end
s = (y>=0.5);
ind = bsxfun(@plus, (0:tmax), (0:N-1).');
x = sum(s(ind+1).*(2.^(-ind+N+1))); % The output of this conversion should be real numbers
% Autoregressive model of order 1
z(1) =0;
for j =2 : N
z(j) = 0.195 *z(j-1) + x(j);
end
You've generated the random logical sequence, which is great. You also need to know N, which is the total number of points to collect at one time, as well as a list of time values t. Because this is a discrete summation, I'm going to assume the values of t are discrete. What you need to do first is generate a sliding window matrix. Each column of this matrix represents a set of time values for each value of t for the output. This can easily be achieved with bsxfun. Assuming a maximum time of tmax, a starting time of 0 and a neighbourhood size N (like in your equation), we can do:
ind = bsxfun(@plus, (0:tmax), (0:N-1).');
For example, assuming tmax = 5 and N = 3, we get:
ind =
0 1 2 3 4 5
1 2 3 4 5 6
2 3 4 5 6 7
Each column represents a time that we want to calculate the output at and every row in a column shows a list of time values we want to calculate for the desired output.
Finally, to calculate the output x, you simply take your s_k vector, make it a column vector, use ind to index into it, do a point-by-point multiplication with 2^(-k+N+1), substituting the values in ind for k, and sum down each column. So:
s = rand(max(ind(:))+1, 1) >= 0.5;
x = sum(s(ind+1).*(2.^(-ind+N+1)));
The first statement generates a random vector that is as long as the maximum time value that we have. Once we have this, we use ind to index into this random vector so that we can generate a sliding window of logical values. We need to offset this by 1 as MATLAB starts indexing at 1.
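To check the vectorised expression, it can be compared against a plain double loop on a fixed sequence. This is a sketch only: the particular values of s and the name xRef are assumptions chosen for illustration, and the weights follow the 2^(-k+N+1) expression used above.

```matlab
N = 3; tmax = 5;
ind = bsxfun(@plus, (0:tmax), (0:N-1).');     % N-by-(tmax+1) sliding-window indices

s = logical([1 0 1 1 0 0 1 0]).';             % fixed 0/1 sequence, length max(ind(:))+1

% Vectorised: one output per column of ind.
x = sum(s(ind+1) .* (2.^(-ind + N + 1)));

% Reference: explicit loops over time t and window position k.
xRef = zeros(1, tmax + 1);
for t = 0:tmax
    for k = t:(t + N - 1)
        xRef(t+1) = xRef(t+1) + s(k+1) * 2^(-k + N + 1);
    end
end
```

If the two agree, x and xRef are equal element by element, and x is a real-valued row vector with tmax + 1 entries.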

Resources