Matlab- Create cell confusion matrix - arrays

I have the following cell matrix, which will be used as a confusion matrix:
confusion=cell(25,25);
Then I have two other cell arrays: one in which each row contains a predicted label (array output), and another containing the real labels (array groundtruth).
whos output
Name Size Bytes Class Attributes
output 702250x1 80943902 cell
whos groundtruth
Name Size Bytes Class Attributes
groundtruth 702250x1 84270000 cell
Then, I created the following script to create the confusion matrix
function confusion=write_confusion_matrix(predict, groundtruth)
confusion=cell(25,25);
for i=1:size(predict,1)
confusion{groundtruth{i},predict{i}}=confusion{groundtruth{i}, predict{i}}+1;
end
end
But when I run it in matlab I have the following error:
Index exceeds matrix dimensions.
Error in write_confusion_matrix (line 4)
confusion{groundtruth{i},predict{i}}=confusion{groundtruth{i}, predict{i}}+1;
I was curious to print output's and groundtruth's values to see what was happening
output{1}
ans =
2
groundtruth{1}
ans =
1
Nothing seems to be wrong with these values, so what is wrong here? Is the confusion matrix's indexing right in the code?

The error occurs inside a for loop, so checking only the first iteration is not sufficient. "Index exceeds matrix dimensions" means that there is at least one i in the range 1:size(output,1) for which either groundtruth{i} or output{i} is greater than 25.
You can find out which one has at least one element bigger than the range:
% 0 means no, there is none above 25. 1 means yes, there exists at least one:
hasoutlier = any(cellfun(@(x) x > 25, groundtruth)) % similar for 'output'
Or you can count them:
outliercount = sum(cellfun(@(x) x > 25, groundtruth))
Maybe you also want to find these elements:
outlierindex = find(cellfun(@(x) x > 25, groundtruth))
By the way, why are you working with cell arrays in this case rather than numeric arrays?
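The range check described above can be sketched in Python (used here only for illustration; the question itself is MATLAB). The names `build_confusion` and `find_outliers` are hypothetical, and raising an error on an out-of-range label is an assumption about the desired behaviour. Labels are taken to be 1-based integers, as in the question.

```python
def build_confusion(groundtruth, predicted, n_classes=25):
    """Count (true, predicted) label pairs in an n_classes x n_classes table.

    Labels are assumed to be 1-based integers, as in the MATLAB question.
    """
    # Start from zeros so that "+= 1" is well defined for every cell.
    confusion = [[0] * n_classes for _ in range(n_classes)]
    for i, (t, p) in enumerate(zip(groundtruth, predicted)):
        # Report the offending position instead of failing on a bare index error.
        if not (1 <= t <= n_classes and 1 <= p <= n_classes):
            raise ValueError(
                f"label out of range at position {i}: true={t}, predicted={p}"
            )
        confusion[t - 1][p - 1] += 1
    return confusion


def find_outliers(labels, n_classes=25):
    """Return the 0-based positions whose label falls outside 1..n_classes."""
    return [i for i, v in enumerate(labels) if not 1 <= v <= n_classes]
```

Starting from zeros matters in the MATLAB version as well: cell(25,25) holds empty arrays, and [] + 1 stays empty, so a numeric zeros(25,25) is likely the more natural container for the counts.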


Create list of random numbers between x and y using formula in Google Sheets

I'm trying to create a list of 50 random numbers let's say between 100 and 500 with one formula in Gsheets. Is there any formula like 'apply this to x cells'?
What I tried so far (which doesn't work) is below. I hoped the RANDARRAY function would force RANDBETWEEN to return a 2D array (RANDARRAY creates a list of numbers between 0 and 1).
={
RANDARRAY(50,1), ARRAY_CONSTRAIN(RANDBETWEEN(100,500),50,1)
}
Error
Function ARRAY_ROW parameter 2 has mismatched row size. Expected: 50. Actual: 1.
So this error indicates that array_constrain didn't help either.
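Try it like this:
=ARRAYFORMULA(RANDBETWEEN(ROW(A100:A149), 500))
Note that ROW(A100:A149) returns 100 through 149, so the lower bound rises from 100 in the first row to 149 in the last; every value still lies between 100 and 500.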
In generic terms, if you need N random numbers, between X and Y, you would combine the following formulas:
RandBetween(X, Y)
Row(cell_ref)
Indirect(string_cell_ref)
ArrayFormula(array_formula)
Details
When combining a Row(cell_ref) with an ArrayFormula, you can specify a cell range or simply a number range:
ArrayFormula(Row(1:50))
The above example generates a one dimensional array (column) with the numbers 1 through 50. In order to programmatically change the number, we use the Indirect function to specify the upper bound of the range, N:
ArrayFormula(Row(Indirect("1:"&N)))
N can be a named range, hard coded, or a cell reference containing a number greater than 0. Because you want each row to contain a random number between X and Y, you need to eliminate the sequential number in each array position by multiplying the number generated by the above formula by zero:
ArrayFormula(Row(Indirect("1:"&N))*0)
which generates a one-dimensional array (column) of N zeros. Now you can combine this as follows to generate a one-dimensional array (column) of N random numbers between X and Y:
Solution
ArrayFormula(RandBetween(Row(Indirect("1:"&N))*0+X, Y))
You could use named ranges for N, X, and Y; hard-code them (e.g. 50, 100, 500); or use simple cell references as in the example below:
ArrayFormula(RandBetween(row(indirect("1:"&B1))*0+B2, B3))
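For comparison, the same idea (N independent draws, each between X and Y inclusive) can be sketched outside the spreadsheet in Python. The function name `random_column` and the optional `seed` parameter are hypothetical and exist only for this illustration.

```python
import random


def random_column(n, low, high, seed=None):
    """Return n independent integers drawn uniformly from low..high inclusive."""
    rng = random.Random(seed)  # a private generator; seed=None means unseeded
    return [rng.randint(low, high) for _ in range(n)]


# N = 50, X = 100, Y = 500, matching the values in the question.
values = random_column(50, 100, 500)
```

Each element gets its own draw, which is what the multiply-by-zero trick achieves in the sheet by giving RANDBETWEEN an array argument.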

Filling a row and columns of a ndarray with a loop

I'm starting with Python and I have a basic question about the "for" loop.
I have two arrays, each containing the values of one variable:
A = data_lac[:,0]
In the first array I have values of area, and in the second one, values of mean depth.
I would like to find a way to automate my calculation for different values of a parameter. The equation is the following:
g= (np.sqrt(A/pi))/n
Here I can calculate my "g" for each row. Now I want a loop over different values of "n". I did this:
i=0
while i <= len(A)-1:
    for n in range(2,6):
        g[i] = (np.sqrt(A[i]/pi))/n
        i += 1
        break
In this case, I just have one column with the calculation for n = 2 but not the following ones. I tried to add a second dimension to my array, but I get an error message saying that there are too many indices for the array.
In other words, I would like this array:
g[len(A),5]
g has 5 columns, each one calculated with a different "n".
Any tips would be very helpful,
Thanks
Update of the code:
data_lac=np.zeros((106,7))
data_lac[:,0:2]=np.loadtxt("/home...", delimiter=';', skiprows=1, usecols=(0,1))
data_lac[:,1]=data_lac[:,1]*0.001
#Initialisation
A = data_lac[:,0]
#example for A with 4 elements
A=[2.1, 32.0, 4.6, 25]
g = np.zeros((len(A),))
I believe you are mixing up the indexes of the two loops: you were incrementing i (the index of the outer while loop) inside the inner for loop (which iterates over n).
I guess you have A (a 1-D array) and you want to produce g (a 2-D array) of size (len(A), 4), one column for each n in range(2, 6).
I am not sure I fully understand your required output, but I believe you want something like:
import numpy as np

i = 0
while i <= len(A) - 1:
    for n in range(2, 6):
        # n - 2 maps n = 2..5 onto column indices 0..3
        g[i][n-2] = np.sqrt(A[i] / np.pi) / n
    i += 1  # increment i in the outer while loop, once per row
Important: indentation is significant in Python, so make sure the i += 1 is in the scope of the while loop and not indented into the for loop.
Note that g should be defined as:
g = np.zeros((len(A),4), dtype=float)
The way you defined it (without the 4) makes it a 1-D array rather than a 2-D one.
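The index bookkeeping can also be avoided altogether. Below is a minimal sketch using only the standard library; the function name `shape_factors` is hypothetical, and the default range of n (2 to 5) is taken from the loop in the question. It builds one row per area and one column per n.

```python
import math


def shape_factors(areas, ns=range(2, 6)):
    """Return one row per area and one column per n, holding sqrt(A / pi) / n."""
    return [[math.sqrt(a / math.pi) / n for n in ns] for a in areas]


# Example values for A taken from the question.
A = [2.1, 32.0, 4.6, 25]
g = shape_factors(A)  # 4 rows (one per area) x 4 columns (n = 2, 3, 4, 5)
```

With NumPy arrays the same table can usually be produced by broadcasting a column of sqrt(A / pi) against a row of n values, which removes the explicit loops as well.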

Accumulating values of array elements and reiterate for entire array in MATLAB

I have a matrix A of size 3780x30974. This matrix consists of 0 and 1. I would like to calculate the sum of fixed windows of length 21 (180 blocks). This should be reiterated, so that the output returns a vector of size 180x30974.
If the first 21 values in a column have a value of 1, the output should return 21. However, if the following 21 values have a value of 1 again, it should return 21 as well. In my code, it accumulates the values, so I obtain 42.
I have t=3780, p=180, w=21;
B = movsum(A,w); % the sum of a moving window with width w
This question is somehow related to a question previously asked, yet with a different problem setting. I thought about a loop to say "perform from t=1:p", yet it didn't work.
result = permute(sum(reshape(A, w, [], size(A,2)), 1), [2 3 1]);
This works as follows: reshape A into a 3D array of size 21×180×30974:
reshape(A, w, [], size(A,2))
then sum along the first dimension
sum(..., 1)
and finally remove the first (singleton) dimension by permuting it to the end:
permute(..., [2 3 1])
Note that Matlab arrays have an infinite number of trailing singleton dimensions, so moving a singleton dimension to the end is the same as removing it.
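To make the intent of the reshape-and-sum approach concrete, here is a plain Python sketch of the same computation: each column is summed over consecutive, non-overlapping blocks of w rows. The function name `block_sums` is hypothetical, and the matrix is assumed to be a list of equal-length rows.

```python
def block_sums(matrix, w):
    """Sum each column of `matrix` over consecutive, non-overlapping blocks of w rows."""
    n_rows = len(matrix)
    if n_rows % w != 0:
        raise ValueError("number of rows must be a multiple of the window length")
    n_cols = len(matrix[0]) if matrix else 0
    result = []
    for start in range(0, n_rows, w):
        block = matrix[start:start + w]  # the w rows of this window
        result.append([sum(row[c] for row in block) for c in range(n_cols)])
    return result
```

With the sizes from the question (3780 rows, w = 21) this yields 180 rows per column, and a block of 21 ones contributes 21 regardless of what the neighbouring blocks contain.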

Creating sub-arrays from large single array based on marker values

I need to create a 1-D array of 2-D arrays, so that a program can read each 2-D array separately.
I have a large array with 5 columns, with the second column storing 'marker' data. Depending on the marker value, I need to take the corresponding data from the remaining 4 columns and put them into a new array on its own.
I was thinking of having two for loops running, one to take the target data and write it to a cell in the 1-D array, and one to read the initial array line-by-line, looking for the markers.
I feel like this is a fairly simple issue, I'm just having trouble figuring out how to essentially cut and paste certain parts of an array and write them to a new one.
Thanks in advance.
No for loops needed, use your marker with logical indexing. For example, if your large array is A :
B=A(A(:,2)==marker,[1 3:5])
will select all rows where the marker is present, without the 2nd column. Then you can use reshape or the (:) operator to make it 1-D, for example
B=B(:)
or, if you want a one-liner:
B=reshape(A(A(:,2)==marker,[1 3:5]),1,[]);
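The selection step above amounts to "keep the rows whose marker column matches, then drop that column". A rough Python equivalent is sketched below; the function names are hypothetical, rows are assumed to be lists, and column indices are 0-based, so MATLAB's column 2 becomes index 1.

```python
def rows_with_marker(rows, marker, marker_col=1):
    """Keep rows whose marker column equals `marker`, dropping that column."""
    return [
        [v for j, v in enumerate(row) if j != marker_col]
        for row in rows
        if row[marker_col] == marker
    ]


def flatten_column_major(rows):
    """Flatten equal-length rows column by column, the order MATLAB's B(:) uses."""
    if not rows:
        return []
    return [row[j] for j in range(len(rows[0])) for row in rows]
```

Note that this gathers every matching row into one block; it does not separate distinct runs of the marker from each other.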
I am just answering my own question to show any potential future users the solution I came up with eventually.
%=======SPECIFY CSV INPUT FILE HERE========
MARKER_DATA=csvread('ESphnB2.csv'); % load data from csv file
%===================================
A=MARKER_DATA(:,2); % create 1D array for markers
A=A'; % make column into row
for i=1:length(A) % for every marker
if A(i) ~= 231 % if it is not 231 then
A(i)=0; % set value to zero
end
end
edgeArray = diff([0; (A(:) ~= 0); 0]); % set non-zero values to 1
ind = [find(edgeArray > 0) find(edgeArray < 0)-1]; % find indices of 1 and save to array with beginning and end
t=1; % initialize counter for trials
for j=1:size(ind,1) % for every marked index
B{t}=MARKER_DATA(ind(j,1):ind(j,2),[3:6]); % create an array with the rows from the data according to indicies
t=t+1; % create a new trial
end
gazeVectors=B'; % reorient and rename array of trials for saccade analysis
%======SPECIFY MAT OUTPUT FILE HERE===
save('Trial_Data_2.mat','gazeVectors'); % save array to mat file
%=====================================
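The run-splitting idea in the script above (find where the marker starts and stops, then cut out each stretch as its own trial) can be sketched in Python as follows. This version swaps the diff/find edge detection for `itertools.groupby`, which groups consecutive rows by whether they carry the marker. The function name `split_trials` is hypothetical; the marker value 231 and the data columns (MATLAB columns 3 to 6, i.e. Python indices 2 to 5) are taken from the script.

```python
from itertools import groupby


def split_trials(rows, marker=231, marker_col=1, data_cols=slice(2, 6)):
    """Group consecutive rows carrying `marker` into separate trials."""
    trials = []
    for is_marked, run in groupby(rows, key=lambda row: row[marker_col] == marker):
        if is_marked:
            # One trial per uninterrupted run of marked rows, data columns only.
            trials.append([row[data_cols] for row in run])
    return trials
```

Each element of the returned list is one trial, comparable to one cell of the gazeVectors cell array.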

Matlab Assigning Elements to Array in loop

I have this loop which generates a vector "Diff". How do I place the values of Diff in an array that records all the Diffs generated? The problem is that Diff should have a fixed length (36), which is the width of the table "CleanPrice". But because col_set varies in length (according to the number of NaNs in the data it is reading), Diff also varies in length.
What I need is to assign the generated values according to their column number, i.e. element i of Diff should go into column col(i), and all other entries of that row should be assigned "0" or "NaN". Basically, I need DiffArray to be an (nTrials x 36) array where each row is the padded Diff generated in one iteration. At the moment, though, each time the length of col changes, I get the following error:
??? Subscripted assignment dimension mismatch.
Error in ==> NSSmodel at 41
DiffMatrix(end+1,:)=Diff
This is my code:
DiffArray=[];
StartRow=2935;
EndRow=2940;
nTrials=EndRow-StartRow;
for row=StartRow:EndRow;
col_set=find(~isnan(gcm3.data.CleanPrice(row,1:end)));
col=col_set(:,2:end);
CleanPrices=transpose(gcm3.data.CleanPrice(row,col));
Maturity=gcm3.data.CouponandMaturity(col-1,2);
SettleDate=gcm3.data.CouponandMaturity(row,3);
Settle = repmat(SettleDate,[length(Maturity) 1]);
CleanPrices =transpose(gcm3.data.CleanPrice(row,col));
CouponRate = gcm3.data.CouponandMaturity(col-1,1);
Instruments = [Settle Maturity CleanPrices CouponRate];
PlottingPoints = gcm3.data.CouponandMaturity(1,2):gcm3.data.CouponandMaturity(36,2);
Yield = bndyield(CleanPrices,CouponRate,Settle,Maturity);
SvenssonModel = IRFunctionCurve.fitSvensson('Zero',SettleDate,Instruments)
ParYield=SvenssonModel.getParYields(Maturity);
[PriceActual, AccruedIntActual] = bndprice(Yield, CouponRate, Settle, Maturity);
[PriceNSS, AccruedIntNSS] = bndprice(ParYield, CouponRate, Settle, Maturity);
Diff=PriceActual-PriceNSS
DiffArray(end+1,:)=Diff
end
I looked at num2cell in this post but wasn't sure how to apply it correctly and started getting errors relating to that instead.
Is it correct to say you want to add an 'incomplete' row to DiffArray? If you know exactly where each element should go you could maybe do something like this:
indices = [1:7; 2:8; 3:9; [1 2 3 6 7 8 10]];
Diff = rand(4, 7);
DiffArray = NaN(4, 10);
for row = 1:4
DiffArray(row, indices(row, :)) = Diff(row,:);
end
Of course, in your case you would be calculating Diff and the index (a row vector) inside the loop rather than using preassigned arrays. The above just illustrates how to use an indexing vector to position a short row in a matrix.
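The same positioning idea, sketched in Python for a single row: start from a row filled with NaN and write each value at its target column. The function name `place_row` is hypothetical, and column indices are 0-based here, unlike in MATLAB.

```python
def place_row(values, columns, width, fill=float("nan")):
    """Return a row of length `width` holding `values` at the 0-based `columns`."""
    if len(values) != len(columns):
        raise ValueError("values and columns must have the same length")
    row = [fill] * width  # every position starts out as the fill value
    for v, c in zip(values, columns):
        row[c] = v
    return row
```

Appending one such row per loop iteration gives a rectangular table even when the number of valid columns changes from row to row.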
