Calculating class frequencies and class imbalance - artificial-intelligence

I have these class labels:
labels = ['Age',
          'Gender',
          'healthy',
          'glaucoma',
          'suspicious']
and this figure:
[figure not shown]
As shown in the figure, I want to compute the class frequencies and then use them in the loss function.
To do that, I need the 0/1 label matrix and, from it, the positive and negative frequencies of the classes in the figure.
I have 155 negative (0) values and 333 positive (1) values in the healthy column,
401 negative (0) values and 87 positive (1) values in glaucoma,
and 420 negative (0) values and 68 positive (1) values in suspicious.
For example, I want the result to look like this:
labels:
[[1 0 0]
[0 1 1]
[1 0 1]
[1 1 1]
[1 0 1]]
pos freqs: [0.8 0.4 0.8]
neg freqs: [0.2 0.6 0.2]
How can I do this?
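One possible approach, sketched in NumPy: because the labels are 0/1, the positive frequency of each class is just the column mean, and the negative frequency is one minus that. The function names compute_class_freqs and weighted_bce are made up for this sketch, and the weighting scheme (positive weight = negative frequency, negative weight = positive frequency) is an assumption about what kind of loss is wanted; it is one common way to balance classes, not the only one.

```python
import numpy as np

def compute_class_freqs(labels):
    """Return (positive, negative) frequencies for each class column.

    labels: array of shape (num_examples, num_classes) holding 0/1 values.
    """
    labels = np.asarray(labels)
    positive = labels.mean(axis=0)   # fraction of 1s in each column
    negative = 1.0 - positive        # fraction of 0s in each column
    return positive, negative

def weighted_bce(y_true, y_pred, pos_weights, neg_weights, eps=1e-7):
    """Weighted binary cross-entropy: mean over examples, summed over classes.

    Assumption: this weighting is an illustrative choice, not taken from the question.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0 - eps)
    per_entry = -(pos_weights * y_true * np.log(y_pred)
                  + neg_weights * (1.0 - y_true) * np.log(1.0 - y_pred))
    return per_entry.mean(axis=0).sum()

# The small example matrix from the question
labels = np.array([[1, 0, 0],
                   [0, 1, 1],
                   [1, 0, 1],
                   [1, 1, 1],
                   [1, 0, 1]])

pos_freqs, neg_freqs = compute_class_freqs(labels)
print("pos freqs:", pos_freqs)
print("neg freqs:", neg_freqs)

# Assumed weighting: rarer outcomes get the larger weight
pos_weights, neg_weights = neg_freqs, pos_freqs
loss = weighted_bce(labels, np.full(labels.shape, 0.5), pos_weights, neg_weights)
print("loss for constant 0.5 predictions:", loss)
```

With the real data, the same function would be applied to the healthy, glaucoma and suspicious columns only. From the counts given above, the positive frequencies would be 333/488, 87/488 and 68/488 (roughly 0.68, 0.18 and 0.14).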

Related

Computing autocovariance function vector in NumPy without using np.correlate

I'm trying to create a program that uses the Hannan-Rissanen algorithm to compute the sample parameters for an ARMA(p, q) auto-regressive moving average stochastic process.
The main step I'm having difficulty with is calculating the autocovariance function of the time series.
The program should take in an n×1-dimensional column vector Y and compute a k×1-dimensional column vector γ^hat given by:
[image: ACVF equation]
where Ybar is the average of the elements of Y.
How can I compute the above sum efficiently? (obviously for loops would work, but I'm trying to get better at vectorized numpy operations) Since I'm using this as a learning experience, I would prefer not to use any numpy functions other than very basic ones like np.sum or np.mean.
The following previous similar question has been asked, but doesn't quite answer my question:
Computing autocorrelation of vectors with numpy (uses np.correlate)
(a few others suffer the same problem of using more advanced numpy functions, or aren't spitting out vectors as I wish to do here.)
Here is one way to replace np.correlate (which I assume is the main difficulty; I'm also assuming you have no desire to hand code an fft):
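import numpy as np

def autocorr_direct(a, debug=False):
    # a is expected to be an (n, 1) column vector
    n, _ = a.shape
    out = np.zeros((n+1, 2*n-1), a.dtype)
    # write the reversed outer product into an (n, 2n) view of the buffer;
    # reading the buffer back as (n+1, 2n-1) shifts each row by one column
    out.reshape(-1)[:2*n*n].reshape(n, 2*n)[::-1, :n] = a*a.T
    if debug:
        print(out.reshape(-1)[:2*n*n].reshape(n, 2*n))
        print(out)
    # column sums are the sums along the diagonals of the outer product
    return out.sum(0)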
For example:
>>> a = np.array([[1, 1, 2, -1]]).T
>>> autocorr_direct(a, True)
[[-1 -1 -2  1  0  0  0  0]
 [ 2  2  4 -2  0  0  0  0]
 [ 1  1  2 -1  0  0  0  0]
 [ 1  1  2 -1  0  0  0  0]]
[[-1 -1 -2  1  0  0  0]
 [ 0  2  2  4 -2  0  0]
 [ 0  0  1  1  2 -1  0]
 [ 0  0  0  1  1  2 -1]
 [ 0  0  0  0  0  0  0]]
array([-1,  1,  1,  7,  1,  1, -1])
>>> np.correlate(a[:, 0], a[:, 0], 'full')
array([-1,  1,  1,  7,  1,  1, -1])
Note the reshape trick that shears the square array a[::-1]*a.T.
Note 2: to get a column vector from a 1D vector X, use X[:, None].
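To get from the raw autocorrelation to the autocovariance vector asked about, one could center the series, reuse the shear trick, keep the non-negative lags and normalize. The sketch below assumes the usual sample definition, gamma_hat(h) = (1/n) * sum over t of (Y[t+h] - Ybar) * (Y[t] - Ybar) for h = 0..k-1, because the equation image from the question is not available. The name sample_acvf is made up for illustration.

```python
import numpy as np

def autocorr_direct(a):
    # Same reshape-based shear as in the answer above; a is an (n, 1) column.
    n, _ = a.shape
    out = np.zeros((n + 1, 2 * n - 1), a.dtype)
    out.reshape(-1)[:2 * n * n].reshape(n, 2 * n)[::-1, :n] = a * a.T
    return out.sum(0)

def sample_acvf(y, k):
    """Return a (k, 1) column of sample autocovariances for lags 0..k-1.

    Assumed definition (not confirmed by the question):
    gamma_hat(h) = (1/n) * sum_t (y[t+h] - ybar) * (y[t] - ybar), with k <= n.
    """
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    n = y.shape[0]
    centered = y - np.mean(y)
    full = autocorr_direct(centered)      # lags -(n-1) .. (n-1), lag 0 at index n-1
    return (full[n - 1:n - 1 + k] / n).reshape(-1, 1)

Y = np.array([[1.0], [1.0], [2.0], [-1.0]])
gamma_hat = sample_acvf(Y, 3)
print(gamma_hat)
```

If the intended definition divides by n - h instead of n, only the final normalization line would change.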

Extract indices of sets of values greater than zero in an array

I have an array of length n. The array has braking energy values, and the index number represents time in seconds.
The structure of the array is as follows:
Index 1 to 140, array has zero values. (Vehicle not braking)
Index 141 to 200, array has random energy values. (Vehicle was braking and regenerating energy)
Index 201 to 325, array has zero values. (Vehicle not braking)
Index 326 to 405, array has random energy values. (Vehicle was braking and regenerating energy)
...and so on for an array of length n.
What I want is the starting and ending index of each set of energy values.
For example the above sequence gives this result:
141 - 200
326 - 405
...
Can someone please suggest a method or technique I can use to get this result?
Using diff is a quick way to do this.
Here is a demo (see the comments for details):
% Junk data for demo. Indices shown above for reference
%    1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17
x = [0, 0, 0, 2, 3, 4, 0, 0, 1, 1, 7, 9, 3, 4, 0, 0, 0];
% Logical converts all non-zero values to 1
% diff is x(2:end)-x(1:end-1), so picks up on changes to/from zeros
% Instead of 'logical', you could have a condition here,
% e.g. bChange = diff( x > 0.5 );
bChange = diff( logical( x ) );
% bChange is one of the following for each consecutive pair:
% 1 for [0 1] pairs
% 0 for [0 0] or [1 1] pairs
% -1 for [1 0] pairs
% We inflate startIdx by 1 to index the non-zero value
startIdx = find( bChange > 0 ) + 1; % Indices of [0 1] pairs
endIdx = find( bChange < 0 ); % Indices of [1 0] pairs
I'll leave it as an exercise to capture the edge cases where you add a start or end index if the array starts or ends with a non-zero value. Hint: you could handle each case separately or pad the initial x with additional end values.
Output of the above:
startIdx
>> [4, 9]
endIdx
>> [6, 14]
So you can format this however you like to get the spans 4-6, 9-14.
This task can be done with two methods; both work perfectly.
Wolfie's method:
bChange = diff( EnergyB > 0 );
startIdx = find( bChange > 0 ) + 1; % Indices of [0 1] pairs
endIdx = find( bChange < 0 ); % Indices of [1 0] pairs
Result:
startIdx =
141
370
608
843
endIdx =
212
426
642
912
Second Method:
startends = find(diff([0; EnergyB > 0; 0]));
startends = reshape(startends, 2, [])';
startends(:, 2) = startends(:, 2) - 1
Result:
startends =
141 212
370 426
608 642
843 912
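For anyone doing the same thing in NumPy rather than MATLAB, the padded-diff idea from the second method translates roughly as below. This is only a sketch and not part of the original answers; the function name nonzero_runs and its threshold parameter are made up for illustration. Padding with a zero on both sides means runs that touch the start or end of the array are handled without special cases.

```python
import numpy as np

def nonzero_runs(x, threshold=0.0):
    """Return an (m, 2) array of [start, stop) pairs for runs where x > threshold.

    Indices are 0-based and stop is exclusive, so x[start:stop] is the run.
    """
    active = np.asarray(x) > threshold
    # Pad with False on both sides so every run has a rising and a falling edge
    padded = np.concatenate(([False], active, [False])).astype(np.int8)
    edges = np.flatnonzero(np.diff(padded))   # rising, falling, rising, falling, ...
    return edges.reshape(-1, 2)

x = [0, 0, 0, 2, 3, 4, 0, 0, 1, 1, 7, 9, 3, 4, 0, 0, 0]
runs = nonzero_runs(x)
print(runs)

# MATLAB-style 1-based inclusive spans: add 1 to each start, keep each stop
matlab_style = runs + np.array([1, 0])
print(matlab_style)
```

For the demo vector, the MATLAB-style spans come out as 4-6 and 9-14, matching the result shown earlier in the thread.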

Counting values along an axis in a 3D array that are greater than threshold values from a 2D array

I have a 3D array of dimensions (200,200,3). These are images of dimensions (200,200) stacked using numpy.dstack. I would like to count the number of values along axis=2 that are greater than a corresponding 2D threshold array of dimensions (200,200). The output counts array should have dimensions (200,200). Here is my code so far.
import numpy as np
stacked_images=np.random.rand(200,200,3)
threshold=np.random.rand(200,200)
counts=(stacked_images<threshold).sum(axis=2)
I am getting the following error.
ValueError: operands could not be broadcast together with shapes (200,200,3) (200,200)
The code works if threshold is an integer/float value. For example.
threshold=0.3
counts=(stacked_images<threshold).sum(axis=2)
Is there a simple way to do this if threshold is a 2D array? I guess I am not understanding numpy broadcasting rules correctly.
NumPy compares arrays element by element, and broadcasting lines the shapes up starting from the last dimension. Here the last dimensions are 3 and 200, which do not match, hence the error. What you want is to compare every value along Z (axis=2) with the value of threshold at the same x, y position.
So just make sure threshold has the same shape as stacked_images, by building a 3D threshold using whatever method you prefer. Since you mentioned numpy.dstack:
import numpy as np
stacked_images = np.random.rand(10, 10, 3)
t = np.random.rand(10, 10)
threshold = np.dstack([t, t, t])
counts = (stacked_images < threshold).sum(axis=2)
print(counts)
which results in (the exact values differ from run to run, since the inputs are random):
[[2 0 3 3 1 3 1 0 1 2]
 [0 1 2 0 0 1 0 0 1 3]
 [2 1 3 0 3 2 1 3 1 3]
 [2 0 0 3 3 2 0 2 0 1]
 [1 3 0 0 0 3 0 2 1 2]
 [1 1 3 2 3 0 0 3 0 3]
 [3 1 0 1 2 0 3 0 0 0]
 [3 1 2 1 3 0 3 2 0 2]
 [3 1 1 2 0 0 1 0 1 0]
 [0 2 2 0 3 0 0 2 3 1]]
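An alternative that avoids copying the threshold three times is to give it a trailing axis of length 1, so that broadcasting stretches it along axis=2. This is a sketch of that idea, with a fixed seed chosen only to make it reproducible. Note that the comparison here is `>` to match the stated goal of counting values greater than the threshold, whereas the code in the question uses `<`.

```python
import numpy as np

rng = np.random.default_rng(0)            # seed chosen arbitrarily for the demo
stacked_images = rng.random((200, 200, 3))
threshold = rng.random((200, 200))

# (200, 200, 3) against (200, 200, 1): the length-1 axis is broadcast along axis=2
counts = (stacked_images > threshold[:, :, np.newaxis]).sum(axis=2)
print(counts.shape)
```

Writing threshold[..., None] is an equivalent spelling of the same indexing.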

How to find adjacency matrix given a set of links and edges in matlab

I have a vector of all edges, for example
A = [1;2;3;4];
I also have the matrix of all the links connecting these edges, represented by the edge numbers, for example
B = [1 3;3 1;1 2;1 2;2 3;4 3];
I would like to construct the adjacency matrix from this data. The matrix should not consider the ordering of the edges in the links. For example, the third link has edges 1 2, but the matrix should have entries in both 1,2 and 2,1.
So I need an output like this:
C = [0 1 1 0;1 0 1 0;1 1 0 1;0 0 1 0];
I cannot think of any way other than looping over the rows of B, finding the edges for each link, and then adding 1's to a pre-initialized 4x4 matrix at i,j, where i,j are the link's edges.
Is this efficient, given that my real size is many orders of magnitude greater than 4? Could someone help with a better way to construct the matrix?
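You can use sparse to build the matrix, and then optionally convert to full. Pass the size explicitly so that the result is square even when the largest index does not appear in both columns of B (here column 2 only goes up to 3, so without the size the result would be 4x3 and the symmetrization step would fail):
n = numel(A);
result = full(sparse(B(:,1), B(:,2), 1, n, n)); % accumulate values
result = result | result.'; % make symmetric with 0/1 values
Equivalently, you can use accumarray:
result = accumarray(B, 1, [n n]); % accumulate values
result = result | result.'; % make symmetric with 0/1 values
For A = [1;2;3;4]; B = [1 3;3 1;1 2;1 2;2 3;4 3], either of the above gives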
result =
4×4 logical array
0 1 1 0
1 0 1 0
1 1 0 1
0 0 1 0
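For comparison, the same construction in NumPy could look like the sketch below. It is not part of the original answer; the function name adjacency_from_links is hypothetical, and it assumes the labels in B are 1-based integers as in the question.

```python
import numpy as np

def adjacency_from_links(links, num_nodes):
    """Build a symmetric 0/1 adjacency matrix from an (m, 2) array of 1-based links."""
    links = np.asarray(links) - 1                  # 1-based labels -> 0-based indices
    adj = np.zeros((num_nodes, num_nodes), dtype=int)
    adj[links[:, 0], links[:, 1]] = 1              # duplicates simply set the same cell again
    adj[links[:, 1], links[:, 0]] = 1              # mirror to ignore link direction
    return adj

B = np.array([[1, 3], [3, 1], [1, 2], [1, 2], [2, 3], [4, 3]])
C = adjacency_from_links(B, 4)
print(C)
```

Because plain assignment is used instead of accumulation, repeated links need no separate step to bring the values back to 0/1.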

how to use matlab logical functions with arrays

I am using MATLAB.
I want to compare a zero array A with some other arrays (e.g. [1 1 0 0]).
I wrote the following code:
A=[0 0 0 0];
if (A~=[1 1 0 0] & A~=[1 0 1 0] & A~=[1 1 0 1])
    x=1;
else
    x=0;
end
I expected to see x=1, but the answer I get is x=0.
What am I doing wrong?
~= and & are element-wise operators, so the expression
A~=[1 1 0 0] & A~=[1 0 1 0] & A~=[1 1 0 1]
where A = [0 0 0 0] produces the vector output:
[1 0 0 0]
An if statement evaluated on a vector does an implicit all, which in this case evaluates to false because not every element is nonzero.
It's not exactly clear what you want, but if you want to make sure the vector A is not equal to any of [1 1 0 0], [1 0 1 0] or [1 1 0 1] then you need to do this:
x = ~isequal(A, [1 1 0 0]) && ~isequal(A, [1 0 1 0]) && ~isequal(A, [1 1 0 1])
MATLAB's equality operators compare arrays element-wise and return true/false (logical 1/0) for each element. So when you have A = [1 1 0 0], B = [1 0 1 0] and you check A == B, you don't get 'false'; instead you get [1 0 0 1].
If you want to check whether the whole vectors A and B are equal, you need to check whether the condition
all(A==B)
is true.
I think you are looking for an exact match, so you could use this:
%%// x is 1 only if A differs from *EVERY* candidate vector in at least
%%// one element. If A exactly matches any of them, x is 0. Note that any
%%// is applied to each comparison separately; combining the comparisons
%%// with & first would give 0 for vectors such as [1 1 1 1].
if any(A~=[1 1 0 0]) && any(A~=[1 0 1 0]) && any(A~=[1 1 0 1])
    x=1;
else
    x=0;
end
Another way to put it would be:
%%// If *ALL* elements of A match one of the candidate vectors, give x as 0,
%%// otherwise give x as 1.
if all(A==[1 1 0 0]) || all(A==[1 0 1 0]) || all(A==[1 1 0 1])
    x=0;
else
    x=1;
end
See the MATLAB documentation for any and all, which are used here.
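The same distinction between element-wise and whole-array comparison exists in NumPy, with one difference worth knowing: using a multi-element array directly as an `if` condition raises a ValueError instead of applying an implicit all. A minimal sketch of the "A matches none of the candidates" test, with variable names chosen only for illustration:

```python
import numpy as np

A = np.array([0, 0, 0, 0])
candidates = [np.array([1, 1, 0, 0]),
              np.array([1, 0, 1, 0]),
              np.array([1, 1, 0, 1])]

# Element-wise comparison gives one boolean per position, not a single answer
elementwise = A != candidates[0]
print(elementwise)

# Whole-array comparison: does A exactly match any candidate?
matches_any = any(np.array_equal(A, c) for c in candidates)
x = 0 if matches_any else 1
print(x)
```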
