Pairwise Cohen's Kappa of rows in DataFrame in Pandas (python) - arrays

I'd greatly appreciate some help on this. I'm using jupyter notebook.
I have a dataframe for which I want to calculate the inter-rater reliability. I want to compare the rows pairwise by the value of the ID column (every ID occurs exactly twice, once per coder). Each ID represents a different article, so I do not want to compare all rows together; rather, I want the inter-rater reliability of each pair, and then the average over pairs (and potentially also per column).
N  ID       A    B    Nums
0  8818313  Yes  Yes  1.0 1.0 1.0 1.0 1.0 1.0
1  8818313  Yes  No   0.0 1.0 0.0 0.0 1.0 1.0
2  8820105  No   Yes  0.0 1.0 1.0 1.0 1.0 1.0
3  8820106  No   No   0.0 0.0 0.0 1.0 0.0 0.0
I've been able to find some instructions for Cohen's kappa, but not for how to compute it pairwise by value in the ID column.
Does anyone know how to go about this?

Here is how I will approach it:
import pandas as pd
from io import StringIO
from sklearn.metrics import cohen_kappa_score

df = pd.read_csv(StringIO("""
N,ID,A,B,Nums
0, 8818313, Yes, Yes,1.0 1.0 1.0 1.0 1.0 1.0
1, 8818313, Yes, No,0.0 1.0 0.0 0.0 1.0 1.0
2, 8820105, No, Yes,0.0 1.0 1.0 1.0 1.0 1.0
3, 8820105, No, No,0.0 0.0 0.0 1.0 0.0 0.0 """))

def kappa(df):
    # Each group holds the two coders' rows for one article ID.
    nums1 = [float(num) for num in df.Nums.iloc[0].split(' ') if num]
    nums2 = [float(num) for num in df.Nums.iloc[1].split(' ') if num]
    return cohen_kappa_score(nums1, nums2)

df.groupby('ID').apply(kappa)
This will generate:
ID
8818313 0.000000
8820105 0.076923
dtype: float64
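If you then want a single overall reliability figure across articles, you can average the per-ID values, for example by taking the mean of the Series above (a minimal follow-up using only the kappa function already defined):
df.groupby('ID').apply(kappa).mean()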

Related

Is it possible to plot a series of matrices in this format using gnuplot

My code currently creates output that looks like this (using example numbers):
0.0 0.0 0.0
0.0 1.0 0.0
0.0 0.0 0.0
0.0 0.0 2.0
0.0 0.0 0.0
0.0 0.0 2.0
0.0 0.0 0.0
1.0 0.0 0.0
0.0 0.0 0.0
0.0 3.0 0.0
0.0 0.0 0.0
0.0 0.0 1.0
3.0 0.0 0.0
0.0 0.0 0.0
1.0 0.0 2.0
0.0 0.0 0.0
3.0 0.0 0.0
0.0 0.0 0.0
1.0 0.0 2.0
0.0 0.0 0.0
3.0 0.0 0.0
0.0 0.0 0.0
1.0 0.0 2.0
0.0 0.0 0.0
3.0 0.0 0.0
0.0 0.0 0.0
1.0 0.0 2.0
0.0 0.0 0.0
I was hoping for a solution on how to plot this data either as a 3D splot or as a gif that cycles through each matrix (the actual code contains a few hundred matrices). I'm able to alter the output format if necessary. So far I've tried:
do for [i=1:7] {
    plot "data.txt" matrix with image
}
as well as other solutions I've found on the site, but none seem to attempt the same thing I'm doing.
If anyone with gnuplot experience could help, that would be a huge help (I'm using a Mac, if that makes a difference).
Welcome to Stack Overflow! I assume your matrices are separated by two empty lines.
If that is the case, you can address the matrices via index (check help index).
You can find out how many blocks you have with stats (check help stats). Loop over these blocks with the output set to term gif animate (check help gif). Instead of plotting the datablock $Data, simply plot your file.
Script:
### plot matrices as animation
reset session
$Data <<EOD
0.0 0.0 0.0
0.0 1.0 0.0
0.0 0.0 0.0
0.0 0.0 2.0


0.0 0.0 0.0
0.0 0.0 2.0
0.0 0.0 0.0
1.0 0.0 0.0


0.0 0.0 0.0
0.0 3.0 0.0
0.0 0.0 0.0
0.0 0.0 1.0


3.0 0.0 0.0
0.0 0.0 0.0
1.0 0.0 2.0
0.0 0.0 0.0


3.0 0.0 0.0
0.0 0.0 0.0
1.0 0.0 2.0
0.0 0.0 0.0


3.0 0.0 0.0
0.0 0.0 0.0
1.0 0.0 2.0
0.0 0.0 0.0
EOD
stats $Data u 0 nooutput # get the number of blocks
N = STATS_blocks
set term gif size 600,400 animate delay 30
set output "SO72250259.gif"
set size ratio -1
set cbrange [0:3]
set xrange [-0.5:2.5]
set yrange [-0.5:3.5]
do for [i=0:N-1] {
    plot $Data index i matrix w image
}
set output
### end of script
Result:

convert Array to indicator matrix

Given the y Array, is there a cleaner or more idiomatic way to create a 2D Array such as Y?
y = [1.0 2.0 3.0 4.0 1.0 2.0]'
Y = ifelse(y .== 1, 1.0, 0.0)
for j in 2:length(unique(y))
    Y = hcat(Y, ifelse(y .== j, 1.0, 0.0))
end
julia> Y
6x4 Array{Float64,2}:
1.0 0.0 0.0 0.0
0.0 1.0 0.0 0.0
0.0 0.0 1.0 0.0
0.0 0.0 0.0 1.0
1.0 0.0 0.0 0.0
0.0 1.0 0.0 0.0
One alternative approach is to use broadcast:
julia> broadcast(.==, y, (1:4)')
6x4 Array{Float64,2}:
1.0 0.0 0.0 0.0
0.0 1.0 0.0 0.0
0.0 0.0 1.0 0.0
0.0 0.0 0.0 1.0
1.0 0.0 0.0 0.0
0.0 1.0 0.0 0.0
(.== broadcasts automatically, so if you just wanted a BitArray you could write y .== (1:4)'.)
This avoids the explicit for loop and also the use of hcat to build the array. However, depending on the size of the array you're looking to create, it might be most efficient to allocate an array of zeros of the appropriate shape and then use indexing to add the ones to the appropriate column on each row.
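For comparison only, the same preallocate-then-index idea looks like this in Python/NumPy (an illustrative sketch, not Julia code from this thread):
import numpy as np

y = np.array([1, 2, 3, 4, 1, 2])   # integer labels
Y = np.zeros((y.size, y.max()))    # allocate the zero matrix up front
Y[np.arange(y.size), y - 1] = 1.0  # set one indicator entry per row; no hcat-style copying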
Array comprehension is an idiomatic and fast way to create matrices in Julia. For the example in the question:
y = convert(Vector{Int64},vec(y)) # make sure indices are integer
Y = [j==y[i] ? 1.0 : 0.0 for i=1:length(y),j=1:length(unique(y))]
Note that length(unique(y)) only gives the right number of columns when the labels happen to be exactly 1, 2, …, k; what was probably intended was:
Y = [j==y[i] ? 1.0 : 0.0 for i=1:length(y),j=1:maximum(y)]
In both cases Y is:
6x4 Array{Float64,2}:
1.0 0.0 0.0 0.0
0.0 1.0 0.0 0.0
0.0 0.0 1.0 0.0
0.0 0.0 0.0 1.0
1.0 0.0 0.0 0.0
0.0 1.0 0.0 0.0
Another option is to use a sparse matrix. In numerical analysis, a sparse matrix is a matrix in which most of the elements are zero.
And from the Julia documentation:
sparse(I,J,V,[m,n,combine])
Create a sparse matrix S of dimensions m x n such that S[I[k], J[k]] =
V[k]. The combine function is used to combine duplicates. If m and n
are not specified, they are set to max(I) and max(J) respectively. If
the combine function is not supplied, duplicates are added by default.
y = [1, 2, 3, 4, 1, 2]
rows=length(y);
clms=4; # must be >= maximum(y)
s=sparse(1:rows,y,ones(rows),rows,clms);
full(s) # =>
6x4 Array{Float64,2}:
1.0 0.0 0.0 0.0
0.0 1.0 0.0 0.0
0.0 0.0 1.0 0.0
0.0 0.0 0.0 1.0
1.0 0.0 0.0 0.0
0.0 1.0 0.0 0.0

Multiple assignment in multidimensional array

I have a 4x4 array of zeros.
julia> X = zeros(4,4)
4x4 Array{Float64,2}:
0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0
I have an Nx2 array containing indices of elements in X that I want to assign a new value.
julia> ind = [1 1; 2 2; 3 3]
3x2 Array{Int64,2}:
1 1
2 2
3 3
What is the simplest way to assign a value to all elements in X whose indices are rows in ind? (something like X[ind] = 2.0).
julia> X
2.0 0.0 0.0 0.0
0.0 2.0 0.0 0.0
0.0 0.0 2.0 0.0
0.0 0.0 0.0 0.0
I'm not sure there is a non-looping way to do this. What's wrong with this?
for i=[1:size(ind)[1]]
    a, b = ind[i, :]
    X[a, b] = 2.0
end
user3467349's answer is correct, but inefficient, because it allocates an Array for the indices. Also, the notation [a:b] is deprecated as of Julia 0.4. Instead, you can use:
for i = 1:size(ind, 1)
    a, b = ind[i, :]
    X[a, b] = 2.0
end

How do we Initialize a Hopfield Neural Network?

I have just started reading about neural networks and I have a basic question. Regarding "initializing" the Hopfield network, I am unable to understand that notion of initialization. That is, do we input some random numbers, or do we input a well-defined pattern that makes the neurons settle the first time around, assuming all neurons start at state zero, with the other stable states being either 1 or -1 after the input?
Consider the neural network below, which I have taken from HeatonResearch.
I'd be glad if someone could clear this up for me.
When initialising neural networks, including recurrent Hopfield networks, it is common to initialise with random weights: in general this gives good learning times over multiple trials and, over an ensemble of runs, helps avoid local minima. It is usually not a good idea to start from the same weights on every run, as you will likely keep encountering the same local minima. With some configurations, learning can be sped up by analysing the role of each node in the functional mapping, but that is often a later step, after getting something working.
The purpose of a Hopfield network is to recall the data it has been shown, serving as content-addressable memory. It begins as a clean slate, with all weights set to zero. Training the network on a vector adjusts the weights to respond to it.
The output of a node in a Hopfield network depends on the state of every other node and the weight of the node's connection to it. States correspond to the input, with input 0 mapping to -1 and input 1 mapping to 1. So, if the network in your example had input 1010, N1 would have state 1, N2 -1, N3 1, and N4 -1.
Training the network means adding the outer product of the state vector with itself to the weight matrix and setting the diagonal to zero. So, to train on 1010, we would add the outer product of [1 -1 1 -1] with itself to the weight matrix and then zero its diagonal.
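To make that rule concrete, here is a minimal NumPy sketch of the same idea (my own illustration, not code from the network pictured above): encode the 0/1 inputs as ±1 states, sum the outer products of the stored patterns, zero the diagonal, and recall by repeatedly taking the sign of the weighted inputs.
import numpy as np

def to_states(bits):
    # Map input bits {0, 1} to Hopfield states {-1, +1}.
    return np.where(np.asarray(bits) == 1, 1, -1)

def train(patterns):
    # Hebbian rule: add the outer product of each stored pattern with itself,
    # then zero the diagonal (no self-connections).
    n = len(patterns[0])
    W = np.zeros((n, n))
    for p in patterns:
        s = to_states(p)
        W += np.outer(s, s)
    np.fill_diagonal(W, 0)
    return W

def recall(W, bits, steps=10):
    # Synchronous update: each node takes the sign of its weighted input.
    s = to_states(bits)
    for _ in range(steps):
        s = np.where(W @ s >= 0, 1, -1)
    return s

W = train([[1, 0, 1, 0]])          # train on the pattern 1010
print(W)                           # 4x4 weight matrix with zero diagonal
print(recall(W, [1, 0, 0, 0]))     # noisy input settles to the stored [1 -1 1 -1]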
You can check out this repository --> Hopfield Network
There you have an example of testing a pattern after training the network offline. This is the test:
@Test
public void HopfieldTest(){
    double[] p1 = new double[]{1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0};
    double[] p2 = new double[]{1.0, 1.0, 1.0, -1.0, 1.0, -1.0, -1.0, 1.0, -1.0};
    double[] p3 = new double[]{1.0, 1.0, -1.0, -1.0, 1.0, -1.0, -1.0, 1.0, -1.0};
    ArrayList<double[]> patterns = new ArrayList<>();
    patterns.add(p1);
    patterns.add(p2);
    Hopfield h = new Hopfield(9, new StepFunction());
    h.train(patterns); //train and load the Weight matrix
    double[] result = h.test(p3); //Test a pattern
    System.out.println("\nConnections of Network: " + h.connections() + "\n"); //show Neural connections
    System.out.println("Good recuperation capacity of samples: " + Hopfield.goodRecuperation(h.getWeights().length) + "\n");
    System.out.println("Perfect recuperation capacity of samples: " + Hopfield.perfectRacuperation(h.getWeights().length) + "\n");
    System.out.println("Energy: " + h.energy(result));
    System.out.println("Weight Matrix");
    Matrix.showMatrix(h.getWeights());
    System.out.println("\nPattern result of test");
    Matrix.showVector(result);
    h.showAuxVector();
}
And after running the test you can see:
Running HopfieldTest
Connections of Network: 72
Good recuperation capacity of samples: 1
Perfect recuperation capacity of samples: 1
Energy: -32.0
Weight Matrix
0.0 0.0 2.0 -2.0 2.0 -2.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 -2.0 2.0 -2.0
2.0 0.0 0.0 -2.0 2.0 -2.0 0.0 0.0 0.0
-2.0 0.0 -2.0 0.0 -2.0 2.0 0.0 0.0 0.0
2.0 0.0 2.0 -2.0 0.0 -2.0 0.0 0.0 0.0
-2.0 0.0 -2.0 2.0 -2.0 0.0 0.0 0.0 0.0
0.0 -2.0 0.0 0.0 0.0 0.0 0.0 -2.0 2.0
0.0 2.0 0.0 0.0 0.0 0.0 -2.0 0.0 -2.0
0.0 -2.0 0.0 0.0 0.0 0.0 2.0 -2.0 0.0
Pattern result of test
1.0 1.0 1.0 -1.0 1.0 -1.0 -1.0 1.0 -1.0
-------------------------
The auxiliar vector is empty
I hope you find it useful. Regards

Hopfield neural network

Do you know of any applications besides pattern recognition that would be worth implementing with a Hopfield neural network model?
Recurrent neural networks (of which Hopfield nets are a special type) are used for several tasks in sequence learning:
Sequence Prediction (Map a history of stock values to the expected value in the next timestep)
Sequence classification (Map each complete audio snippet to a speaker)
Sequence labelling (Map an audio snippet to the sentence spoken)
Non-Markovian reinforcement learning (e.g. tasks that require deep memory, such as the T-Maze benchmark)
I am not sure what you mean by "pattern recognition" exactly, since it is basically a whole field that covers almost every task neural networks are used for.
You can use a Hopfield network for optimization problems as well.
You can check out this repository --> Hopfield Network
There you have an example of testing a pattern after training the network offline; it is the same HopfieldTest (code and output) shown in the previous answer above.
I hope this can help you.
