Ludwig: when to use a SET feature vs a SEQUENCE feature, with example data for both

How do I determine when to use a Set feature vs a Sequence feature on a column, and what is the difference between them? Some examples would help.
I'm trying to use Ludwig to perform classification. My dataset looks something like the sample below.
The letters are just for illustration.
For example, Feature1 for the word "alpha" could stand for its trigrams: ^al lph pha ha$.
LABEL, Feature1, Feature2
X, A B C, D A E
X, B C K, K J L
Y, A D C, D A E
Y, B D E, J L R
name: Feature1_trigrams
type: set
level: words
encoder:
    representation: dense
    embedding_size: 10
    embeddings_on_cpu: false
    pretrained_embeddings: null
    embeddings_trainable: true
    dropout: false
    initializer: null
    regularize: true
    reduce_output: sqrt
    tied_weights: null
    cell_type: lstm
    bidirectional: true
    num_layers: 2
    reduce_output: null
preprocessing:
    format: space
Should I be using Sequence instead?
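If so, I assume the sequence version of the same feature would look roughly like the following (keeping the same layout as my set config above; I'm not sure of the exact parameter names for my Ludwig version):
name: Feature1_trigrams
type: sequence
encoder:
    cell_type: lstm
    bidirectional: true
    num_layers: 2
    reduce_output: null
preprocessing:
    format: space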

Related

What would be an idiomatic F# way to scale a list of (n-tuples or list) with another list, also arrays?

Given:
let weights = [0.5;0.4;0.3]
let X = [[2;3;4];[7;3;2];[5;3;6]]
what I want is: wX = [(0.5)*[2;3;4];(0.4)*[7;3;2];(0.3)*[5;3;6]]
I would like to know an elegant way to do this with lists as well as with arrays. Additional optimization information is also welcome.
Your title mentions tuples, but your code shows a list of lists. Working with the list-of-lists form, a solution would be
let weights = [0.5;0.4;0.3]
let X = [[2;3;4];[7;3;2];[5;3;6]]
X
|> List.map2 (fun w x ->
    x
    |> List.map (fun xi ->
        (float xi) * w
    )
) weights
Depending on how comfortable you are with the syntax, you may prefer a one-liner like
List.map2 (fun w x -> List.map (float >> (*) w) x) weights X
The same library functions exist for sequences (Seq.map2, Seq.map) and arrays (in the Array module).
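For example, a quick sketch of the array version (assuming the same data, just given as arrays):
let weightsArr = [| 0.5; 0.4; 0.3 |]
let XArr = [| [| 2; 3; 4 |]; [| 7; 3; 2 |]; [| 5; 3; 6 |] |]
// pair each weight with its row, then scale every element of that row
let wXArr = Array.map2 (fun w row -> Array.map (float >> (*) w) row) weightsArr XArr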
This is much more than an answer to the specific question, but after a chat in the comments I learned that the question is part of a neural network written in F#, so I am posting this: it covers the question and implements the feedforward part of a neural network. It makes use of MathNet Numerics.
This code is an F# translation of part of the Python code from Neural Networks and Deep Learning.
Python
def backprop(self, x, y):
    """Return a tuple ``(nabla_b, nabla_w)`` representing the
    gradient for the cost function C_x. ``nabla_b`` and
    ``nabla_w`` are layer-by-layer lists of numpy arrays, similar
    to ``self.biases`` and ``self.weights``."""
    nabla_b = [np.zeros(b.shape) for b in self.biases]
    nabla_w = [np.zeros(w.shape) for w in self.weights]
    # feedforward
    activation = x
    activations = [x]  # list to store all the activations, layer by layer
    zs = []  # list to store all the z vectors, layer by layer
    for b, w in zip(self.biases, self.weights):
        z = np.dot(w, activation) + b
        zs.append(z)
        activation = sigmoid(z)
        activations.append(activation)
F#
module NeuralNetwork1 =

    //# Third-party libraries
    open MathNet.Numerics.Distributions // Normal.Sample
    open MathNet.Numerics.LinearAlgebra // Matrix

    type Network(sizes : int array) =

        let mutable (_biases : Matrix<double> list) = []
        let mutable (_weights : Matrix<double> list) = []

        member __.Biases
            with get() = _biases
            and set value =
                _biases <- value

        member __.Weights
            with get() = _weights
            and set value =
                _weights <- value

        member __.Backprop (x : Matrix<double>) (y : Matrix<double>) =
            // Note: There is a separate member for feedforward. This one is only used within Backprop
            // Note: In the text layers are numbered from 1 to n with 1 being the input and n being the output
            //       In the code layers are numbered from 0 to n-1 with 0 being the input and n-1 being the output
            // Layers
            //   1     2     3    Text
            //   0     1     2    Code
            //  784 -> 30 -> 10
            let feedforward () : (Matrix<double> list * Matrix<double> list) =
                let (bw : (Matrix<double> * Matrix<double>) list) = List.zip __.Biases __.Weights
                let rec feedfowardInner layer activation zs activations =
                    match layer with
                    | x when x < (__.NumLayers - 1) ->
                        let (bias, weight) = bw.[layer]
                        let z = weight * activation + bias
                        let activation = __.Sigmoid z
                        feedfowardInner (layer + 1) activation (z :: zs) (activation :: activations)
                    | _ ->
                        // Normally with recursive functions that build lists for returning,
                        // the final list(s) would be reversed before returning.
                        // However, since the returned lists will be accessed in reverse order
                        // for the backpropagation step, we leave them in the reverse order.
                        (zs, activations)
                feedfowardInner 0 x [] [x]
In weight * activation, * is an overloaded operator operating on Matrix<double>.
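The snippet also refers to two members that are not shown, NumLayers and Sigmoid. A minimal sketch of what they could look like (my assumption, using the standard logistic sigmoid from the book and the Matrix.map helper from the MathNet.Numerics.FSharp package; they would sit inside the Network type next to Backprop):
member __.NumLayers = sizes.Length
// element-wise logistic sigmoid over a matrix
member __.Sigmoid (z : Matrix<double>) =
    z |> Matrix.map (fun v -> 1.0 / (1.0 + exp (-v)))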
Relating back to your example data and using MathNet Numerics arithmetic:
let weights = [0.5;0.4;0.3]
let X = [[2;3;4];[7;3;2];[5;3;6]]
First the values for X need to be converted to float, and both weights and X need to be wrapped in MathNet types (the vector and matrix builders come with the MathNet.Numerics.FSharp package):
let weights = vector [0.5; 0.4; 0.3]
let x1 = matrix [[2.0; 3.0; 4.0]; [7.0; 3.0; 2.0]; [5.0; 3.0; 6.0]]
Now notice that x1 is a Matrix<double> and weights is a Vector<double>,
so we can just multiply
let wx1 = weights * x1
Since the way I validated the code was more thorough than usual, I will explain it so that you don't have doubts about its validity.
When working with Neural Networks and in particular mini-batches, the starting numbers for the weights and biases are random and the generation of the mini-batches is also done randomly.
I know the original Python code was valid and I was able to run it successfully and get the same results as indicated in the book, meaning that the initial successes were within a couple of percent of the book and the graphs of the success were the same. I did this for several runs and several configurations of the neural network as discussed in the book. Then I ran the F# code and achieved the same graphs.
I also copied the starting random number sets from the Python code into the F# code so that while the data generated was random, both the Python and F# code used the same starting numbers, of which there are thousands. I then single stepped both the Python and F# code to verify that each individual function was returning a comparable float value, e.g. I put a break point on each line and made sure I checked each one. This actually took a few days because I had to write export and import code and massage the data from Python to F#.
See: How to determine type of nested data structures in Python?
I also tried a variation where I replaced the F# list with a linked list, e.g. LinkedList<Matrix<double>>, but found no increase in speed. It was an interesting exercise.
If I understand correctly, and assuming X is given as a list of 3-tuples (e.g. [(2,3,4); (7,3,2); (5,3,6)]), each weight needs to be paired with its own row, so List.map2 does the job:
let wX =
    List.map2 (fun w (a, b, c) ->
        (w * float a, w * float b, w * float c)) weights X
An alternative way to achieve this is with Math.Net's matrix arithmetic: https://numerics.mathdotnet.com/Matrix.html#Arithmetics
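For example (just a sketch, assuming the MathNet.Numerics.FSharp package so the matrix and vector builders are in scope, and that the builder's DenseOfDiagonalVector method is available), multiplying by a diagonal matrix built from the weights scales row i of X by weights.[i]:
open MathNet.Numerics.LinearAlgebra

let weights = vector [ 0.5; 0.4; 0.3 ]
let X = matrix [ [ 2.0; 3.0; 4.0 ]; [ 7.0; 3.0; 2.0 ]; [ 5.0; 3.0; 6.0 ] ]

// diag(weights) * X multiplies each row of X by the matching weight
let wX = Matrix<double>.Build.DenseOfDiagonalVector(weights) * X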

How to obtain elements of an array close to another array in MATLAB?

There must be an easier way to do this; an optimization method is also welcome. I have an array 'Y' and several parameters that have to be adjusted so that Y nears zero (= 'X'), as given in the MWE below. Is there a much better procedure to minimize this difference? This is just an example equation; there can be up to 6 coefficients to optimize.
x = zeros(10,1)
y = rand(10,1)
for a = 1:0.1:4
    for b = 2:0.1:5
        for c = 3:0.1:6
            z = (a * y .^ 3 + b * y + c) - x
            if -1 <= range(z) <= 1
                a, b, c
                break
            end
        end
    end
end
I believe
p = polyfit(y,x,2);
is what you are looking for, where p will be an array of your [a, b, c] coefficients.
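As a quick sanity check (hypothetical, reusing the x and y from the question), the fitted coefficients can be compared against the target the same way the loop does:
p = polyfit(y, x, 2)        % least-squares fit of x as a quadratic in y
z = polyval(p, y) - x       % residuals of the fitted values against x
range(z)                    % a small range(z) means the fit is close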

Find minimal candidate keys

I have 2 sets:
R = {A B C D}
H = {AB -> C, AB -> D, D -> B}
I want to find all minimal keys of R.
My answer for the minimal key is: {A D}
This is because:
AB -> C and AB -> D, therefore AB -> CD,
and since D -> B, AD is a minimal key.
When I check my answer with this site, the site gives a different answer.
Can someone explain?
The site says, "Set of found candidate-keys: {{A, C, F}, {B, C, F}}." That's clearly wrong; F isn't even in R.
In any case, your answer is incomplete. AD is one of two candidate keys.
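To spell the check out: the closure of AD starts as {A, D}; D -> B adds B, and then AB -> C and AB -> D add C, so AD+ = {A, B, C, D} = R. Neither A+ = {A} nor D+ = {D, B} covers R on its own, so AD is indeed a minimal (candidate) key; the same style of closure computation reveals the second one.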

R: Aggregate on Group 1 and NOT Group 2

I am trying to create two data sets. The first summarizes the data by 2 groups, which I have done using the following code:
x = rnorm(1:100)
g1 = sample(LETTERS[1:3], 100, replace = TRUE)
g2 = sample(LETTERS[24:26], 100, replace = TRUE)
aggregate(x, list(g1, g2), mean)
The second needs to summarize the data by the first group and NOT the second group.
If we consider the possible pairs from the previous example:
A - X B - X C - X
A - Y B - Y C - Y
A - Z B - Z C - Z
The second data set should summarize the data as the average of the outgroup:
A - not X
A - not Y
A - not Z etc.
Is there a way to manipulate aggregate functions in R to achieve this?
Alternatively, I thought a dummy variable could represent the data in this way, although I am unsure how it would look.
I have found this answer here:
R using aggregate to find a function (mean) for "all other"
I think this indicates that a dummy variable for each pairing is necessary. However, if anyone can offer a better or more efficient way, that would be appreciated, as there are many pairings in the true data set.
Thanks in advance.
First let us generate the data reproducibly (using set.seed):
# same as question but added set.seed for reproducibility
set.seed(123)
x = rnorm(1:100)
g1 = sample(LETTERS[1:3], 100, replace = TRUE)
g2 = sample(LETTERS[24:26], 100, replace = TRUE)
Now we have two solutions both of which use aggregate:
1) ave
# x equals the sums over the groups and n equals the counts
ag = cbind(aggregate(x, list(g1, g2), sum),
           n = aggregate(x, list(g1, g2), length)[, 3])
# ave.not(v, g): for each element, the sum of v over its g group excluding that element
ave.not <- function(x, g) ave(x, g, FUN = sum) - x
transform(ag,
          x = NULL,  # don't need x any more
          n = NULL,  # don't need n any more
          mean = x/n,
          mean.not = ave.not(x, Group.1) / ave.not(n, Group.1)
)
This gives:
Group.1 Group.2 mean mean.not
1 A X 0.3155084 -0.091898832
2 B X -0.1789730 0.332544353
3 C X 0.1976471 0.014282465
4 A Y -0.3644116 0.236706489
5 B Y 0.2452157 0.099240545
6 C Y -0.1630036 0.179833987
7 A Z 0.1579046 -0.009670734
8 B Z 0.4392794 0.033121335
9 C Z 0.1620209 0.033714943
To double check the first value under mean and under mean.not:
> mean(x[g1 == "A" & g2 == "X"])
[1] 0.3155084
> mean(x[g1 == "A" & g2 != "X"])
[1] -0.09189883
2) sapply
Here is a second approach which gives the same answer:
ag <- aggregate(list(mean = x), list(g1, g2), mean)
f <- function(i) mean(x[g1 == ag$Group.1[i] & g2 != ag$Group.2[i]])
ag$mean.not = sapply(1:nrow(ag), f)
ag
REVISED: Based on comments by the poster, I added a second approach and also made some minor improvements.

C variable assignment and R equivalent

Hi, I am trying to understand the following variable assignment in C, and to rewrite it in R. I use R often but have only really glanced at C.
    int age, int b_AF, int b_ra, int b_renal, int b_treatedhyp, int b_type2, double bmi, int ethrisk, int fh_cvd, double rati, double sbp, int smoke_cat, int surv, double town
)
{
    double survivor[3] = {
        0,
        0.996994316577911,
        0.993941843509674
    };

    a = /*pre assigned*/
    double score = 100.0 * (1 - pow(survivor[surv], exp(a)) );
    return(score);
}
How does survivor[surv] work in this context? An explanation would be helpful, and any input on how to do the assignment in R would be a bonus.
Thanks very much!
This is an aggregate initializer:
double survivor[3] = {
0,
0.996994316577911,
0.993941843509674
};
and is equivalent to:
double survivor[3];
survivor[0] = 0;
survivor[1] = 0.996994316577911;
survivor[2] = 0.993941843509674;
and survivor[surv] is the value stored at index surv of the survivor array. Array indexes run from 0 to N - 1, so if surv were 1 then survivor[surv] would have the value 0.996994316577911.
Note, the function as currently written does not check that surv is a valid index for the array survivor (i.e. surv > -1 and surv < 3) and runs the risk of undefined behaviour.
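For instance, a minimal guard could look like this (a sketch; how the error is reported is up to the surrounding code):
if (surv < 0 || surv > 2)    /* valid indices are 0, 1 and 2 */
    return -1.0;             /* hypothetical error value */
double score = 100.0 * (1 - pow(survivor[surv], exp(a)));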
Given the answer from @hmjd, the R equivalent would be
survivor <- c(0, 0.996994316577911, 0.993941843509674)
or if survivor already exists and you wish to assign into the first 3 elements:
survivor[1:3] <- c(0, 0.996994316577911, 0.993941843509674)
(Note R's indices are 1-based unlike C's 0-based ones.)
As for the extraction, the general idea is the same as with C, but the details matter:
R> survivor[0] ## 0 index returns an empty vector
numeric(0)
R> survivor[-1] ## negative index **drops** that element
[1] 0.9969943 0.9939418
R> survivor[10] ## positive outside length of vector returns NA
[1] NA
R> surv <- 2
R> survivor[surv] ## same holds for whatever surv contains
[1] 0.9969943
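Putting the pieces together, a rough R translation of the score line could be (a is just a placeholder here, since it is pre-assigned elsewhere in the C code):
survivor <- c(0, 0.996994316577911, 0.993941843509674)
surv <- 1                                  # the C index (0-based)
a <- 0.0                                   # placeholder: 'a' is pre-assigned in the C code
score <- 100 * (1 - survivor[surv + 1]^exp(a))  # +1 converts to R's 1-based indexing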
