TensorFlow seq2seq - confidence of reply - artificial-intelligence

I want to know if there is a way in TensorFlow's seq2seq framework to tell whether a reply to an input can be given with x% confidence.
An example below:
I have hi as the reply to hello, and it works fine. I also have a bunch of other trained sentences. However, let's say I enter some junk like this - sdjshj sdjk oiqwe qw. Seq2seq still tries to give a response. I understand it is designed that way, but I want to know if there is a way for the framework to say that it cannot answer this with confidence, or that none of these words were trained.
This would be of great help.

Use the logistic (sigmoid) function on the output logits. The logit function is basically the inverse of the sigmoid function:
Logit function: logit(p) = log(p / (1 - p))
Sigmoid function: sigmoid(x) = 1 / (1 + e^(-x))
You can see that one is the inverse of the other. TensorFlow has a built-in sigmoid op, but I find the program is faster when you just code the sigmoid function yourself.
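For instance, a minimal NumPy stand-in (the logits array and the averaging at the end are illustrative assumptions, not part of the seq2seq API):

    import numpy as np

    def sigmoid(x):
        # Logistic function: maps any real-valued logit into (0, 1).
        return 1.0 / (1.0 + np.exp(-x))

    # Hypothetical decoder logits, one score per generated token of the reply.
    logits = np.array([3.2, 0.4, -1.5])
    token_conf = sigmoid(logits)              # per-token confidence in (0, 1)
    print(token_conf, token_conf.mean())      # e.g. average as a crude reply-level score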
If you use the sigmoid function, you will get a value from 0 to 1, which is the confidence you are looking for. More information can be found here:
https://en.wikipedia.org/wiki/Sigmoid_function
https://en.wikipedia.org/wiki/Logit

I think the average perplexity returned by seq2seq_model.model.step is the confidence: the smaller, the better. But it could be hard to tell a proper threshold.
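As a rough sketch of thresholding on perplexity (the loss value and the cutoff below are made up; in the legacy translate tutorial the average cross-entropy comes back from model.step and perplexity is exp(loss)):

    import math

    step_loss = 4.2                      # hypothetical average cross-entropy from model.step(...)
    perplexity = math.exp(step_loss)     # ~66.7; lower means the model is less "surprised"

    PPL_THRESHOLD = 50.0                 # arbitrary cutoff; has to be tuned on held-out data
    if perplexity > PPL_THRESHOLD:
        print("low confidence - refusing to answer")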

Related

What's the difference between softmax, logistic and SVM for classification?

I'm using caffe to do object detection with an SSD model, and in recent work I adjusted the loss type of "MultiBoxLoss".
In the multibox_loss_layer.cpp file, the loss has SOFTMAX as the default and a LOGISTIC option. I added a hinge loss (SVM) option into the caffe code and ran the training, but the result was bad.
Now my boss wants me to use an SVM to classify the feature map with Python's sklearn.
And a question came to me: in the multibox_loss_layer.cpp file, softmax, logistic and hinge loss can all be used to calculate the loss. At that step the data is just one-dimensional, but the feature map is high-dimensional, and from articles I found online it seems softmax can't classify high-dimensional data.
Ex: if there are three classes - cat, dog and rabbit - then the one-dimensional data has just three values to represent cat, dog and rabbit (one value for each class), but high-dimensional data has many values (like a feature map) for each class, and in the high-dimensional case softmax seems not to work.
So I wonder what the difference between softmax, logistic and SVM is. Can anybody help? Thank you!
I have never seen an SVM loss function applied to a NN. However, softmax is a loss function that should be used to optimize a multiclass classification problem: it "transforms" the NN outputs into a probability of each class occurring. The logistic function usually optimizes each neuron's output as a separate logistic problem, so it does not force the output to be only one class; you should use it if you want to solve a multi-label problem.
SVM is not a loss function, it is a different classifier. There is no sense in comparing softmax with SVM, because the first is a loss function and the second is a classifier.
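To make the softmax/logistic distinction concrete, here is a small NumPy sketch (the three scores are made up):

    import numpy as np

    logits = np.array([2.0, 1.0, 0.1])   # toy scores for cat, dog, rabbit

    # Softmax: one probability distribution over mutually exclusive classes.
    softmax = np.exp(logits) / np.exp(logits).sum()

    # Logistic (sigmoid): an independent probability per class (multi-label case).
    logistic = 1.0 / (1.0 + np.exp(-logits))

    print(softmax, softmax.sum())   # [0.659 0.242 0.099], sums to 1.0
    print(logistic)                 # [0.881 0.731 0.525], no sum constraint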

no update of parameters using multiroot solver

I am currently trying to write code to solve a nonlinear system of equations. I am using functions from the GSL library, more specifically the multiroot_fdf_solver.
My problem is that it currently doesn't want to converge. More specifically, I see the following behavior:
- If my initial conditions are close to the result, gsl_multiroot_fdf_solver_iterate does not update the parameters at all. I tried to display the results at the different steps, and for all the parameters I get dx = NaN (which I find quite strange); the status of gsl_multiroot_fdf_solver_iterate is "success" and the status of gsl_multiroot_test_residual is "the iteration has not converged yet".
- The parameters are only updated if my initial conditions are really far from the expected result. Obviously, in that case it does not converge to the right values.
I have already checked the expressions of my function and my Jacobian multiple times, and they seem right.
I should add that my Jacobian (and my system as well) are quite complicated expressions with many trigonometric functions.
Would you have any idea of what the problem could be? Is it possible that the solver has trouble when the expression of the Jacobian is too complicated?
Thank you in advance for your answers; I am really stuck at this point.
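One quick way to rule out a Jacobian bug (a frequent cause of dx = NaN) is to compare the analytic Jacobian against a finite-difference estimate at the initial guess. A generic sketch in Python - f and jac stand in for your system, and the same check is straightforward to replicate in C against your GSL callbacks:

    import numpy as np

    def check_jacobian(f, jac, x, eps=1e-6, tol=1e-4):
        # Compare the analytic Jacobian with forward differences at point x.
        x = np.asarray(x, dtype=float)
        J = np.asarray(jac(x))
        fx = np.asarray(f(x))
        J_fd = np.empty_like(J)
        for j in range(x.size):
            xp = x.copy()
            xp[j] += eps
            J_fd[:, j] = (np.asarray(f(xp)) - fx) / eps
        err = np.max(np.abs(J - J_fd))
        print("max |J_analytic - J_fd| =", err)
        return err < tol

    # Toy system: f(x, y) = (x^2 + y - 1, x - y^2)
    f = lambda v: np.array([v[0]**2 + v[1] - 1.0, v[0] - v[1]**2])
    jac = lambda v: np.array([[2.0 * v[0], 1.0], [1.0, -2.0 * v[1]]])
    print(check_jacobian(f, jac, [0.5, 0.5]))   # True if the Jacobian is consistent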

mvnpdf vs regular normal (Gaussian) PDF - matlab / C

I'm performing Gaussian mixture model classification, and based on that, used the "mvnpdf" function in MATLAB.
As far as I know, the function returns the multivariate probability density for the data points or elements passed to it.
However, I'm trying to recreate it in C, and I assumed that mvnpdf is the regular Gaussian distribution (clearly it is not), because the results don't match.
Does anyone know how "mvnpdf" works? I haven't been able to find documentation on it.
The documentation for mvnpdf is here: https://www.mathworks.com/help/stats/mvnpdf.html
If you are looking for the exact code, just put a breakpoint where you call it and step into it to see how it works.
Okay, I actually found a decent link that explains in detail what's happening inside.
This might be a better link to look at - http://octave.sourceforge.net/statistics/function/mvnpdf.html
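For reference, mvnpdf evaluates the standard multivariate normal density, f(x) = exp(-(x - mu)' * inv(Sigma) * (x - mu) / 2) / sqrt((2*pi)^k * det(Sigma)), rather than a product of univariate Gaussians. A NumPy sketch that is easy to port to C (a real C port would factor Sigma once, e.g. via Cholesky, instead of inverting it per point):

    import numpy as np

    def mvnpdf(x, mu, sigma):
        # Multivariate normal density at a single k-dimensional point x.
        x = np.atleast_1d(x).astype(float)
        mu = np.atleast_1d(mu).astype(float)
        k = x.size
        d = x - mu
        inv = np.linalg.inv(sigma)
        norm = np.sqrt((2.0 * np.pi) ** k * np.linalg.det(sigma))
        return np.exp(-0.5 * d @ inv @ d) / norm

    # Sanity check against the 1-D Gaussian: both lines print ~0.24197.
    print(mvnpdf([1.0], [0.0], np.array([[1.0]])))
    print(np.exp(-0.5) / np.sqrt(2.0 * np.pi))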

Maximizing value using shooting method

I have a nonlinear equation that takes an initial y'(a) and outputs a value y(b), such that y(b) = f(y'(a)), where f(x) is some function. The idea is that I'd like to be able to maximize y(b).
Typically, if I had a target value for y(b), I could use the shooting or secant method. However, I don't have that value. I was thinking I could use a loop to search for the maximum, but that is very inefficient. Is there anything better I could use?
*Edit: Also I do not have an explicit expression for f(x).
Thanks,
Mike
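Since f is only available through the ODE integration, one common approach is to treat the shot as a black-box function of the initial slope and hand it to a derivative-free 1-D optimizer (golden section / Brent) instead of looping over values. A sketch with SciPy, where the toy ODE and the bracket on y'(a) are placeholder assumptions:

    import numpy as np
    from scipy.integrate import solve_ivp
    from scipy.optimize import minimize_scalar

    a, b = 0.0, 1.0

    def shoot(s):
        # Integrate from a to b with y(a) = 0, y'(a) = s; return y(b) = f(s).
        rhs = lambda t, u: [u[1], -u[0] - u[0]**3]   # stand-in for the real ODE
        sol = solve_ivp(rhs, (a, b), [0.0, s], rtol=1e-8)
        return sol.y[0, -1]

    # Maximize y(b) over the initial slope; a Brent-style bounded search needs no
    # derivatives and converges far faster than scanning s on a grid.
    res = minimize_scalar(lambda s: -shoot(s), bounds=(-10.0, 10.0), method="bounded")
    print("best y'(a) =", res.x, "max y(b) =", -res.fun)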

How does the bwarea function of matlab work and how can I implement it in C?

I want to understand how MATLAB's bwarea function works, and I also want to implement this function in C. Any idea about how to implement it would be very helpful.
Also, is there a substitute for bwarea in OpenCV?
Thanks.
The manual suggests examining the pixels in 2-by-2 neighbourhoods.
Each 2-by-2 neighbourhood falls into one of six patterns, and each pattern contributes its own subarea: zero on pixels counts 0, one on pixel counts 1/4, two adjacent on pixels count 1/2, two diagonal on pixels count 3/4, three on pixels count 7/8, and four on pixels count 1. Examining each and every 2-by-2 neighbourhood and summing these weights is how bwarea computes the total.
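A sketch of that weighting in Python/NumPy, padding the image so border pixels get full 2-by-2 coverage - a C port is a double loop over 2-by-2 blocks with the same six weights (often as a 16-entry lookup table indexed by the block's bit pattern):

    import numpy as np

    def bwarea_like(bw):
        # Sum pattern weights over every 2x2 neighbourhood of the padded image.
        bw = np.pad(np.asarray(bw, dtype=bool), 1)
        tl, tr = bw[:-1, :-1], bw[:-1, 1:]
        bl, br = bw[1:, :-1], bw[1:, 1:]
        count = tl.astype(int) + tr + bl + br
        diag = (tl & br & ~tr & ~bl) | (tr & bl & ~tl & ~br)
        area = np.zeros(count.shape)
        area[count == 1] = 1.0 / 4.0
        area[count == 2] = 1.0 / 2.0
        area[(count == 2) & diag] = 3.0 / 4.0
        area[count == 3] = 7.0 / 8.0
        area[count == 4] = 1.0
        return area.sum()

    # A filled 5x5 square of on pixels has estimated area 25, as expected.
    print(bwarea_like(np.ones((5, 5))))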
