Dlib Shape Predictor - face-detection

The default dlib shape predictor (which predicts 68 landmark points on a face) is the model "shape_predictor_68_face_landmarks.dat.bz2", which was trained on a relatively small dataset.
Has anyone trained the model on a larger dataset and made it publicly available?
TIA.

Nothing off-the-shelf. Also keep in mind that the 68-point dataset has a non-commercial license, so you might have to create your own model anyway.
The Helen dataset might be a good alternative. Somebody was working on a dlib predictor for that: DLIB : Training Shape_predictor for 194 landmarks (helen dataset)

Related

How Does Adaboost Work with Viola and Jones Algorithm?

I am implementing a working face detection algorithm in C using the Viola-Jones algorithm. I'm having trouble understanding how AdaBoost is used to train a strong classifier.
I can compute all 5 basic Haar features in a single image (162336 in a 24x24 image). I'm fairly sure this part works, and my algorithm outputs an array containing all the features, sorted.
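As a sanity check on the figure of 162336, here is a minimal sketch that counts the upright Haar-like features of the five basic types in a 24x24 window. It assumes each feature is an integer multiple of its base shape and can sit at any position where it fits; the function name is just illustrative.

```python
def count_haar_features(window=24):
    """Count upright Haar-like features of the five basic types in a square window."""
    # Base (width, height) of each type, in cells: two-rectangle horizontal and
    # vertical, three-rectangle horizontal and vertical, and four-rectangle.
    base_shapes = [(2, 1), (1, 2), (3, 1), (1, 3), (2, 2)]
    total = 0
    for base_w, base_h in base_shapes:
        # Every scaled size is an integer multiple of the base shape.
        for w in range(base_w, window + 1, base_w):
            for h in range(base_h, window + 1, base_h):
                # Number of top-left positions where a w x h feature fits.
                total += (window - w + 1) * (window - h + 1)
    return total


print(count_haar_features(24))  # 162336
```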
Then I started working on AdaBoost, and here's what I understand: we create weak classifiers (each slightly better than random) and take a linear combination of many of them (approx. 200) to get a strong classifier.
What I don't understand is how to create a weak classifier. From what I read online:
Normalize the weights of the training examples (in the first round, 1 by default).
Then pick a feature. (This is one of my problems: do I have to process each feature of each training example, i.e. 162336 * number of examples? That would take a lot of computing power, no?)
"Apply" this feature to each image to get an optimal threshold and toggle. (This is my main problem: I don't understand what "apply" means here. Compare it with each feature of the image? I really don't see what I have to do with it. I also don't understand what the threshold and the toggle are, and that's where I'm looking for help.)
Then many other things to do.
I'm really looking forward to your help in understanding this!
I should have answered my own question sooner, but I forgot about it. It was a project for my computer science school, so I can provide answers.
AdaBoost is in fact fairly simple once you understand it.
First you need to compute the features for every image in your training set (we used 4000 images to have a large set). You can store them if you have enough memory, or compute them when you need them in your program. For 4000 images with the 5 Haar feature types we used more than 16 GB of RAM (the code was written in C with no memory leaks; the features were arrays of double).
The training algorithm assigns a weight to each image. That weight represents how difficult it is for the algorithm to make a good prediction (face or no face) for that image.
Training is organized in rounds (200 rounds is fine to reach 90%+ correct predictions).
In the first round every image has the same weight, because the algorithm has not worked on them yet.
Here is how a round goes:
Find the best Haar feature among the X candidates of each type. To do this, evaluate each feature (same type, dimensions and position) on every image and see whether it gives a good or bad prediction. The feature with the best prediction among the X candidates is the best one; keep it stored. You will find 5 best features, because there are 5 types; combine them in a single struct, and that is your weak classifier.
Calculate the weighted error of the classifier. The weighted error is the error of the weak classifier applied to each image, taking into account the weight assigned to each image. In later rounds, the images with a bigger weight (the ones the algorithm made a lot of mistakes on) count for much more.
Add the weak classifier, together with its alpha, to the strong classifier (which is an array of weak classifiers). The alpha measures the performance of the weak classifier and is computed from its weighted error: the lower the weighted error, the larger the alpha, and the more weight the weak classifier has in the final prediction of the strong classifier.
Update the weight of each image according to the prediction of the weak classifier you just created. If the classifier is right the weight goes down; otherwise it goes up.
At the end of the 200 rounds of training, you will have a strong classifier composed of 200 weak classifiers. To make a prediction about a single image, apply each weak classifier to the image; the majority, weighted by the alphas, wins.
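The rounds described above can be sketched roughly as follows. This is the textbook AdaBoost formulation with one-feature decision stumps as weak classifiers, not the exact struct-of-5-features variant from my project; the function names, the brute-force threshold search and the toy data are assumptions for illustration only.

```python
import numpy as np


def stump_predict(features, j, threshold, polarity):
    """Predict +1 where polarity * value < polarity * threshold, else -1."""
    return np.where(polarity * features[:, j] < polarity * threshold, 1, -1)


def best_stump(features, labels, weights):
    """Brute-force search for the stump with the lowest weighted error."""
    best = (np.inf, 0, 0.0, 1)
    for j in range(features.shape[1]):
        for threshold in np.unique(features[:, j]):
            for polarity in (1, -1):
                pred = stump_predict(features, j, threshold, polarity)
                err = weights[pred != labels].sum()
                if err < best[0]:
                    best = (err, j, threshold, polarity)
    return best


def train_adaboost(features, labels, rounds):
    """labels must be +1 (face) or -1 (no face). Returns a list of weak classifiers."""
    n = len(labels)
    weights = np.full(n, 1.0 / n)          # first round: every image weighs the same
    strong = []
    for _ in range(rounds):
        err, j, threshold, polarity = best_stump(features, labels, weights)
        err = np.clip(err, 1e-10, 1 - 1e-10)   # avoid division by zero / log(0)
        alpha = 0.5 * np.log((1 - err) / err)  # low weighted error -> large alpha
        pred = stump_predict(features, j, threshold, polarity)
        # Correctly classified images lose weight, misclassified ones gain weight.
        weights *= np.exp(-alpha * labels * pred)
        weights /= weights.sum()
        strong.append((alpha, j, threshold, polarity))
    return strong


def predict(strong, features):
    """Alpha-weighted vote of all weak classifiers."""
    score = sum(a * stump_predict(features, j, t, p) for a, j, t, p in strong)
    return np.where(score >= 0, 1, -1)


# Toy data: column 0 separates the classes, column 1 is noise.
X = np.array([[1, 5], [2, 1], [3, 4], [4, 2], [5, 6], [6, 3]], dtype=float)
y = np.array([1, 1, 1, -1, -1, -1])
strong = train_adaboost(X, y, rounds=3)
print(predict(strong, X))
```

On real Haar features the brute-force threshold search would be far too slow; sorting each feature's values once and scanning them is the usual trick.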
I deliberately simplified the explanation, but the essentials are here. For more information look here; it really helped me during my project: http://www.ipol.im/pub/art/2014/104/article.pdf
I suggest that everyone interested in AI and optimisation work on a project like this. As a student, it got me really interested in AI and made me think a lot about optimisation, which I had never done before.

Threshold values for viola jones object detection

I am trying to perform the AdaBoost training described by Viola and Jones in their paper on rapid object detection. However, I do not understand how to get the threshold values that separate faces from non-faces for each of the 160k features. Is this a threshold you set manually, or is it based on some kind of maths?
Can someone please explain the maths to me? Thanks a lot.
IMO, the best way to describe what happens during threshold assignment of the weak classifiers in every boosting round is as a ROC analysis of the weak classifier's performance. A great introduction to ROC analysis was written by Tom Fawcett. The full algorithm that does what you want is described in Schapire and Freund's book, section 3.4.2.
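To make that concrete, here is a minimal sketch of how the threshold for a single feature can be chosen: sort the training examples by feature value and, for every candidate cut, compute the weighted error of both possible polarities from running sums. The function name and the 0/1 label convention are assumptions, and tied feature values are ignored for simplicity.

```python
import numpy as np


def best_threshold(values, labels, weights):
    """Pick the threshold and polarity with the lowest weighted error for one feature.

    values:  feature response for each training image
    labels:  1 for face, 0 for non-face
    weights: current boosting weights (non-negative, summing to 1)
    Returns (threshold, polarity, error). Polarity +1 means "face if value >=
    threshold", polarity -1 means "face if value < threshold".
    """
    order = np.argsort(values)
    v, y, w = values[order], labels[order], weights[order]

    total_pos = w[y == 1].sum()
    total_neg = w[y == 0].sum()
    # Weight of faces / non-faces strictly below each sorted example.
    pos_below = np.cumsum(w * (y == 1)) - w * (y == 1)
    neg_below = np.cumsum(w * (y == 0)) - w * (y == 0)

    # Cut at example i, calling everything below it non-face and the rest face:
    # the mistakes are the faces below plus the non-faces at or above.
    err_face_above = pos_below + (total_neg - neg_below)
    # Opposite polarity: everything below is called face.
    err_face_below = neg_below + (total_pos - pos_below)

    i_above = np.argmin(err_face_above)
    i_below = np.argmin(err_face_below)
    if err_face_above[i_above] <= err_face_below[i_below]:
        return v[i_above], 1, err_face_above[i_above]
    return v[i_below], -1, err_face_below[i_below]


values = np.array([1.0, 2.0, 3.0, 4.0])
labels = np.array([0, 0, 1, 1])
weights = np.full(4, 0.25)
threshold, polarity, error = best_threshold(values, labels, weights)
print(threshold, polarity, error)
```

So the threshold is not set by hand: in each boosting round it is recomputed per feature from the current weights, and the feature whose best threshold gives the lowest weighted error becomes the weak classifier for that round.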

Use case for incremental supervised learning using apache mahout

Business case:
Forecasting fuel consumption at site.
Say fuel consumption C depends on various factors x1, x2, ..., xn. So, mathematically speaking, C = F(x1, x2, ..., xn). I do not have an explicit equation for F.
I do have a historical dataset from which I can get the correlation of C with x1, x2, etc. C, x1, x2, ... are all quantitative. Working out the correlation for an n-variable equation seems tough for someone like me with limited statistical knowledge.
So I was thinking of employing a supervised machine learning technique instead. I would train a classifier on the historical data to get a prediction for the next consumption value.
Question: Am I thinking about this the right way?
Question: If this is correct, my system should be an evolving one: the more real data I feed into it, the more the model evolves to make a better prediction the next time. Is this a correct understanding?
If the above statements are true, will the AdaptiveLogisticRegression algorithm in Mahout be of help to me?
Requesting advice from the experts here!
Thanks in advance.
Ok, correlation is not a forecasting model. Correlation simply ascribes some relationship between the datasets based on covariance.
In order to develop a forecasting model, what you need to perform is regression.
The simplest form of regression is linear univariate, where C = F (x1). This can easily be done in Excel. However, you state that C is a function of several variables. For this, you can employ linear multivariate regression. There are standard packages that can perform this (within Excel for example), or you can use Matlab, etc.
Now, we are assuming that there is a "linear" relationship between C and the components of X (the input vector). If the relationship were not linear, then you would need more sophisticated methods (nonlinear regression), which may very well employ machine learning methods.
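As an illustration of the linear multivariate case, here is a minimal sketch using NumPy's least-squares solver instead of Excel or Matlab. The data is synthetic and noise-free, and the variable names (X, c, coef) are made up for the example.

```python
import numpy as np

# Hypothetical history: each row is one observation of (x1, x2).
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])
# Synthetic consumption generated from a known linear rule, for illustration only.
c = 3.0 + 2.0 * X[:, 0] + 0.5 * X[:, 1]

# Prepend a column of ones so the model can learn an intercept.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, c, rcond=None)


def predict(x):
    """Forecast C for a new input vector x = (x1, x2)."""
    return coef[0] + np.dot(coef[1:], x)


print(coef)                  # intercept followed by one coefficient per input
print(predict([6.0, 4.0]))
```

With real, noisy data the fitted coefficients will only approximate the underlying relationship, and it is worth checking the residuals before trusting the forecasts.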
Finally, some series exhibit auto-correlation. If this is the case, then it may be possible for you to ignore the C = F(x1, x2, x3...xn) relationships, and instead directly model the C function itself using time-series techniques such as ARMA and more complex variants.
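To check whether the series is auto-correlated, and to try the simplest member of the ARMA family, something like the sketch below may be enough to get started. It fits a first-order autoregressive model, C[t] = a + b * C[t-1], by least squares; the function names and the toy series are assumptions, and a dedicated time-series package would be the better tool for full ARMA models.

```python
import numpy as np


def autocorrelation(series, lag):
    """Sample autocorrelation of a 1-D series at a positive lag."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)


def fit_ar1(series):
    """Least-squares fit of C[t] = a + b * C[t-1]; returns (a, b)."""
    x = np.asarray(series, dtype=float)
    A = np.column_stack([np.ones(len(x) - 1), x[:-1]])
    (a, b), *_ = np.linalg.lstsq(A, x[1:], rcond=None)
    return a, b


# Toy series generated by C[t] = 2 + 0.5 * C[t-1], starting at 10.
series = [10.0]
for _ in range(5):
    series.append(2.0 + 0.5 * series[-1])

a, b = fit_ar1(series)
print(autocorrelation(series, lag=1))
print(a, b)
```

A lag-1 autocorrelation close to zero would suggest that the past values of C alone carry little information, in which case the regression on x1, ..., xn is the more promising route.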
I hope this helps,
Srikant Krishna

Machine learning, best technique

I am new to machine learning. I am familiar with SVM, neural networks and GA. I'd like to know the best technique to learn for classifying pictures and audio. SVM does a decent job but takes a lot of time. Does anyone know a faster and better one? Also, I'd like to know the fastest library for SVM.
Your question is a good one, and it has to do with the state of the art of classification algorithms. As you say, the choice of classifier depends on your data. In the case of images, I can tell you that there is a method called AdaBoost; read this and this to learn more about it. On the other hand, lots of people are doing research on this. For example, in Gender Classification of Faces Using Adaboost [Rodrigo Verschae, Javier Ruiz-del-Solar and Mauricio Correa] they say:
"Adaboost-mLBP outperforms all other Adaboost-based methods, as well as baseline methods (SVM, PCA and PCA+SVM)"
Take a look at it.
If your main concern is speed, you should probably take a look at VW (Vowpal Wabbit) and, more generally, at stochastic gradient descent based algorithms for training SVMs.
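To give an idea of why SGD training is fast, here is a minimal Pegasos-style sketch of a linear SVM: each step looks at a single example and costs only one dot product. This is not VW's implementation; the function name, the hyperparameters and the toy data are assumptions, and the bias term is left out for brevity (append a constant feature if you need one).

```python
import numpy as np


def train_linear_svm_sgd(X, y, lam=0.01, epochs=20, seed=0):
    """Pegasos-style SGD on the L2-regularized hinge loss. y must be +1 or -1."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            lr = 1.0 / (lam * t)              # decaying step size
            margin = y[i] * (X[i] @ w)        # margin under the current weights
            w *= 1.0 - lr * lam               # shrink (regularization step)
            if margin < 1:                    # hinge loss is active for this example
                w += lr * y[i] * X[i]
    return w


# Toy, linearly separable data.
X = np.array([[2, 2], [3, 3], [2, 3], [3, 2],
              [-2, -2], [-3, -3], [-2, -3], [-3, -2]], dtype=float)
y = np.array([1, 1, 1, 1, -1, -1, -1, -1])
w = train_linear_svm_sgd(X, y)
print(np.sign(X @ w))
```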
If the number of features is large in comparison to the number of training examples,
then you should go for logistic regression or SVM without a kernel.
If the number of features is small and the number of training examples is intermediate,
then you should use SVM with a Gaussian kernel.
If the number of features is small and the number of training examples is large,
use logistic regression or SVM without a kernel.
That's according to the Stanford ML class.
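The rule of thumb above could be written down as a small helper like the one below. The cut-off values for "small" and "large" are my own assumptions for the sake of a runnable example, so adjust them to your data; the function name is hypothetical too.

```python
def suggest_classifier(n_features, n_examples,
                       small_features=1_000, large_examples=50_000):
    """Encode the rule of thumb; the two cut-offs are assumed, not canonical."""
    if n_features >= n_examples:
        # Many features relative to the number of examples.
        return "logistic regression or SVM without a kernel"
    if n_features <= small_features and n_examples < large_examples:
        # Few features, intermediate number of examples.
        return "SVM with a Gaussian kernel"
    if n_features <= small_features:
        # Few features, very many examples.
        return "logistic regression or SVM without a kernel"
    return "not covered by the rule of thumb"


print(suggest_classifier(10_000, 1_000))
print(suggest_classifier(100, 5_000))
print(suggest_classifier(100, 100_000))
```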
For such a task you may need to extract features first. Only after that is classification feasible.
I think feature extraction and selection are important.
For image classification, there are a lot of features such as raw pixels, SIFT features, color, texture, etc. It is better to choose the ones suitable for your task.
I'm not familiar with audio classification, but there are spectrum features, such as the Fourier transform of the signal and MFCC.
The method used to classify is also important. Besides the methods in the question, KNN is a reasonable choice, too.
In practice, which features and method to use is closely tied to the task.
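For the spectrum features mentioned above, a minimal sketch of extracting a magnitude spectrum with NumPy's FFT could look like this. The 440 Hz test tone and the sample rate are made-up values, and MFCCs would need further steps (mel filter bank, log, DCT) that are not shown.

```python
import numpy as np


def magnitude_spectrum(signal, sample_rate):
    """Return (frequencies in Hz, magnitudes) for a real-valued 1-D signal."""
    magnitudes = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return freqs, magnitudes


sample_rate = 8000
t = np.arange(sample_rate) / sample_rate      # one second of samples
tone = np.sin(2 * np.pi * 440.0 * t)          # hypothetical 440 Hz test tone
freqs, mags = magnitude_spectrum(tone, sample_rate)
peak_hz = freqs[np.argmax(mags)]
print(peak_hz)  # 440.0
```

The magnitudes (or band energies computed from them) can then be fed to the classifier as a feature vector.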
The method mostly depends on the problem at hand. No method is always the fastest for every problem. Having said that, you should also keep in mind that once you choose an algorithm for speed, you start compromising on accuracy.
For example, since you're trying to classify images, there might be a lot of features compared to the number of training samples at hand. In such cases, if you go for an SVM with kernels, you could end up overfitting, with the variance being too high.
So you would want to choose a method that has high bias and low variance. Logistic regression or a linear SVM are some ways to do that.
You could also use different types of regularization, or techniques such as SVD, to remove the features that do not contribute much to your output prediction and keep only the most important ones. In other words, choose features that have little or no correlation between them. Once you do this, you will be able to speed up your SVM algorithms without sacrificing accuracy.
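A minimal sketch of the SVD idea, assuming the samples are rows of a NumPy array: center the data, take the top-k right singular vectors and project onto them. Strictly speaking this builds k new uncorrelated features as combinations of the old ones rather than dropping original columns; the function name and the random data are just for illustration.

```python
import numpy as np


def reduce_with_svd(X, k):
    """Project the rows of X onto the top-k right singular vectors of the centered data."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    components = vt[:k]                       # k orthonormal directions
    return (X - mean) @ components.T, components, mean


rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))                   # 6 samples, 4 features
reduced, components, mean = reduce_with_svd(X, k=2)
print(reduced.shape)  # (6, 2)
```

New samples have to be transformed with the same mean and components before they are passed to the classifier.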
Hope it helps.
There are some good techniques in machine learning, such as boosting and AdaBoost.
Boosting is one classification method. It iteratively reweights the data, which is then classified by a particular base classifier on each iteration, and the results are combined to build a classification model. Boosting assigns a weight to each data point in each iteration, and the weight changes according to how difficult that data point is to classify.
AdaBoost is an ensemble technique that uses an exponential loss function to improve the accuracy of the predictions made.
I think your question is very open ended, and the "best classifier for images" will largely depend on the type of image you want to classify. But in general, I suggest you study convolutional neural networks (CNNs) and transfer learning; these are currently the state-of-the-art techniques for the problem.
Check out the pre-trained CNN models from PyTorch or TensorFlow.
For images, I suggest you also study image pre-processing; pre-processing techniques are very important for highlighting features of the image and improving the generalization of the classifier.

How does the Blue Brain Project (and NEURON software) work?

This question is related to 873448.
From Wikipedia:
The Blue Brain Project is an attempt to create a synthetic brain by reverse-engineering the mammalian brain down to the molecular level. [...] Using a Blue Gene supercomputer running Michael Hines's NEURON software, the simulation does not consist simply of an artificial neural network, but involves a biologically realistic model of neurons.
"If we build it correctly it should speak and have an intelligence and behave very much as a human does."
My question is how the software works internally. If it "involves a biologically realistic model of neurons", how is that different from a neural network, and why can't neural networks simulate a biological brain well while this project would be able to? And, how is NEURON software used in the simulation?
Lastly, I apologize if this question doesn't belong here (maybe the BioStar StackExchange would be a better place to ask).
The NEURON software models neuronal cells by modeling fluxes of ions into and out of the cell through different ion channels. These movements generate a difference in electrical potential between the interior and the exterior of the neuronal membrane, and modulations of this potential allow different neurons to communicate with each other. Several biophysical models for neurons exist, such as the integrate-and-fire model or the Hodgkin-Huxley model.
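To give a feel for the simplest of these models, here is a minimal sketch of a leaky integrate-and-fire neuron integrated with forward Euler. This is not NEURON's API, and it is far simpler than the channel-based models NEURON is typically used for; all parameter values and units (ms, mV, MOhm, nA) are illustrative assumptions.

```python
import numpy as np


def simulate_lif(current, dt=0.1, tau=10.0, v_rest=-65.0, v_reset=-70.0,
                 v_threshold=-50.0, resistance=10.0):
    """Leaky integrate-and-fire: tau * dV/dt = -(V - v_rest) + R * I.

    current is the input current at each time step.
    Returns (membrane potential trace, list of step indices where a spike occurred).
    """
    v = v_rest
    trace = np.empty(len(current))
    spikes = []
    for step, i_in in enumerate(current):
        # Forward Euler update of the membrane potential.
        v += dt * (-(v - v_rest) + resistance * i_in) / tau
        if v >= v_threshold:
            spikes.append(step)   # the neuron fires ...
            v = v_reset           # ... and its potential is reset
        trace[step] = v
    return trace, spikes


trace, spikes = simulate_lif(np.full(1000, 2.0))   # 100 ms of constant input
print(len(spikes))
```

The state of such a model is a membrane potential evolving in time, and the output is a train of spikes, which is quite different from the static weighted sum used in artificial neural networks.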
Artificial neural networks have pretty much nothing to do with biological neural networks, apart from sharing the same name. They're mathematical constructs whose units are connected with each other in a weighted manner, allowing them to take one or more inputs and produce one or more outputs.
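For comparison, a single unit of an artificial neural network is usually nothing more than the sketch below: a weighted sum of the inputs passed through an activation function (a sigmoid here, chosen arbitrarily). There are no ions, no membrane and no notion of time.

```python
import numpy as np


def artificial_neuron(inputs, weights, bias):
    """Weighted sum of the inputs passed through a sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-(np.dot(weights, inputs) + bias)))


print(artificial_neuron([0.0, 0.0], [1.0, 1.0], 0.0))  # 0.5
```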
EDIT: I have to add, as much as the Blue Brain Project is an incredible and very admirable step towards modeling an entire brain, we are far, far away from that goal. All these are models, so they approximate the behaviour of biological cells, but they are in no way complete. Furthermore, there is a high bias in the "choice" of which neurons these models analyze. Most of the models represent certain areas of the brain (such as the cortex or the hippocampus) which 1) we have quite a bit of knowledge about and 2) are made up of very organized structures of neuronal cells working together. Other parts of the brain may not be as trivial to model (note that I use "trivial" jokingly; I'm not in any way saying that modeling the cortex is easy!), but I guess the details of this would be a bit outside the scope of SO. Maybe when the cognitive science proposal is operative you could pose the question there!
Finally, to correct the quoted statement, the project did model a column of the somatosensory cortex of the rat, which is only a very tiny part of an entire rat brain.

Resources