Max classes in a Visual Recognition custom classifier?

The demo shows a custom classifier for dog breeds using 4 or 5 breeds. There are 340 or so dog breeds; can a classifier be trained with all 340 breeds?
There are 28,000 kinds of fish...
Is there a limit? Can I train a million classes in a classifier?

There is a practical limit of about 5,000 custom classes in the current implementation. This is approximate because it depends partly on how complex each custom class is, how many classes you have, and how many classifiers (which contain those classes) you have. We have verified that the system functions correctly with more than twice this number, but around this level users will probably start seeing timeout error codes when certain caches are not yet populated at request time. Retrying the request can help populate the caches.
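For example, a minimal retry wrapper might look like the sketch below, assuming the v3 /classify REST endpoint and legacy api_key query-parameter auth; the URL, version date, and parameter names should be checked against your own service instance.

```python
# A minimal retry sketch, assuming the v3 /classify REST endpoint and
# legacy api_key query-parameter auth; adjust URL/auth for your instance.
import time
import requests

CLASSIFY_URL = "https://gateway.watsonplatform.net/visual-recognition/api/v3/classify"

def classify_with_retry(image_url, classifier_ids, api_key,
                        retries=3, backoff=2.0):
    params = {
        "api_key": api_key,                       # auth scheme is an assumption
        "version": "2016-05-20",
        "url": image_url,
        "classifier_ids": ",".join(classifier_ids),
    }
    for attempt in range(retries):
        try:
            resp = requests.get(CLASSIFY_URL, params=params, timeout=30)
            if resp.status_code == 200:
                return resp.json()
        except requests.exceptions.Timeout:
            pass
        # Retrying gives the service a chance to populate its caches.
        time.sleep(backoff * (attempt + 1))
    raise RuntimeError("classification still failing after retries")
```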
This is a guideline; we are continuously working on improving Watson Visual Recognition to make it faster and solve bigger problems for our users.
Thanks for your question!

Related

Gatling: for the long term, should we put one REST API per simulation?

We have decided to use Gatling as our performance-testing tool, but we are not able to figure out how to organize our simulations. We have around 25 APIs at the moment, so do I need to create a separate simulation for each API? This question is more about code maintenance.
It depends on what you want to achieve. Do you want to test each API only in isolation? Or do you want to simulate real traffic that would be distributed over the different API calls with different weights?
I recommend extracting the calls into dedicated classes, as is done in the tutorials, and having multiple Simulations depending on your needs.

How to use Cloud ML Engine for a Context-Aware Recommender System

I am trying to build a context-aware recommender system with Cloud ML Engine that uses the context prefiltering method (as described in slide 55, solution a), and I am using this Google Cloud tutorial (part 2) to build a demo. For the purposes of this demo, I have split the dataset into Weekday/Weekend and Noon/Afternoon contexts by timestamp.
In practice I will learn four models, so that I can context-filter by Weekday-unknown, Weekend-unknown, unknown-Noon, unknown-Afternoon, Weekday-Afternoon, Weekday-Noon, and so on. The idea is to get predictions from all the relevant models for a user and then weight the resulting recommendations based on what is known about the context (unknown meaning that all context models are used and a weighted result is returned).
I need something that responds fast, and it seems I will unfortunately need some kind of middleware if I don't want to do the weighting in the front end.
I know that AppEngine has a prediction mode where it keeps the models in RAM, which guarantees fast responses because you don't have to bootstrap the prediction models; resolving the context would then be fast.
However, is there a simpler solution that would also guarantee similar performance on Google Cloud?
The reason I am using Cloud ML Engine is that building a context-aware recommender system this way greatly increases the amount of hyperparameter tuning. I don't want to do that manually; instead I use the Cloud ML Engine Bayesian hypertuner, so that I only need to tune the parameter ranges one to three times per context model (with an automated script). This saves a lot of data-scientist development time whenever the dataset is reworked.
There are four possible solutions:
Learn 4 models and use SavedModel to save them. Then, create a 5th model that restores the 4 saved models. This model has no trainable weights. Instead, it simply computes the context and applies the appropriate weight to each of the 4 saved models and returns the value. It is this 5th model that you will deploy.
Learn a single model. Make the context a categorical input to your model, i.e. follow the approach in https://arxiv.org/abs/1606.07792
Use a separate AppEngine service that computes the context, invokes the underlying 4 services, weights the results, and returns the combination.
Use an AppEngine service written in Python that loads all four saved models, invokes them, weights the results, and returns the combination.
Option 1 involves more coding and is quite tricky to get right.
Option 2 would be my choice, although it changes the model formulation from what you desire. If you go this route, here's sample code on MovieLens that you can adapt: https://github.com/GoogleCloudPlatform/cloudml-samples/tree/master/movielens
Option 3 introduces more latency because of the additional network overhead.
Option 4 reduces the network latency of option 3, but you lose the parallelism. You will have to experiment between options 3 and 4 to see which provides better performance overall (see the sketch below for option 4).
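To make option 4 concrete, here is a minimal sketch, assuming TensorFlow 1.x SavedModel exports; the model directories, input-feature dict, and the "scores" output key are all hypothetical:

```python
# A minimal sketch of option 4, assuming TF 1.x SavedModel exports; the
# directories, input features, and "scores" output key are hypothetical.
from tensorflow.contrib import predictor

MODEL_DIRS = {
    "weekday_noon":      "models/weekday_noon/export",
    "weekday_afternoon": "models/weekday_afternoon/export",
    "weekend_noon":      "models/weekend_noon/export",
    "weekend_afternoon": "models/weekend_afternoon/export",
}

# Load all four SavedModels once at service start-up, so each request
# only pays for inference (this is what keeps option 4's latency low).
predictors = {ctx: predictor.from_saved_model(path)
              for ctx, path in MODEL_DIRS.items()}

def recommend(features, context_weights):
    """Blend predictions from the relevant context models.

    features: dict mapping the serving signature's input names to
        batched values (depends on how the models were exported).
    context_weights: e.g. {"weekday_noon": 0.5, "weekday_afternoon": 0.5}
        for a request where only Weekday is known.
    """
    blended = None
    for ctx, w in context_weights.items():
        scores = predictors[ctx](features)["scores"]
        blended = w * scores if blended is None else blended + w * scores
    return blended
```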

Logic determining dialog in Watson Assistant

I want to improve IBM Watson Assistant's results, so I want to know the algorithm that determines a dialog in Watson Assistant's conversations.
Is it an SVM algorithm?
A paper reference is welcome.
There are a number of ML/NLP technologies under the covers of Watson Assistant, so it's not just a single algorithm. Knowing them is not going to help you improve your results.
"I want to improve IBM Watson Assistant's results."
There are a number of ways.
Representative questions
Focus on getting truly representative questions from end users, not only in the language that they use but, if possible, from the same medium on which you plan to use WA (e.g. mobile device, web, audio).
Not doing this is the first factor that reduces accuracy. Manufacturing an intent can mean you build an intent that a customer may never ask about (even if you think they will). Second, you will use language/terms with similar patterns, which makes it harder for WA to train.
Total training questions
It's possible to train an intent with one question, but for best results use 10-20 example questions. Where intents are close together, more examples are needed.
Testing
The current process is to run what is called k-fold cross-validation (sample script). If your questions are representative, the results should give you an accurate indicator of how well the system is performing.
However, it is possible to overfit the training, so you should also use a blind set: a random sample of 10-20% of all questions that is never used to train WA. Run the blind set against the system; your blind and k-fold results should fall within 5% of each other.
You can examine the k-fold results to fix issues, but you should not do this with the blind set. Blind sets also go stale, so try to create a new one after 2-3 training cycles.
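Here is a minimal sketch of that k-fold loop, assuming hypothetical helpers load_examples(), train_workspace(), and classify() that wrap your data export and the Watson Assistant training and message APIs:

```python
# A minimal k-fold sketch; load_examples(), train_workspace() and classify()
# are hypothetical helpers around your data and the Watson Assistant APIs.
from sklearn.model_selection import KFold

examples = load_examples()  # list of (text, intent) pairs from your workspace

kf = KFold(n_splits=5, shuffle=True, random_state=42)
fold_accuracies = []
for train_idx, test_idx in kf.split(examples):
    train = [examples[i] for i in train_idx]
    test = [examples[i] for i in test_idx]
    workspace_id = train_workspace(train)              # train on k-1 folds
    correct = sum(classify(workspace_id, text) == intent
                  for text, intent in test)            # test on held-out fold
    fold_accuracies.append(correct / len(test))

kfold_accuracy = sum(fold_accuracies) / len(fold_accuracies)
# Compare kfold_accuracy against the accuracy on your untouched blind set;
# the two should fall within about 5% of each other.
```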
End user testing
No matter how well your system is trained, I can guarantee that new things will pop up when it is put in front of end users. So you should plan to have users test it before you put it into production.
When getting users to test, ensure they understand the general areas the system has been trained on. You can do this with user stories, but try not to prime the user into asking a narrowly scoped question.
Example:
"Your phone is not working and you need to get it fixed" - Good. They will ask questions you will never have seen before.
"The wifi on your phone is not working. Ask how you would fix it". - Bad. Very narrow scope and people will mention "wifi" even if they don't know what it means.

AWS SageMaker custom algorithms: how to take advantage of extra instances

This is a fundamental AWS SageMaker question. When I run training with one of SageMaker's built-in algorithms, I am able to take advantage of the massive speedup from distributing the job over many instances by increasing the instance_count argument of the training algorithm. However, when I package my own custom algorithm, increasing the instance count seems to just duplicate the training on every instance, leading to no speedup.
I suspect that when I package my own algorithm there is something special I need to do to control how the training is handled differently on each particular instance inside my custom train() function (otherwise, how would it know how the job should be distributed?), but I have not been able to find any discussion of how to do this online.
Does anyone know how to handle this? Thank you very much in advance.
Specific examples:
=> It works well with a standard algorithm: I verified that increasing train_instance_count in the first documented SageMaker example speeds things up: https://docs.aws.amazon.com/sagemaker/latest/dg/ex1-train-model-create-training-job.html
=> It does not work with my custom algorithm. I took the standard sklearn build-your-own-model example, added a few extra sklearn variants inside the training, and printed out the results to compare. When I increase the train_instance_count passed to the Estimator object, it runs the same training on every instance, so the output is duplicated across each instance (the printouts of the results are duplicated) and there is no speedup.
This is the sklearn example base: https://github.com/awslabs/amazon-sagemaker-examples/blob/master/advanced_functionality/scikit_bring_your_own/scikit_bring_your_own.ipynb. The third argument of the Estimator object partway down in this notebook is what lets you control the number of training instances.
Distributed training requires a way to sync the results of the training between the training workers. Most traditional libraries, such as scikit-learn, are designed to work with a single worker and can't simply be used in a distributed environment. Amazon SageMaker distributes the data across the workers, but it is up to you to make sure that the algorithm can benefit from the multiple workers. Some algorithms, such as Random Forest, make it easier to take advantage of the distribution, since each worker can build a different part of the forest, but other algorithms need more help; the sketch below illustrates the idea.
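As an illustration, here is a minimal sketch of distribution-aware train() logic in a custom container. The resourceconfig.json path is documented by SageMaker; the shard-training, save, and merge helpers are hypothetical:

```python
# A minimal sketch of distribution-aware training in a custom container.
# The resourceconfig.json path is documented by SageMaker; the helpers
# train_trees_on_local_shard(), save_partial_model() and
# merge_partial_models() are hypothetical.
import json

def train():
    with open("/opt/ml/input/config/resourceconfig.json") as f:
        cfg = json.load(f)
    current_host = cfg["current_host"]   # e.g. "algo-2"
    hosts = sorted(cfg["hosts"])         # all instances in this training job
    worker_index = hosts.index(current_host)

    # With the "ShardedByS3Key" input distribution, each instance already
    # sees a different slice of the data under /opt/ml/input/data/training,
    # so each worker can grow its own part of the forest.
    trees = train_trees_on_local_shard(
        data_dir="/opt/ml/input/data/training",
        n_trees_per_worker=100,
        seed=worker_index,               # decorrelate the workers
    )

    # Syncing is up to you: for example, each worker uploads its partial
    # forest, and one designated worker merges them into the final model.
    save_partial_model(trees, worker_index)
    if worker_index == 0:
        merge_partial_models(num_workers=len(hosts))
```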
Spark MLlib has distributed implementations of popular algorithms such as k-means, logistic regression, and PCA, but these implementations are not good enough for some cases. Most of them were too slow, and some even crashed when a lot of data was used for training. The Amazon SageMaker team reimplemented many of these algorithms from scratch to benefit from the scale and economics of the cloud (20 hours of one instance costs the same as 1 hour of 20 instances, which is just 20 times faster). Many of these algorithms are now more stable and much faster, with better-than-linear scalability. See more details here: https://docs.aws.amazon.com/sagemaker/latest/dg/algos.html
For the deep learning frameworks (TensorFlow and MXNet), SageMaker uses each framework's built-in parameter server, but it takes on the heavy lifting of building the cluster and configuring the instances to communicate with it.

Silverlight binary / faster serialization

We currently use Silverlight 4 with WCF services and are trying to read large arrays of user objects from a service. In our code it takes about 0.5 seconds (or less) to generate 700 objects arranged in a hierarchy (a lot of loops).
And it takes Silverlight/WCF about 4-5 seconds to transfer that data, even on localhost.
I've measured timings in my code and the service call, used Fiddler to inspect the data (5MB!), and when I tried passing a simplified object with plain attributes (instead of nested lists, etc.), it produced much less data and was very quick, about a second.
I've read many articles on the subject; there's no simple way. The best options I could find are to return byte[] from the WCF method (and keep the types in a separate assembly), or to use highly manual serializers (like protobuf) that require writing custom attributes, etc.
OK, I tried those. protobuf-net is extremely hard (adding numbers to 200 existing classes isn't fun) and v2 is not here yet, and binaryMessageEncoding only reduced the payload from 5.5MB to 4.5MB, not much.
But I can't believe there is no out-of-the-box WCF/Silverlight solution for streaming large amounts of data. Isn't it supposed to be a nice, modern technology for enterprise solutions?
How do I tell Silverlight/WCF to move my data faster and smaller, rather than 5MB in 5 seconds? Can I just say in the config, "use a small and fast serializer"?
I've found the SharpSerializer package very easy to use for fast binary serialization in Silverlight: http://www.sharpserializer.com/en/index.html. The resulting serialized data is much smaller than with the DataContract serializer or other text-based serializers.
Does IIS have compression enabled? This will impact CPU, however, and you might need to double-check whether Silverlight honors the deflate HTTP header.
