A/B Test feature in SageMaker: variant assignment is random? - amazon-sagemaker

The A/B test feature in SageMaker sounds intriguing, but the more I looked into it, the more confused I became about whether it is actually useful. For it to be useful, you need to get the variant assignment data back and join it with some internal data to figure out the best-performing variant.
How is this assignment done? Is it purely random? Or am I supposed to pass some kind of ID (or hashed ID) identifying a person or a browser, so that the same model is picked for the same person?

For it to be useful, you need to get the variant assignment data back and join it with some internal data to figure out the best-performing variant.
The InvokeEndpoint response includes the "InvokedProductionVariant" field precisely to support the kind of analysis you describe. Details can be found in the API documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/API_runtime_InvokeEndpoint.html#API_runtime_InvokeEndpoint_ResponseSyntax
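Once each response's InvokedProductionVariant is logged next to your own request id, the join with your internal outcome data is straightforward. A minimal sketch in Python (the record layouts are hypothetical):

```python
def variant_conversion_rates(invocations, outcomes):
    """Join per-request variant assignments with internal outcome data.

    invocations: {request_id: variant_name}, where variant_name was taken
        from the InvokedProductionVariant field of each InvokeEndpoint
        response.
    outcomes: {request_id: bool}, e.g. whether the user later converted.
    Returns the conversion rate per variant.
    """
    stats = {}
    for rid, variant in invocations.items():
        served, converted = stats.get(variant, (0, 0))
        stats[variant] = (served + 1, converted + int(outcomes.get(rid, False)))
    return {v: converted / served for v, (served, converted) in stats.items()}
```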
How is this assignment done? Is it purely random?
Traffic is distributed randomly while remaining proportional to the weight of the production variant.
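Conceptually, the split behaves like weighted random sampling: with weights 9 and 1, roughly 90% of requests go to the first variant. An illustrative sketch (not SageMaker's actual implementation):

```python
import random

def pick_variant(variants, rng=random):
    """Pick a variant at random, proportional to its weight.

    variants: list of (name, weight) pairs, mirroring how traffic is
    split across an endpoint's production variants.
    """
    names, weights = zip(*variants)
    return rng.choices(names, weights=weights, k=1)[0]
```

Over many requests the observed share of each variant converges to weight / sum(weights).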

so that the same model is picked up for the same person
Amazon SageMaker does not currently support this type of functionality, which is a major blocker for using it on some A/B tests.
I created a thread in the AWS SageMaker forum asking for this functionality to be added: https://forums.aws.amazon.com/thread.jspa?threadID=290644&tstart=0
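Until such support exists, a common client-side workaround is to make the assignment deterministic yourself by hashing a stable user or browser id (and, if your SDK version lets you target a specific variant on InvokeEndpoint, passing the chosen variant name along). A sketch, with hypothetical variant names:

```python
import hashlib

def variant_for_user(user_id, variants):
    """Deterministically map a user id to a variant so the same person
    always hits the same model.

    variants: list of variant names in a fixed, agreed order.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return variants[int.from_bytes(digest[:8], "big") % len(variants)]
```

Because the mapping is a pure function of the id, it is sticky across sessions and across clients without any shared state.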

Related

What CMS is as close to "No-Code" as possible, allows custom fields and allows math type formulas to be applied?

In the past I used Zoho Creator, which worked well, but now I need something with a far better front end, and self-hosted, so I have been trying to find a CMS that can do what Creator does. I am currently using WP Toolset, but it is a nightmare to do the calculations I need. I have tried ProcessWire, but it has no front end. Does anyone know of a CMS that makes it easy to "fetch" data from other tables and fields and then return an answer? Or another idea altogether?
I’m aware of a company doing just that. The app, delivery, and all of it is in the cloud, with management locked away behind multiple U2F security keys, far from mere mortals. But the point is, this idea of making it easier and hacker-resistant (never hacker-proof) is on the drawing board at several start-ups, not just one. Check into it. I wish I could give you more, but I’m part of a team doing just that, and it’s outside development that can cause unseen bugs, too many classes or the wrong type of classes, or otherwise interfere with our once-perfect baby. So we are in essence sandboxing all developers and forking their repo, even going a step further and giving them a dev repo that is derived from our actual repo.

how to prepare data for domain specific chat-bot

I am trying to make a chatbot. All the chatbots I have seen are built from structured data. I looked at Rasa, IBM Watson, and other well-known bots. Is there any way to convert unstructured data into some sort of structure that can be used for bot training? Consider the paragraph below:
Packaging unit
A packaging unit is used to combine a certain quantity of identical items to form a group. The quantity specified here is then used when printing the item labels so that you do not have to label items individually when the items are not managed by serial number or by batch. You can also specify the dimensions of the packaging unit here and enable and disable them separately for each item.
It is possible to store several EAN numbers per packaging unit since these numbers may differ for each packaging unit even when the packaging units are identical. These settings can be found on the Miscellaneous tab:
There are also two more settings in the system settings that are relevant to mobile data entry:
When creating a new item, the item label should be printed automatically. For this reason, we have added the option ‘Print item label when creating new storage locations’ to the settings. When using mobile data entry devices, every item should be assigned to a storage location, where an item label is subsequently printed that should be applied to the shelf in the warehouse to help identify the item faster.
How can I make a bot from such data? Any lead would be highly appreciated. Thanks!
Will the idea in the picture (just_a_thought) work?
The data you are showing seems to be a good candidate for passage search. Basically, you would like to answer the user's question with the most relevant paragraph found in your training data. This use case is handled by the Watson Discovery service, which can analyze unstructured data like yours; you can then query the service with input text and it answers with the closest passage found in the data.
In my experience you can also get good results by implementing your own custom TF/IDF algorithm tailored to your use case (TF/IDF is a nice similarity measure that handles e.g. the stopwords for you).
If your goal is instead to bootstrap a rule-based chatbot, then this kind of data is not ideal. For a rule-based chatbot, the best data would be actual conversations between users asking questions about the target domain and answers from a subject-matter expert. With such data you might at least be able to do some analysis to pinpoint the relevant topics and domains the chatbot should handle; however, I think you will have a hard time using this data to bootstrap a set of intents (questions the users will ask) for a rule-based chatbot.
TLDR
If I were to use a Watson service, I would start with Watson Discovery. Alternatively, I would implement my own search algorithm starting with TF/IDF (which maps rather nicely to your proposed solution).
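For reference, the custom TF/IDF route can be sketched from scratch in a few lines (tokenization and weighting deliberately simplistic; a real system would add stemming, stopword lists, and proper normalization):

```python
import math
import re
from collections import Counter

def tfidf_best_passage(passages, query):
    """Return the index of the passage that best matches the query,
    scored by TF/IDF-weighted term overlap."""
    tokenize = lambda text: re.findall(r"[a-z0-9]+", text.lower())
    docs = [Counter(tokenize(p)) for p in passages]
    n = len(docs)
    # Inverse document frequency: terms occurring in every passage
    # (stopword-like) get weight log(1) = 0 automatically.
    idf = {t: math.log(n / sum(1 for d in docs if t in d))
           for d in docs for t in d}
    def score(doc):
        return sum(doc[t] * idf[t] ** 2 for t in tokenize(query) if t in doc)
    return max(range(n), key=lambda i: score(docs[i]))
```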

How to connect Dialogflow to a database

I want to store some data in a database and then use that data to answer user queries via Dialogflow.
Any ideas on implementing this?
You will need to use a webhook to do fulfillment. In your webhook, you can make the database queries you want.
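A minimal sketch of the fulfillment logic, framework-agnostic so it can sit behind any HTTP server. The FAQ table, its schema, and the `topic` parameter are all hypothetical; Dialogflow's v2 webhook format carries matched parameters under `queryResult.parameters` and reads the reply from `fulfillmentText`:

```python
import sqlite3

# Hypothetical data; in production this would be your real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE faq (topic TEXT PRIMARY KEY, answer TEXT)")
conn.execute("INSERT INTO faq VALUES ('hours', 'We are open 9am-5pm.')")
conn.commit()

def handle_fulfillment(df_request):
    """Take a parsed Dialogflow webhook request body (a dict) and build
    the webhook response body."""
    topic = df_request["queryResult"]["parameters"].get("topic", "")
    row = conn.execute("SELECT answer FROM faq WHERE topic = ?",
                       (topic,)).fetchone()
    text = row[0] if row else "Sorry, I don't know about that."
    return {"fulfillmentText": text}
```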
You may want to use an NLIDB (natural language interface to database). An NLIDB maps natural language questions over the database schema into SQL, solves such SQL queries and returns answers. Additional misconception and ambiguity resolution steps may be included.
NLIDBs stand in contrast to dialog management systems (such as Dialogflow), which use interactive dialog to fill in slots for specific question types and then execute those questions in specialized code. This specialized code may very well interact with a database, but it is tied to a specific question type, so it is fairly straightforward to implement.
The advantage of NLIDBs, however, is that if the mapping tool is robust, a practically infinite number of questions can be answered over a complex database schema. The disadvantage is that the mapping tools are often less than robust, but this is an area of active R&D.
There are several companies currently offering NLIDB systems.
See for example: https://friendlydata.io/, http://c-phrase.com and http://kueri.me/.
AWS might be of help. I have some answers where I detail how to use API Gateway, for example, as a pseudo back end so you can run all of this from a front-end (or static) page. Doing this, my hack would be to just write a JSON file, or create an imported variable (key/values), that includes your database info. I once created a React page where I used a long list of database data (SQL) that I simply put in a JSON file and imported; it worked great.
Of course, if you have experience building a back end, you can figure all this out. If not, I would recommend looking into Wix. They have a great platform in which you can use JavaScript; it also has a Node back end with access to Node modules, plus fully functional built-in databases. Good luck!

Using the IBM Watson Concept Insights service for Natural Language Search

We are trying to implement a natural language search function using the IBM Watson Concept Insights (CI) service. We want the user to be able to type in a question in natural language and then get back the appropriate document(s) from a CI corpus. We are using CI rather than the Watson QA service to avoid the need for training and to keep Watson infrastructure costs down (i.e. to avoid needing a dedicated Watson instance for each corpus/use case).
We are able to build the necessary corpus through the CI API but we are not sure which APIs to use in what order to accomplish the most precise/accurate query possible.
Our initial thought was to:
Accept the user’s natural language question and POST that text string to the “Identifies concepts in a piece of text” API (listed 6th from the bottom in the CI API Reference document) to get a list of concepts related to the question.
Then do a GET using the “Performs a conceptual search within a corpus” API (listed 3rd from the bottom in the CI API Reference document) to get a list of related documents back from the corpus.
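The two-step flow itself is just glue; here is a sketch with the HTTP calls injected as plain callables, so the hypothetical `annotate`/`search` helpers stand in for the real requests (the curl examples further down show what the annotate step returns):

```python
def question_to_documents(question, annotate, search):
    """Step 1: annotate the question to extract concepts.
    Step 2: run a conceptual search over the corpus with those concepts.

    annotate(text)   -> list of {"concept": ..., "weight": ...} dicts,
                        like the annotateText responses in this question.
    search(concepts) -> ranked list of corpus documents.
    """
    concepts = [a["concept"] for a in annotate(question)]
    if not concepts:
        return []  # nothing extracted; fall back to e.g. keyword search
    return search(concepts)
```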
The first question - is this the right way to go about achieving our objective described in the first paragraph of this post? Should we be combining the CI APIs differently or using multiple Watson services together to achieve the objective?
If our initial approach is the right one, then we are finding that when we submit a simple question (e.g. “How can I repair MySQL database corruption”) to the “Identifies concepts in a piece of text” API we are not getting a comprehensive list of associated concepts back. For example:
curl -u userid:password -k -d "How can I repair MySQL database corruption" https://gateway.watsonplatform.net/concept-insights-beta/api/v1/graph/wikipedia/en-20120601?func=annotateText
returns:
[{"concept":"/graph/wikipedia/en-20120601/MySQL","coords":[[17,22]],"weight":0.85504603}]
Yet clearly there are other concepts associated with the example question (repair, corruption, database, etc.).
In another example we just submitted the text “repair” to the “Identifies concepts in a piece of text” API:
curl -u userid:password -k -d "repair" https://gateway.watsonplatform.net/concept-insights-beta/api/v1/graph/wikipedia/en-20120601?func=annotateText
and it returned:
[{"concept":"/graph/wikipedia/en-20120601/Repair","coords":[[0,6]],"weight":0.65392953}]
It seems that we should have gotten back the “Repair” concept from the first example also. Why would the API return the “repair” concept when we submit "repair" but not when we submit the text “How can I repair MySQL database corruption” which also includes the word “repair.”
Please advise as to the best way to implement a natural language search function based on the Watson Concept Insights service (perhaps in combination with other services if appropriate).
Thank you very much for your question and my apologies for being so late in answering it.
The first question - is this the right way to go about achieving our objective described in the first paragraph of this post? Should we be combining the CI APIs differently or using multiple Watson services together to achieve the objective?
Doing the steps above would be a natural way to accomplish what you want. Please note, however, that the "annotate text" API currently uses exactly the same technology that the system uses for connecting documents in your corpus to concepts in the core knowledge graph, and as such it is more "paragraph" oriented than individual-question oriented. To be more precise, extracting concepts from a smaller piece of text is generally harder than from a larger piece, because the latter has more context that can be used to make the right choices. Given this observation, the annotate text API takes the more conservative route, again owing to its paragraph focus.
Having said that, the /v2 API that we now have does improve the speed and quality of the concept extraction technology, so it is possible that you would be more successful in using it in order to extract topics from natural language questions. Here's what I would do/watch out for:
1) Clearly display to the user what CI extracted from the natural language input. Our APIs give you a way to retrieve a short abstract per concept, which can be used to explain to a user what a concept means - do use that.
2) Give the user the ability to eliminate a concept from the extracted concept list (strike it out).
3) Since the concepts in Concept Insights currently correspond roughly to the notion of "topics", there is no way to deduce more abstract intent (for example, if the key to the meaning of a question is a verb or an adjective rather than a noun, Concept Insights would be a poor way to deduce it). Watson does have technology oriented towards question answering, as you pointed out (the natural language classifier being one component of that), so I would take a look at that.
Yet clearly there are other concepts associated with the example question (repair, corruption, database, etc.).
The answer to this and the rest of the posted question is, in a sense, above: our intention was to provide technology first for "larger text", which as I explained is an easier task. Between when this question was first posted and today, we introduced new annotation technology (/v2), so I would encourage the reader to see whether it performs a little better.
For the longer term, we intend to give the user a formal way to specify context for a general application, so that the chances of extracting relevant concepts increase. We also plan to let users specify custom concepts, as we have observed that some topics of interest to users are impossible to match in our current design because they are not in Wikipedia.

Genetic Algorithm vs Expert System

I'm having some doubts about which system I should use for a new piece of software.
No code has been written yet; I'm just breaking down all the requirements before starting to code.
This will be implemented at a computer company that provides services to other companies, on-site and remotely.
These are my variables:
Number of technicians
Location of customer
Type of problem
Services already scheduled for the technician
Expertise of the technician about the situation
Customer priority
Maybe some are missing, but these are the most important ones.
This job is currently being done manually, and as humans we sometimes fail to see the best route to take.
Let's say that a customer calls with a printer problem.
First, check which techs know about printers.
Then: is the tech available? Is he far from the customer? Can it be done remotely (a software issue)?
Can it be done by another tech who is closer to the customer's location?
Does this customer have higher priority than the one the same tech is already scheduled for?
Is the technician's schedule full? If yes, pass the job to another printer/hardware tech.
I know my English is not perfect (it is not my native language), but I'll try to provide more details or correct the text as needed.
So, my question is this: what kind of approach would you take? A genetic algorithm seems nice for this kind of job, and I also have some experience with GAF and Watchmaker (a Java GA framework). However, reading the text above, an expert system also seems appropriate.
Has anyone done something like this? I have searched for this kind of software and couldn't find anything similar.
Would another approach be better than the two I asked about?
Also, I'm building a table of all the techs' capabilities and expertise, with simple ratings from 1 to 5 for each skill. This is also a decision factor.
Thanks.
Why not do both? Use an expert system (a rule engine) to define your constraints, and use a metaheuristic (such as Local Search or Genetic Algorithms) to solve the problem. The planning engine OptaPlanner (Java, open source) does exactly that, by using the Drools rule engine.
Here's a video demonstrating the constraint flexibility on the vehicle routing problem (VRP). Your problem seems to be an advanced variant on VRP (which is a variant on TSP).
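To make the split concrete, here is a toy version of the problem (all data hypothetical): hard constraints play the role of the rule engine, a lexicographic (hard, soft) score keeps them inviolable, and brute-force enumeration stands in for the metaheuristic, which you would swap in once the search space gets large:

```python
import itertools

# Hypothetical techs with 1-5 skill ratings, and jobs with travel costs.
techs = {
    "alice": {"skills": {"printer": 5, "network": 2}},
    "bob":   {"skills": {"printer": 1, "network": 5}},
}
jobs = [
    {"skill": "printer", "priority": 3, "distance": {"alice": 10, "bob": 2}},
    {"skill": "network", "priority": 1, "distance": {"alice": 3, "bob": 8}},
]

def score(assignment):
    """Rule-style scoring: count broken hard constraints, then soft cost."""
    hard = soft = 0
    for job_idx, tech in assignment.items():
        job = jobs[job_idx]
        if techs[tech]["skills"].get(job["skill"], 0) < 3:
            hard += 1  # hard constraint: tech lacks the required expertise
        soft += job["priority"] * job["distance"][tech]
    return (hard, soft)  # lexicographic: never trade a hard break for cost

def solve():
    """Enumerate all assignments; a metaheuristic would search instead."""
    best = None
    for combo in itertools.product(techs, repeat=len(jobs)):
        cand = dict(enumerate(combo))
        if best is None or score(cand) < score(best):
            best = cand
    return best
```

The scoring function is where the expert-system rules live; the solver never needs to know why an assignment is bad, only that its score is worse.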
Maybe you can start off with TSP: http://en.m.wikipedia.org/wiki/Travelling_salesman_problem
I guess it only deals with the distance, though.
