Options to change the identified intent in Alexa fulfillment

My understanding is that Amazon ASK still does not provide:
1. The raw user input
2. An option for a fallback intent
3. An API to dynamically add possible options from which Alexa can be better informed to select an intent.
Is this right, or am I missing out on some critical capabilities?
Actions on Google w/ Dialogflow provides:
1. Raw user input for analysis: request.body.result.resolvedQuery
2. Fallback intents: https://dialogflow.com/docs/intents#fallback_intents
3. An API to dynamically add user expressions (aka sample utterances): PUT /intents/{id}
These tools give developers the ability to check whether the identified intent is correct and, if not, to fix it.
I know there have been a lot of questions asked previously, just a few here:
How to add slot values dynamically to alexa skill
Can Alexa skill handler receive full user input?
Amazon Alexa dynamic variables for intent
I have far more users on my Alexa skill than my AoG app simply because of Amazon's dominance to date in the market - but their experience falls short of a Google Assistant user experience because of these limitations. I've been waiting for almost a year for new Alexa capabilities here, thinking that after Amazon's guidance to not use AMAZON.LITERAL there would be improvements coming to custom slots. To date it still looks like this old blog post is still the only guidance given. With Google, I dynamically pull in utterance options from a db that are custom for a given user following account linking. By having the user's raw input, I can correct the choice of intent if necessary.
If you've wanted these capabilities but have had to move forward without them, what tricks do you have to get accurate intent handling with Amazon when you don't know what the user will say?
EDIT 11/21/17:
In September Amazon announced the Alexa Skill Management API (SMAPI), which does provide the 3rd bullet above.

This would really be better as a comment, but I don't post enough on Stack Overflow to be able to comment. I agree with you on all points.
But Amazon's Alexa also has one big advantage: the intent schema seems to directly influence the speech-to-text recognition. (By the way, can someone confirm whether this is correct?) With Google Home that does not seem to be the case, so matching unusual names is even harder there than on Alexa, and it sometimes recognizes complete nonsense.
I'm not sure which I prefer at the moment. My feeling is that for small apps Alexa is much better, because it matches the intent phrases better when it has fewer choices. But with large intent schemas it runs into real trouble, and in my tests some of the intents were not matched correctly at all. Here Google Home and the Actions SDK probably win: speech-to-text seems to happen first, and then a string-pattern match against the intent schema is performed, which is presumably more robust for larger schemas.
To give something like an answer to your questions: you can try to add as many values as possible that could be said to a slot, and then match the value from the Alexa request against your database via Jaro-Winkler or some other string distance, as in the sketch below.
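A minimal sketch of that matching step. It uses difflib from the Python standard library as the string-distance measure; a dedicated Jaro-Winkler implementation (e.g. the jellyfish package) could be dropped in instead, and the 0.8 threshold is just an illustrative value:

```python
from difflib import SequenceMatcher

def best_match(slot_value, known_values, threshold=0.8):
    """Return the known value closest to what Alexa heard, or None if nothing is similar enough."""
    def similarity(a, b):
        # Stand-in for Jaro-Winkler or another string distance of your choice.
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()

    score, value = max((similarity(slot_value, v), v) for v in known_values)
    return value if score >= threshold else None

# e.g. best_match("jon smitt", ["John Smith", "Jane Smythe"]) -> "John Smith"
```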
What I tried with Alexa was to find phrases that are close to what the user might say, and I added these as phrases to fill a slot.
So a module on our web page became an intent in the schema, and I then asked the user to say exactly what should be done in that module (this was the slot-filling request); the answer was the slot-filling utterance.
For me that worked slightly better than the regular intent schema, but it requires more talking, so I don't like it very much.

Let me go straight to answering your 3 questions:
1) Alexa does provide the raw input via the AMAZON.LITERAL slot type, but it's now deprecated and you're advised to use AMAZON.SearchQuery for free-form capture. However, if instead of using SearchQuery you define a custom slot type and provide samples (training data), the ASR will work better.
2) Alexa has supported AMAZON.FallbackIntent since, I believe, May 2018. The way it works is by automatically generating an out-of-domain model for your skill, so that requests that don't match any of your intents are routed to the fallback intent. It works well.
3) Dynamically adding slot type values is not feasible, because when you provide samples you're really providing training data for a model that will then be able to process similar values beyond the ones you defined. Notice that after you provide a voice interaction model schema you have to build the model (in this step the training data provided in the samples is used to create the model). For example, if you define a custom slot of type "Car" and provide the samples "Toyota", "Jeep", "Chevrolet" and "Honda", the system will also route to the same intent if the user says "Ford".
Note: SMAPI does allow you to get and update the interaction model, so technically you could download the model via the API, modify it with new training data, upload it again and rebuild the model. This is somewhat awkward, though.
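For what it's worth, a rough sketch of that download/modify/re-upload loop against the SMAPI interaction model endpoints might look like the following. The endpoint path, model layout and IDs here are assumptions based on the public SMAPI docs, and you would need a Login with Amazon access token; treat it as a starting point rather than a working client:

```python
import requests

SMAPI_BASE = "https://api.amazonalexa.com"

def add_slot_values(access_token, skill_id, locale, slot_type, new_values):
    """Download the interaction model, append slot values, and re-upload it."""
    url = (f"{SMAPI_BASE}/v1/skills/{skill_id}/stages/development/"
           f"interactionModel/locales/{locale}")
    headers = {"Authorization": f"Bearer {access_token}"}

    # 1) Download the current interaction model.
    model = requests.get(url, headers=headers).json()

    # 2) Append new sample values to the chosen custom slot type.
    for slot in model["interactionModel"]["languageModel"]["types"]:
        if slot["name"] == slot_type:
            slot["values"].extend({"name": {"value": v}} for v in new_values)

    # 3) Upload the modified model; Alexa rebuilds it asynchronously.
    requests.put(url, headers=headers, json=model).raise_for_status()

# Hypothetical usage:
# add_slot_values(token, "amzn1.ask.skill.1234", "en-US", "Car", ["Ford", "Tesla"])
```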

Related

How can I make Watson assistant repeat the last response?

I would like to have my Watson Assistant repeat the response that was given before.
I have an Intent that matches e.g. "Can you please repeat that?".
Then if that Intent is matched, the answer should be the same that was given before.
Is there a way to achieve that?
How can I access the answer from before in the expression language?
Unfortunately there's no easy way to do this within Watson Assistant itself. The most straightforward way would be to configure each dialog node to store the value of output.generic.text in a context variable, and then you could use that variable as needed. But if you have many dialog nodes, that could be a tedious task and pose some maintenance headaches.
Another approach would be to try to tackle this at the application layer, which would have some advantages. If your application could catch requests to repeat, then you could handle those (caching the previous turn's dialog response and just repeating it upon request). The advantage to this approach is that by not handling a repetition request within Watson Assistant itself, you won't be interfering with the current session state -- this could be especially useful if the user was in the middle of providing prompted information for a slot, for instance. You wouldn't need to manage this as a digression or anything like that.
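A minimal sketch of that application-layer approach, assuming Python. call_watson_assistant is a hypothetical placeholder for however you call the Watson Assistant message API (SDK or REST), and the regex for spotting repeat requests is only illustrative:

```python
import re

REPEAT_PATTERN = re.compile(r"\b(repeat|say (that|it) again)\b", re.IGNORECASE)
last_response = {}  # previous turn's reply text, keyed by session id

def call_watson_assistant(session_id, user_text):
    # Placeholder: send the message to Watson Assistant (SDK or REST) and
    # return the reply text, e.g. response['output']['generic'][0]['text'].
    raise NotImplementedError

def handle_user_message(session_id, user_text):
    # Catch "repeat" requests before they ever reach Watson Assistant, so the
    # current dialog/slot state is left untouched.
    if REPEAT_PATTERN.search(user_text) and session_id in last_response:
        return last_response[session_id]

    reply_text = call_watson_assistant(session_id, user_text)
    last_response[session_id] = reply_text
    return reply_text
```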
But if you aren't able to handle this at the application layer, there aren't any really great options. I think it would be great if IBM considered adding a global response repeat function at the Assistant level (would be especially great for Voice Agent / Voice Gateway applications).

Creating different response per device

Is it possible for an Alexa user to have different responses based on configuration in the app? For example, my skill returns measurements; some users may prefer metric and others imperial. I'd like users to be able to specify this (and maybe some other things) to give a personalised experience. Can this be configured in the Amazon Alexa app?
I was thinking I might have to have some persistent storage for this (DDB for example) which would mean the app would write to the DDB and the skill would read from it to get the personalised response.
Thanks
Can this be configured in the Amazon Alexa app?
Unfortunately not in the way which you seem to be suggesting.
If you really wanted users to set preferences through the app, this could be done through account linking. However, it is generally discouraged (Alexa is meant to be "Voice-First") and likely to present additional obstacles if what you're wanting to do is allow users to set preferences for different devices.
However, using persistent storage for user preferences is generally a good idea, and as you've suggested, DynamoDB can do this.
If you take this approach you could ask users what their preferences are the first time they use the skill on a device, and store this together with the device ID.
There is some good information about device ID in the Amazon documentation and some helpful tips here:
Get unique device id for every amazon echo devices
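As a rough sketch of that approach (assuming Python with boto3; the table name, key and attribute names are hypothetical, and a production skill would add error handling or use the ASK SDK's persistence adapter instead):

```python
import boto3

# Hypothetical DynamoDB table keyed by device ID.
table = boto3.resource("dynamodb").Table("SkillPreferences")

def get_units(device_id, default="metric"):
    item = table.get_item(Key={"deviceId": device_id}).get("Item")
    return item["units"] if item else default

def save_units(device_id, units):
    table.put_item(Item={"deviceId": device_id, "units": units})

def handle_measurement_intent(event):
    # The device ID comes from the Alexa request envelope.
    device_id = event["context"]["System"]["device"]["deviceId"]
    units = get_units(device_id)
    value = "21 degrees Celsius" if units == "metric" else "70 degrees Fahrenheit"
    return {  # minimal Alexa response envelope
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": f"It is {value}."},
            "shouldEndSession": True,
        },
    }
```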

how to prepare data for domain specific chat-bot

I am trying to make a chatbot. All the chatbots I have looked at are built from structured data. I looked at Rasa, IBM Watson, and other well-known bots. Is there any way to convert unstructured data into some sort of structure that can be used for bot training? Consider the paragraph below:
Packaging unit
A packaging unit is used to combine a certain quantity of identical items to form a group. The quantity specified here is then used when printing the item labels so that you do not have to label items individually when the items are not managed by serial number or by batch. You can also specify the dimensions of the packaging unit here and enable and disable them separately for each item.
It is possible to store several EAN numbers per packaging unit since these numbers may differ for each packaging unit even when the packaging units are identical. These settings can be found on the Miscellaneous tab:
There are also two more settings in the system settings that are relevant to mobile data entry:
When creating a new item, the item label should be printed automatically. For this reason, we have added the option ‘Print item label when creating new storage locations’ to the settings. When using mobile data entry devices, every item should be assigned to a storage location, where an item label is subsequently printed that should be applied to the shelf in the warehouse to help identify the item faster.
How can I make a bot from such data? Any lead would be highly appreciated. Thanks!
Will the idea in this picture work? just_a_thought
The data you are showing seems to be a good candidate for a passage search. Basically, you would like to answer user questions with the most relevant paragraph found in your training data. This use case is handled by the Watson Discovery service, which can analyze unstructured data like you are providing; you can then query the service with input text and the service answers with the closest passage found in the data.
From my experience you can also get good results by implementing your own custom TF/IDF algorithm tailored for your use case (TF/IDF is a nice similarity search that handles e.g. stopwords for you).
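As a small illustration of the TF/IDF route (using scikit-learn here, which is just one possible implementation), you index each paragraph of the documentation and answer a question with the most similar one:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Each paragraph of the (unstructured) documentation becomes one "passage".
paragraphs = [
    "A packaging unit is used to combine a certain quantity of identical items ...",
    "It is possible to store several EAN numbers per packaging unit ...",
    "When creating a new item, the item label should be printed automatically ...",
]

vectorizer = TfidfVectorizer(stop_words="english")
doc_matrix = vectorizer.fit_transform(paragraphs)

def best_passage(question):
    scores = cosine_similarity(vectorizer.transform([question]), doc_matrix)[0]
    return paragraphs[scores.argmax()]

print(best_passage("How many EAN numbers can a packaging unit have?"))
```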
Now, if your goal is to bootstrap a rule-based chatbot using this kind of data, then the data is not ideal. For a rule-based chatbot, the best data would be actual conversations between users asking questions about the target domain and the answers given by a subject-matter expert. Using such data you might at least be able to do some analysis to pinpoint the relevant topics and domains the chatbot should handle; however, I think you will have a hard time using this data to bootstrap a set of intents (the questions users will ask) for a rule-based chatbot.
TL;DR
If I wanted to use a Watson service, I would start with Watson Discovery. Alternatively, I would implement my own search algorithm starting with TF/IDF (which maps rather nicely to your proposed solution).

Anything wrong with this BDD spec?

I'm new to Behavior Driven Design (BDD) so I'd like feedback on whether I've applied it correctly.
The feature I'm about to develop exports a list of Twitter handles ("authors") from a data repository and sends it as a CSV email attachment to the requestor. The data repository is Google App Engine's data store.
Reviewing Dan North's main description of Behavior Driven Testing: http://dannorth.net/introducing-bdd/, here is how I would specify my test:
Title: Analyst exports authors
As an analyst
I want to export authors
So I can analyze what they talk about in another system.
Scenario 1: Segment contains authors and export can be sent
Given the segment contains authors
And the user has provided their email address in their user profile
When the user clicks export for that segment
Then ensure the user receives the authors as a CSV attachment to an email.
Scenario 2: Segment contains authors but export can't be sent
Given the segment contains authors
And the user has not provided their email address in their user profile
When the user clicks export for that segment
Then remind the user to set their email address in their user profile first.
Scenario 3: Segment contains no authors
Given the segment contains no authors
When the user clicks export for that segment
Then prevent the user from clicking the export button
And display a message that the segment has no authors to export.
A few questions:
What other scenarios should I consider or have I scoped this story too narrowly?
Should Scenario 3 be part of a different story? Dan North says the scenarios should share the same event. However, the user experience would probably be such that the user could not click the export button because it's disabled. Should I have written the event differently so that it would fit all scenarios, for example, "When the user goes to export the segment"?
Is there anything else that would make this a better BDD test?
Any tips for how to implement a test like this that depends on other systems, such as the Google App Engine data store and email systems? Should I stub the data store? How do I test that an email attachment is received?
One event per story?
We'd probably say these days that the scenarios should be about the same capability; in this case, the capability to export authors via email. Most of them will have an event that performs the export, but anything else related to that capability belongs with it too.
I normally make the default case successful, i.e.:
When the user exports the segment
And use tries for when it fails:
When the user tries to export the segment
However, in the case that the export button is disabled, there's probably something else going on. What's the trigger for the user seeing that message that there are no authors? It would probably be something like:
Given there are no authors for a segment
When the user views that segment
Then they should be told that there are no authors for that segment.
How could you improve the scenarios?
Avoid mixing the UI domain in with your Twitter app domain. Talk about "when a user exports that segment" rather than the button-click. Perhaps one day it will be a drag-and-drop on a touch-screen, rather than a button-click. If perhaps you decide that way works better, and today is the day, you can change it and your scenarios will still be valid. Apart from the button-click your scenarios are really good; better than average, certainly.
In BDD we try to avoid using the word "test" unless we're really talking about testing, and these are just scenarios, or examples, which happen to provide tests as a nice by-product. Call them examples or scenarios and you'll find it easier to talk to business people about it, and to think of more examples yourself. I've also written a blog post on a couple of patterns I use for spotting missing scenarios, which might help you. Testers are really good at spotting these, so bring one into the conversation if you can. If you can't, try and find another analyst or business expert to put that hat on.
How many scenarios are in a story?
A story is just a slice through a feature for faster feedback, and the number of scenarios you might have in it is largely arbitrary and dependent on how easy the team find these things to code. Focus on getting feedback from a stakeholder at the end of a sprint, and make the stories small enough that you can get that feedback internally quickly, and large enough that you can show the stakeholders something interesting. I know teams who code entire features (and are happy showcasing part-way through), and teams who make each scenario a separate story, and teams who do something in between (all the successful exports, then all the edge-cases, for instance). All these approaches are valid.
How to include dependent systems?
You can either mock these out (by writing your own stub system, or by using a mocking framework or for those unlucky souls using SOAP, SOAP UI, etc.) or you can use the real system. In your case, I'd be tempted to deploy to a test version of the real system, and set up a test email domain to which I could send the emails (or stub that out, as above). There are lots of ways to access email inboxes via APIs, or you could always automatically export them to a file or database, etc.
Do whatever allows you to get fast feedback on your code, remembering that any system you don't automatically test will have to be manually verified, that things like email gateways are unlikely to change (and therefore unlikely to have bugs), and that BDD is not a substitute for manually testing at least once anyway.
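To make the stubbing idea concrete, here is a self-contained sketch of step definitions in Python with behave; ExportService, InMemoryAuthorStore and StubEmailGateway are hypothetical stand-ins for the real GAE datastore-backed service and email gateway, not part of the original question:

```python
import csv
import io

from behave import given, when, then


class InMemoryAuthorStore:
    def __init__(self, authors):
        self.authors = authors


class StubEmailGateway:
    """Records outgoing messages instead of talking to a real mail server."""
    def __init__(self):
        self.sent = []

    def send(self, to, subject, attachment):
        self.sent.append({"to": to, "subject": subject, "attachment": attachment})


class ExportService:
    def __init__(self, store, email_gateway):
        self.store = store
        self.email_gateway = email_gateway

    def export_segment(self, user_email):
        buf = io.StringIO()
        writer = csv.writer(buf)
        writer.writerow(["author"])
        for author in self.store.authors:
            writer.writerow([author])
        self.email_gateway.send(to=user_email, subject="Author export",
                                attachment=buf.getvalue())


@given("the segment contains authors")
def step_given_authors(context):
    context.email = StubEmailGateway()
    context.service = ExportService(store=InMemoryAuthorStore(["@alice", "@bob"]),
                                    email_gateway=context.email)


@when("the user exports the segment")
def step_when_export(context):
    context.service.export_segment(user_email="analyst@example.com")


@then("the user receives the authors as a CSV attachment to an email")
def step_then_csv(context):
    assert len(context.email.sent) == 1
    assert "@alice" in context.email.sent[0]["attachment"]
```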
The scenarios you've given do two things
Explore the happy path: the authors can be exported
Start exploring sad paths: the authors can't be exported for various reasons
In general I would concentrate on making the happy path as clean and simple as possible before dealing with sad paths, otherwise you end up doing too much at once.
Currently you are doing too much at once, and you have a number of anomalies in your scenarios. Let's look at a couple of them:
You've talked about a 'segment' in your scenarios, but I have no idea what a segment is or why it should have authors. It sounds like it's an implementation detail, and doesn't belong.
You've specifically said that the report should be a CSV sent to an email address. Why are you choosing this mechanism? Wouldn't it be easier to just let the analyst download the CSV?
What I'd suggest is that you examine the happy path and go through the language with a fine-toothed comb, trying to remove all these sorts of anomalies, so you can get to the essence of the problem. This might end up being as simple as:
Given there are some authors
When I ask for an export of authors
Then I should get a list of authors
From this you can then add stuff specific to your context. Let's say that only analysts can do this. Then we might get:
Given I am an analyst
And there are some authors
When ...

How does Zapier/IFTTT implement the triggers and actions for different API providers?

How does Zapier/IFTTT implement the triggers and actions for different API providers? Is there a generic approach, or are they implemented individually?
I think the implementation is based on REST/OAuth, which is generic at a high level. But Zapier/IFTTT define a lot of trigger conditions and filters, and these conditions and filters must be specific to each provider. Is the corresponding implementation done individually or generically? If individually, that must require a vast amount of labor. If generically, how is it done?
Zapier developer here - the short answer is, we implement each one!
While standards like OAuth make it easier to reuse some of the code from one API to another, there is no getting around the fact that each API has unique endpoints and unique requirements. What works for one API will not necessarily work for another. Internally, we have abstracted away as much of the process as we can into reusable bits, but there is always some work involved to add a new API.
PipeThru developer here...
There are common elements to each API which can be re-used, such as OAuth authentication, common data formats (JSON, XML, etc). Most APIs strive for a RESTful implementation. However, theory meets reality and most APIs are all over the place.
Each service offers its own endpoints, and there is no commonly agreed-upon set of endpoints that is correct for given services. For example, within CRM software, it's not clear how a person, notes on said person, corresponding phone numbers, addresses, and activities should be represented. Do you provide one endpoint or several? How do you update each? Do you provide tangential records (like the company for the person) with the record or not? Each requires specific knowledge of that service as well as some data normalization.
Most of the triggers involve checking for a new record (unique id) or an updated field, most usually the last-update timestamp. Most services present their timestamps in ISO 8601 format, which makes parsing timestamps easy, but not every service does. Dropbox actually provides a delta API endpoint to which you can present a hash value, and Dropbox will send you everything new or changed from that point. I'd love to see delta and/or activity endpoints in more APIs.
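A rough sketch of that kind of polling-trigger logic in Python (the field names id and updated_at are assumptions about the polled API's response shape):

```python
from datetime import datetime

seen_ids = set()     # record ids we have already triggered on
last_cursor = None   # timestamp of the newest record processed so far

def parse_iso8601(ts):
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

def new_or_updated(records):
    """Return the polled records that are new or changed since the last poll."""
    global last_cursor
    triggered, newest = [], last_cursor
    for record in records:
        updated = parse_iso8601(record["updated_at"])
        if record["id"] not in seen_ids or last_cursor is None or updated > last_cursor:
            seen_ids.add(record["id"])
            triggered.append(record)
        if newest is None or updated > newest:
            newest = updated
    last_cursor = newest
    return triggered
```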
Bottom line, integrating each individual service does require a good amount of effort and testing.
I will point out that Zapier did implement an API for other companies to plug into their tool. Instead of Zapier implementing your API and Zapier polling you for data, you can send new/updated data to Zapier to trigger one of their Zaps. I like to think of this like webhooks on crack. This allows Zapier to support many more services without having to program each one.
I've implemented a few APIs on Zapier, so I think I can provide at least a partial answer here. If not using webhooks, Zapier will examine the API response from a service for the field with the shortest name that also includes the string "id". Changes to this field cause Zapier to trigger a task. This is based off the assumption that an id is usually incremental or random.
I've had to work around this by shifting the id value to another field and writing different values to id when it was failing to trigger, or triggering too frequently (dividing by 10 and then writing id can reduce the trigger sensitivity, for example). Ambiguity is also a problem, for example in an API response that contains fields like post_id and mesg_id.
Short answer is that the system makes an educated guess, but to get it working reliably for a specific service, you should be quite specific in your code regarding what constitutes a trigger event.
