Alexa Skill - How to get the complete text of a statement asked to Alexa

I am creating an Alexa skill. I have coded several custom and default intents and they are working fine.
Now I want to write a fallback intent in which I get the exact statement that was asked of / sent to the Alexa skill. Is there a way to get the entire question string/text that was asked of the skill? I know we can get slot values and intent information, but I need the entire text statement sent to the skill.
Thanks

Well, I had faced the same issue. After trying several methods, I got the complete text of the statement asked to Alexa.
You have to make the following setup in your Alexa skill (the intent name, slot name, and slot type are up to you):
Setting up Intent
Setting up custom slot type
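In the interaction model this boils down to a custom intent whose sample utterance is just the slot itself, backed by a custom slot type with a few throw-away sample values. A rough sketch (the intent name sample and slot name sentence match the request below; the slot type name CatchAllType, its sample values, and the invocation name are placeholders you would replace with your own; the required built-in intents such as AMAZON.CancelIntent, AMAZON.HelpIntent, and AMAZON.StopIntent are omitted for brevity):
{
  "interactionModel": {
    "languageModel": {
      "invocationName": "my test skill",
      "intents": [
        {
          "name": "sample",
          "slots": [
            { "name": "sentence", "type": "CatchAllType" }
          ],
          "samples": [
            "{sentence}"
          ]
        }
      ],
      "types": [
        {
          "name": "CatchAllType",
          "values": [
            { "name": { "value": "hello how are you" } },
            { "name": { "value": "tell me a joke about penguins" } },
            { "name": { "value": "what is the weather like tomorrow" } }
          ]
        }
      ]
    }
  }
}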
After setting up your skill, invoke it (keep some response in place for the launch request), say anything you want, and you can catch the entire text in the incoming request, as shown here:
"intent": {
"name": "sample",
"confirmationStatus": "NONE",
"slots": {
"sentence": {
"name": "sentence",
"value": "hello, how are you?",
"resolutions": {
"resolutionsPerAuthority": [
{
"authority": "xxxxxxx",
"status": {
"code": "xxxxxxx"
}
}
]
},
"confirmationStatus": "NONE",
"source": "USER"
}
}
}
Note: with this method you will need to handle your utterances carefully if there is more than one intent, otherwise the catch-all slot can swallow utterances meant for the other intents.
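On the code side, reading that value is then straightforward. A minimal handler sketch, assuming the ASK SDK v2 for Node.js and the intent/slot names shown above:
// Sketch only: assumes ask-sdk-core (ASK SDK v2 for Node.js) and the "sample"
// intent with its catch-all "sentence" slot from the request above.
const Alexa = require('ask-sdk-core');

const SampleIntentHandler = {
  canHandle(handlerInput) {
    return Alexa.getRequestType(handlerInput.requestEnvelope) === 'IntentRequest'
      && Alexa.getIntentName(handlerInput.requestEnvelope) === 'sample';
  },
  handle(handlerInput) {
    // The entire spoken statement lands in the catch-all slot.
    const slots = handlerInput.requestEnvelope.request.intent.slots;
    const sentence = (slots && slots.sentence && slots.sentence.value) || '';

    return handlerInput.responseBuilder
      .speak(`You said: ${sentence}`)
      .getResponse();
  }
};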

There's no way to get the whole utterance straight from a top-level intent. Right now the closest you can get is a slot of the built-in type AMAZON.SearchQuery (rather than a custom slot type, as suggested in another answer), but you will have to define an anchor phrase in your utterance that goes before the slot. For example, you would define an utterance like:
search {query}
where query is a slot of type AMAZON.SearchQuery.
The anchor word search in the utterance is mandatory (a requirement of the SearchQuery type), so as long as the user starts the utterance by saying search, anything that follows will be captured, which is pretty close to what you want to achieve.
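In the interaction model, such an intent would look roughly like this (the intent name is illustrative; the query slot and the "search {query}" sample are the ones described above):
{
  "name": "SearchIntent",
  "slots": [
    { "name": "query", "type": "AMAZON.SearchQuery" }
  ],
  "samples": [
    "search {query}"
  ]
}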
Having said that, there is actually one indirect way to approximate capturing the whole utterance the user says (filtered by NLU), leveraging AMAZON.SearchQuery, but only as part of an ongoing dialog using Dialog Management. If you're engaging in a dialog of this kind, where Alexa automatically uses defined prompts to solicit slot values, you can define a response utterance that is a single, isolated slot of type AMAZON.SearchQuery with no anchor. Example:
Alexa: Ok, I will create a reminder for you. Please tell me the text of the reminder
User: Pick up the kids from school
Alexa: Ok. I will remind you to Pick up the kids from school
In the example above Alexa detects that the user wants to create a reminder, but there's no reminder text yet, so it elicits the slot. When you, as a developer, define the prompts that Alexa uses to ask, you also define the possible responses. In this case you can define a response utterance that is just:
{query}
and capture the whole thing the user says in response to the prompt, e.g. "pick up the kids from school".
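Code-wise, the flow can be sketched like this with the ASK SDK v2 for Node.js. The intent name and prompt texts are illustrative; whether automatic dialog delegation prompts for the slot or you elicit it yourself as below, reading the captured value is the same:
// Sketch only: "CreateReminderIntent" is an illustrative intent with a "query"
// slot of type AMAZON.SearchQuery and a response utterance of just "{query}",
// as described above.
const Alexa = require('ask-sdk-core');

const CreateReminderIntentHandler = {
  canHandle(handlerInput) {
    return Alexa.getRequestType(handlerInput.requestEnvelope) === 'IntentRequest'
      && Alexa.getIntentName(handlerInput.requestEnvelope) === 'CreateReminderIntent';
  },
  handle(handlerInput) {
    const slots = handlerInput.requestEnvelope.request.intent.slots;
    const reminderText = slots && slots.query && slots.query.value;

    if (!reminderText) {
      // Prompt for the slot; whatever the user says next is captured by "{query}".
      return handlerInput.responseBuilder
        .speak('Ok, I will create a reminder for you. Please tell me the text of the reminder.')
        .reprompt('What should the reminder say?')
        .addElicitSlotDirective('query')
        .getResponse();
    }

    return handlerInput.responseBuilder
      .speak(`Ok. I will remind you to ${reminderText}`)
      .getResponse();
  }
};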

The English (US) language model has a slot type called AMAZON.LITERAL that lets you capture the exact phrase or sentence used (depending on how it's used in your utterance). This slot type, however, isn't available in other regions.
Amazon also doesn't recommend using it:
Although you can submit new and updated English (US) skills with AMAZON.LITERAL, custom slot types provide better accuracy than AMAZON.LITERAL in most cases. Therefore, we recommend that you consider migrating to custom slot types if possible. Note that AMAZON.LITERAL is not supported for any language other than English (US).
See: https://developer.amazon.com/docs/custom-skills/literal-slot-type-reference.html
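For reference, the legacy sample-utterance format for AMAZON.LITERAL paired an example phrase with the slot name inside the braces, something like the line below (the intent and slot names are illustrative):
StatementIntent tell me {I had a great day at work|Phrase}
Whatever the user says in place of the example phrase is passed through in the Phrase slot without any slot-type conversion.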

There used to be a slot type called AMAZON.LITERAL that was allowed in specific regions. However, it has since been deprecated.
There is however another solution to this problem using custom slots.
Let's say we are creating a food-ordering system on Alexa: a skill like Zomato or Yelp, but for Alexa. Let's give the skill the invocation name robert.
First we make a list of the kinds of statements that are going to be made. You can skip this step if your skill isn't this specific; it just helps you define the kinds of statements your skill might expect to encounter.
Alexa order robert to send me a chicken steak with mashed potatoes.
Alexa ask robert to recommend me some good Indian restaurants near me.
Alexa please tell robert to rate Restaurant XYZ's recent delivery with a single star.
After we have made the list of statements, we store them in a CSV file (these will become the values of our custom slot type).
We go ahead and click on the Add button beside Slot Types.
Give your custom slot type a name and add the statements from the CSV as its values.
Once you are done with this, come up with the list of constructs in which your skill can be invoked. Some of them are given below.
Alexa ask robert to ...
Alexa make robert ...
Alexa order robert to ...
Alexa tell robert to ...
The three dots (...) represent the actual part of the order/statement, i.e. the text you are interested in extracting. For example, for the statement
Alexa ask Robert to send me a bucket of chicken nuggets.
you would be interested in extracting only the trailing part: "send me a bucket of chicken nuggets".
Now, Amazon classifies statements by intent. There are a handful of default, predefined intents for cancelling, help, and other basic functionality. We go ahead and create a custom intent for dealing with the mainstream statements that will primarily be used to interact with our skill.
In the new custom intent window, at the bottom of the page is the space to add the slots used by the intent. We add our previously created custom slot type and name the slot literal. (You can name it anything.)
The custom slot, literal in our case, holds the text we want extracted from the user's statement.
Now we replace the three dots (...) in the list of constructs with {literal} and add the results to the list of sample utterances.
For the statement
Alexa order robert to send me a chicken steak with mashed potatoes.
the request JSON would contain a section like this for the custom intent, with the custom slot holding the captured text.
"request": {
"type": "IntentRequest",
"requestId": "",
"timestamp": "2019-01-01T19:37:17Z",
"locale": "en-IN",
"intent": {
"name": "InteractionIntent",
"confirmationStatus": "NONE",
"slots": {
"literal": {
"name": "literal",
"value": "to send me a chicken steak with mashed potatoes.",
"resolutions": {
"resolutionsPerAuthority": [
{
"authority": "",
"status": {
"code": ""
}
}
]
},
"confirmationStatus": "NONE",
"source": "USER"
}
}
}
}
In the slots subsection of the custom intent we have our literal slot, whose value gives us the text of the user's speech:
"slots": {
  "literal": {
    "name": "literal",
    "value": "to send me a chicken steak with mashed potatoes."

Related

Accept any user input for Alexa Custom Skill, but include only part of that input in confirmation

I'm trying to build a custom Alexa skill for a gratitude diary. The goal is to have an interaction in which the Alexa device asks the user what they're grateful for, and repeats it back as confirmation.
I'm encountering a problem when it comes to repeating back what the user has said. I'd like the conversation to go like this:
Alexa: What are you grateful for today?
User: I'm grateful for dogs
Alexa: You said you're grateful for dogs. Is that correct?
I've set this up as a single intent, as follows:
gratitude_object as a required slot, of type AMAZON.SearchQuery
user utterances for this slot are I'm grateful for {gratitude_object} (and a few variations)
confirmation message for this slot is You're feeling grateful for {gratitude_object}. Is that correct?
The problem I'm encountering is that when I test this model in the Utterance Profiler, it goes like this:
Alexa: What are you grateful for today?
User: I'm grateful for dogs
Alexa: You said you're grateful for I'm grateful for dogs. Is that correct?
I'm guessing this is something to do with the fact that AMAZON.SearchQuery will accept anything as valid input, but I'm not sure how to go about resolving this.
I've also tried creating a custom slot for the I'm grateful for phrase:
slot name: gratitude_phrase_initiator
slot type: custom slot type
slot values: "I'm grateful for", "I am grateful for" (etc)
However, if I then try to use this slot in my intent, by making the user utterance for the gratitude_object slot:
{gratitude_phrase_initiator} {gratitude_object}
...then I get the following error:
Sample utterance "{gratitude_statement_initiator} {gratitude_object}" for slot "gratitude_object" in intent "NewEntryIntent" cannot include both a phrase slot and another intent slot. Error code: InvalidSlotSamplePhraseSlot
I'd really like to keep the interaction as it is currently, with the user starting by saying "I'm grateful for...". Any suggestions for how I could make this work using the interaction model, or is it just impossible? Is this something I could handle in the code instead of the interaction model?
It looks like you set it up correctly; you would expect SearchQuery to do a better job of excluding the initial utterance phrase, but it doesn't, so you'll have to parse the value some more in code. In your Lambda function, string-replace the slot value to remove any initial phrases.
Example in Node.js:
// Raw slot value, e.g. "I'm grateful for dogs"
var gratitude_object = this.event.request.intent.slots.gratitude_object.value;

// Leading phrases to strip, written in lowercase
var initial_phrases = [
  "i'm grateful for",
  "i am grateful for",
];

initial_phrases.forEach(function(value){
  // Lowercase the slot value before replacing so case doesn't matter
  gratitude_object = gratitude_object.toLowerCase().replace(value, "");
});

// Trim the leftover leading space, e.g. " dogs" -> "dogs"
gratitude_object = gratitude_object.trim();
Notice that the array of initial phrases is written in lowercase, and the forEach loop also lowercases the slot value before replacing. That way you don't have to worry about case matching when writing the initial phrases you want to remove. The final trim() removes the leading space left behind by the replacement.

Using sessionAttributes in Alexa Skill

I am building an Alexa skill and am not quite sure if I am using sessionAttributes correctly. I know sessionAttributes are used to carry a session's data forward to the next invocation.
So I have these two intents
1) ListToDoItem
In this intent my skill will look into a database and list out the to-do items stored there. After listing the items, Alexa will go on to say "do you want me to list detailed info on these to-do items?". To handle this, I pass the items retrieved in the previous request as sessionAttributes. When asked to list detailed info on the items, I extract the previously forwarded sessionAttributes and compose a detailed speech response.
So for this intent I have two sample utterances:
list my to-do items
yes
The utterance 'yes' will be used so that the sessionAttributes can be extracted to create a detailed speech response.
2) ListDoneItems
This intent will be used to list out completed items. It is similar to the previous intent, the only difference being that it lists completed items instead of to-do items.
For this intent I will have 2 sample utterances:
list my completed items
yes
Like before, it has a 'yes' to generate a detailed speech response based on the sessionAttributes.
But the problem I have is that when I reply 'yes' to the ListDoneItems intent's "Do you want me to list the completed items?", the next intent request generated is of type ListToDoItems instead of ListDoneItems, even though I have set shouldEndSession to false in my skill response. This is happening because there is a crossover in sample utterances between my intents. So is having the same utterance in different intents wrong? How should I design the interaction model to create a multi-turn dialog that uses sessionAttributes?
I think this will be of use to someone searching for an answer.
Basically, you should not include phrases for your re-prompts in your sample utterances; i.e. in my case I should not add 'yes' as an utterance. Instead I should be using AMAZON.YesIntent.
When using AMAZON.YesIntent, you should maintain a state machine in your sessionAttributes pointing to the last invoked intent. For example, if two or more of your intents have a case where the user's response could invoke a YesIntent, you should store the last invoked intent name and the associated session data in the sessionAttributes. Then, in the function which handles the YesIntent, check the state of your previous invocation and delegate control to the corresponding intent handler.
In my case I store the previously invoked intent name as a key and its associated data as its value in session.attributes:
"session": {
"new":"false",
"sessionId": "sessionId",
"application": {
"applicationId": "applicationId"
},
"attributes": {
"PreviousIntent": {
"PreviousIntentData"
}
}
In the function which handles the YesIntent, check for the previous state (session.attributes.PreviousIntent) and delegate control to the function which handles that intent.
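A rough sketch of that YesIntent dispatch, assuming the ASK SDK v2 for Node.js (listToDoDetails and listDoneDetails are assumed helper functions that build the detailed responses from the stored data):
// Sketch only: dispatches AMAZON.YesIntent based on which intent stored its
// data in the session attributes, following the scheme described above.
const Alexa = require('ask-sdk-core');

const YesIntentHandler = {
  canHandle(handlerInput) {
    return Alexa.getRequestType(handlerInput.requestEnvelope) === 'IntentRequest'
      && Alexa.getIntentName(handlerInput.requestEnvelope) === 'AMAZON.YesIntent';
  },
  handle(handlerInput) {
    // The previously invoked intent name was stored as a key in the session
    // attributes, with its associated data as the value.
    const attributes = handlerInput.attributesManager.getSessionAttributes();

    if (attributes.ListToDoItem) {
      return listToDoDetails(handlerInput, attributes.ListToDoItem);
    }
    if (attributes.ListDoneItems) {
      return listDoneDetails(handlerInput, attributes.ListDoneItems);
    }
    return handlerInput.responseBuilder
      .speak("I'm not sure what you are saying yes to.")
      .getResponse();
  }
};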

Voting Network Architecture

Alright, my question could be a bit long, so kindly bear with me.
I don't have a bug or an error; it's more of a conceptual question.
I am building a social network website, much like 9gag.com. It has an up-vote and down-vote feature associated with the posts, comments and replies made by users.
I am building this website with Laravel and Angular as my back-end and front-end frameworks respectively, and I have managed to build it.
Now everything is working great, except that I am sending 2 extra HTTP requests per comment and reply just to check whether the logged-in user has already upvoted or downvoted that comment or reply. In the backend I run a query to find out whether they did and apply ng-class accordingly, but that takes too much time, and with 10 comments (the page limit) it already means 20 extra requests. So my question is: what is the most elegant way to handle these "has the logged-in user already voted on this?" checks?
Include the vote status in the API response that lists the comments.
Your API response should look something like this for a logged-in user:
[{
  "id": 123,
  "text": "This is the comment. I voted it up.",
  "vote": "up"
},
{
  "id": 456,
  "text": "This is another comment. I haven't voted on it.",
  "vote": null
},
{
  "id": 789,
  "text": "This is another comment. I didn't like this one so I voted it down.",
  "vote": "down"
}]

Passing variables into Watson Dialog

In many situations it may be helpful to pass known information (e.g. the user's name, to present a personalized greeting) into a new Watson Dialog conversation, so as to avoid asking the user redundant or unnecessary questions. Looking at the API documentation, I don't see a way to do that. Is there a best-practice method for passing variables into a Watson Dialog conversation?
In the Dialog service a variable is part of a profile that you create to store information that users provide during conversations.
The following code shows an example of a profile variable that saves the user's name.
<variables>
  <var_folder name="username">
    <var name="username" type="TEXT" description="The user's name."></var>
  </var_folder>
</variables>
In your scenario you will set this variable by calling:
PUT /v1/dialogs/{dialog_id}/profile
with:
{
  "client_id": 4435,
  "name_values": [
    {
      "name": "username",
      "value": "Bruce Wayne"
    }
  ]
}
Don't forget to replace {dialog_id} and {client_id}.
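For illustration, a minimal Node.js sketch of that call using the built-in https module (the host name and the Basic Auth credentials are placeholders; check both against your own service instance):
// Sketch only: host and credentials below are placeholders for your own
// Dialog service instance.
var https = require('https');

var dialogId = 'YOUR_DIALOG_ID';
var body = JSON.stringify({
  client_id: 4435,
  name_values: [
    { name: 'username', value: 'Bruce Wayne' }
  ]
});

var req = https.request({
  host: 'gateway.watsonplatform.net',
  path: '/dialog/api/v1/dialogs/' + dialogId + '/profile',
  method: 'PUT',
  auth: 'SERVICE_USERNAME:SERVICE_PASSWORD',
  headers: {
    'Content-Type': 'application/json',
    'Content-Length': Buffer.byteLength(body)
  }
}, function (res) {
  console.log('Status:', res.statusCode); // expect 2xx on success
});

req.on('error', function (err) { console.error(err); });
req.write(body);
req.end();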
We have an API Explorer that lets you try out the APIs: Dialog API Explorer.
You can also read more about this in this tutorial.
It should also be noted that if you leave the client_id out, one is allocated for you. You can then pass it in to the start-conversation call to make sure that the profile is picked up. I found this useful where I have welcome messages in which I want to embed profile variables, e.g. a greeting like "Hello" followed by the username variable.

Multiple Menu Items on a card, including pinning

EDIT: After the excellent answer below by Prisoner I'm leaving the question up for my humility and for search posterity, but please note I made a mistake in forming my question. I misunderstood one piece of background documentation: multiple menu items per card ARE supported.
I am trying to place a fixed card in the pinned group (left of the home card) and let the user select it and submit a reply. The idea is that this lets the user submit commands to the web app, which then processes them and sends response cards back to the user.
I've done the research to know I can't set isPinned to true directly from my app; instead, that has to be done by the user via a menuItem. I have that working. For example, this works to let a user pin my card:
{
  "text": "Test pinnable card",
  "menuItems": [
    {
      "action": "TOGGLE_PINNED",
      "values": [{
        "displayName": "Pin Card",
        "iconUrl": "https://<Graphics URL>"
      }]
    }
  ]
}
That is working; it arrives at my Glass just fine and I can pin and unpin it, no problem.
But once I've set that menuItem to allow a user to pin the card, is there a way to let the user reply? According to this entry, there can only be one menuItem per card. That would seem to imply any pinned card can't have other menu items and therefore no reply function (at least, I don't know another way to do replies).
I would very much like to let the user select the card and send voice replies. I can do that in a NON-PINNED card using this menuItem:
"menuItems": [
{
"action": "REPLY",
"values": [
{
"displayName": "Search",
"iconUrl": "https:<Graphics URL>"
}
]
}
So the question is basically whether anybody knows a way either to load both menuItems onto a card, or to somehow add or swap in the second menuItem once a card is pinned. My guess would be that I can't replace the menuItem after pinning, since that could be abused to make cards a user couldn't unpin, but it also seems kind of useless for pinned cards to have no actions.
My apologies if there are "obvious" workarounds; I'm plumb out of ideas.
I have Glass, running my Glassware on App Engine, and can test any theories people have. This seems like a pretty basic need for Glassware that would be used a lot. For those who are curious, I'm working on an enterprise document sorting and data finding application.
A few things.
First, the answer you referenced doesn't say that there can be only one menuItem per card. What it says is there can only be one menu per set of htmlPages, meaning that each card had to have the same menu. HtmlPages are now deprecated in favor of HTML that is split, partly because of the confusion of that question.
Second, you can absolutely have more than one item in the menuItems setting. Hence the plural and use of the array. :)
Third, it looks like you are trying to set "values" for card actions that do not take values (TOGGLE_PINNED and REPLY). Values are only needed for CUSTOM actions.
Fourth, make sure that you have a "creator" set for the REPLY type.
See https://developers.google.com/glass/v1/reference/timeline/insert for details, but in general, what you will need to do is set the menuItems field to an array, with each element in the array having a different action. You will also need the creator field set so there is someone to reply to. So something like this should work (although I haven't tested this specific one):
{
  "text": "Test pinnable card",
  "creator": {
    "id": "harold",
    "displayName": "Harold Penguin",
    "imageUrls": ["https://developers.google.com/glass/images/harold.jpg"]
  },
  "menuItems": [
    {
      "action": "TOGGLE_PINNED"
    },
    {
      "action": "REPLY"
    },
    {
      "action": "CUSTOM",
      "values": [
        {
          "displayName": "Search",
          "iconUrl": "https:"
        }
      ]
    }
  ]
}
Finally, you may wish to reconsider using a pinned card to do this. This method harkens back to a very app-centric way of doing things, which is somewhat counter to how Glass tends to work. If you would like to add voice commands, consider registering contacts that can accept commands. See https://developers.google.com/glass/v1/reference/contacts for more details.
