I wrote a simple Q&A Alexa skill that asks the user to guess a planet's name based on one of its properties.
The questions are like "Which is the brightest planet in the solar system?"
When the user responds with "Venus", Alexa says that the answer is incorrect and that the correct answer is Venus.
I am not sure why it can't recognize the answer.
There are a few places things can be going wrong.
1) Just because the user said it, doesn't mean that's what Alexa heard. Did you confirm in the companion app that Alexa heard the word "venus"? Did you try the simulator and type in Venus? That would get past it parsing what you said.
2) How are you testing the answer? Alexa typically returns things in lower case, since there is no casing in spoken language. Venus is a proper name, so I'm not sure whether it would be returned in upper or lower case. Either way, if you are using a case-sensitive string compare, then you need to make sure the cases match, or else use a case-insensitive string comparison. If you are using JavaScript, tips on doing case-insensitive comparisons are here.
3) How are you recognizing the answer? Do you have a separate intent for "Venus"? Do you have a slot for it? Do you use a LITERAL with multiple utterances for examples? Do you use a custom slot? Each of these will return the results in different ways. The best option is to use a custom slot.
4) Have you checked your log files? What is your code actually receiving from Alexa? If your code doesn't print it, add extra log statements to see what your code is getting, and what you are doing with it.
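For points 2 and 4, here is a minimal sketch of a case-insensitive check plus a log line, in JavaScript. The helper name `isCorrectAnswer` and the sample value are assumptions for illustration; they are not part of any Alexa SDK.

```javascript
// Hypothetical helper: compares what Alexa heard with the expected answer,
// ignoring case and surrounding whitespace.
function isCorrectAnswer(heard, expected) {
  if (typeof heard !== 'string') {
    return false; // slot was empty or missing
  }
  const normalize = (s) => s.trim().toLowerCase();
  return normalize(heard) === normalize(expected);
}

// Log the raw value so you can see exactly what your code received.
const heardValue = 'venus'; // stand-in for the slot value from the request
console.log('Slot value received:', JSON.stringify(heardValue));
console.log('Correct?', isCorrectAnswer(heardValue, 'Venus'));
```

With a check like this, "venus", "Venus" and " VENUS " all count as the same answer.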
You have not given enough information in your question to answer it definitively. Hopefully the above will give you ideas how to work the answer out yourself, or will prompt you to update your question with better information.
I am working on an Alexa skill and am having trouble getting Alexa to understand my voice input. As a result, the utterances are not properly matched with the required slots, and Alexa keeps re-asking or getting stuck.
Here are some examples:
affirm: f.m., a from
Speedbird: Speedboard, speaker, speed but, speed bird, spirit, speedbath
wind: windies (wind is), when is home (wind is calm)
runway 03: runway sarah three
takeoff: the cough
Is there any way to train Alexa to understand me properly? Or should I just add all these "false" utterances as sample utterances so Alexa will match my intents properly?
Thanks for any help!
There is no way to train Alexa's language understanding itself.
Yes, as you wrote: I would just add these false utterances as matches for your intent.
This also seems to be what Amazon recommends:
...might show you that sometimes Alexa misunderstands the word "mocha" as
"milk." To mitigate this issue, you can map an utterance directly to
an Alexa intent to help improve Alexa's understanding within your
skill. ....
two common ways to improve ASR accuracy are to map an intent value or
a slot value to a failing utterance
Maybe have another person try it, to see whether their speech is recognized the same way as yours.
Word-Only Slots
If you're still struggling with this, you should try adding more variations to your slot values (synonyms are an option if you have specific misinterpretations that keep repeating). Consider adding synonyms like speed bird for Speedbird (and take off for takeoff). Non-standard word slots will not resolve as accurately as common words. By breaking Speedbird into two words, Alexa should recognize the slot more successfully. Information about synonyms is here:
https://developer.amazon.com/en-US/docs/alexa/custom-skills/define-synonyms-and-ids-for-slot-type-values-entity-resolution.html#sample-slot-type-definition-and-intentrequest
Once you've done this, you'll want to grab the canonical value of the slot, not the interpreted value (e.g. you want Speedbird not speedboard).
To see an example, scroll to the very last JSON code block. The scenario described in this request is that the user said the word track, which is a synonym for the slot value song, in their request. You'll see the MediaType value is track (what the user said), but if you take a look at the resolutions object, inside the values array, the first value object is the actual slot value song (what you want) associated with the synonym.
This StackOverflow goes a little more into the details on how you get that value:
How do I get the canonical slot value out of an Alexa request
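As a rough sketch of that lookup in JavaScript: the function name `getCanonicalSlotValue` is made up for illustration, and the sample slot is a trimmed-down object shaped like the track/song example described above, not a complete request.

```javascript
// Returns the canonical slot value when entity resolution found a match,
// otherwise falls back to the raw value Alexa heard.
function getCanonicalSlotValue(slot) {
  if (!slot) return undefined;
  const authorities =
    (slot.resolutions && slot.resolutions.resolutionsPerAuthority) || [];
  for (const authority of authorities) {
    const matched =
      authority.status && authority.status.code === 'ER_SUCCESS_MATCH';
    if (matched && authority.values && authority.values.length > 0) {
      return authority.values[0].value.name;
    }
  }
  return slot.value;
}

// Trimmed-down sample slot: the user said "track", a synonym of "song".
const mediaTypeSlot = {
  name: 'MediaType',
  value: 'track',
  resolutions: {
    resolutionsPerAuthority: [
      {
        status: { code: 'ER_SUCCESS_MATCH' },
        values: [{ value: { name: 'song', id: 'SONG' } }],
      },
    ],
  },
};

console.log(getCanonicalSlotValue(mediaTypeSlot)); // song
```

Falling back to `slot.value` means the handler still gets something to work with (or log) when the resolution did not match.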
Word and Number Slots
In the case of the "runway 03" example, consider breaking this into two different slots, e.g. {RunwaySlot : Custom} {Number : AMAZON.NUMBER}. You'll have better luck with these more complex slots. The same is true for an example like "red airplane": you'll want to break it into two slots, {Color : AMAZON.Color} {VehicleSlot : Custom}.
https://developer.amazon.com/en-US/docs/alexa/custom-skills/slot-type-reference.html#number
I have a skill that elicits a U.S. state and county from the user and then retrieves some data. The backend is working fine, but I am concerned about how to structure the conversation. So far, I have created an intent called GetInfoIntent, which has two custom slots, state_name and county_name.
There are about 3,000 U.S. counties with many duplicate names. It seems silly to me that I am asking for a county without first "narrowing down" by state. Another way I can think of to do the conversation is to have 50 intents: GetNewHampshireInfo, GetCaliforniaInfo, etc. If I did it this way, I'd need a custom slot type for each state, like nh_counties, ca_counties, etc.
This must be a pretty generic problem. Is there a standard approach, or best practice, I can use?
My (not necessarily best practice) practice tips:
Single slot for single data type. Meaning only have one slot for a four digit number even if you use it in more than one place for two different things in the skill.
As few intents as you need, no more, no less. You certainly can and should break up the back end code with helper code, but try and not break the intents into too many smaller pieces. It can lead to difficulty when Alexa is trying to choose the intended intent.
Keep it voice focused. How would you ask in a conversation? Voice first development is always the way to go.
For the slot filling I think it is fine to ask both state and county.
If the matching is not correct ask for confirmation.
Another option is to not use auto filling within the Alexa skill and use the dialog interface. Ask the county first and then only when it has more than one state option and is ambiguous continue the dialog to fill the state.
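The county-first idea can be sketched like this in JavaScript. Everything here is an assumption for illustration: the `STATES_BY_COUNTY` table is a tiny, incomplete sample (a real skill would build it from its full county data), and `nextStep` just returns a plain object describing what to do next rather than calling any Alexa API.

```javascript
// Hypothetical lookup: county name -> states that have a county with that name.
// Incomplete sample data, for illustration only.
const STATES_BY_COUNTY = new Map([
  ['clark', ['Kentucky', 'Missouri', 'Nevada']],
  ['coos', ['New Hampshire', 'Oregon']],
  ['los angeles', ['California']],
]);

const normalizeName = (s) => (s || '').trim().toLowerCase();

// Decides the next dialog step once the county (and maybe the state) is known.
function nextStep(countyName, stateName) {
  const states = STATES_BY_COUNTY.get(normalizeName(countyName));
  if (!states) {
    return { action: 'ask_county' }; // unknown county: ask again
  }
  if (states.length === 1) {
    return { action: 'lookup', state: states[0] }; // unambiguous
  }
  const chosen = states.find((s) => normalizeName(s) === normalizeName(stateName));
  if (chosen) {
    return { action: 'lookup', state: chosen };
  }
  return { action: 'ask_state', options: states }; // ambiguous: narrow down
}

console.log(nextStep('Los Angeles'));
console.log(nextStep('Clark'));
console.log(nextStep('Clark', 'missouri'));
```

The state question is only asked when the county name alone is ambiguous, which keeps the common case to a single turn.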
Even if you did have 50 separate intents, you really never want to have two slots that can be filled by the same word. For example, having a mo_counties and a ky_counties where Clark satisfies both is ambiguous and can cause unneeded difficulty.
So for someone looking for the "best practice", I have learned that there isn't one yet (and maybe never will be). Do what makes sense for the conversation and try to keep it as simple as it needs to be, and no less, on the back end.
I also find it helpful to find a non-developer to test your conversation flow.
This wasn't really technical and is all opinion, but that is a lot of what Alexa development is. I would suggest the Tuesday Alexa office hours at https://www.twitch.tv/amazonalexa; they are very helpful and you can ask questions about stuff like this.
I want to write a quiz/interview game where the flow is like this:
"Alexa, start Movie Trivia."
Welcome to Movie Trivia. Do you need to hear the rules?
"No."
What category would you like to play? Comedy, drama, or animation?
"Comedy."
Question 1. In what year was Star Wars released? A, 1970. B, 1977. C, 1980.
"B."
Correct. Your score is 1. Question 2...
I managed to write spaghetti code to accomplish this, with lots of if session.attributes.category, if session.attributes.needsRules, etc stuff, 3 pages of nested if-elsing.
I'm using Node and the official Alexa SDK, so I read its documentation cover to cover, but it's quite confusing and broken in places (examples that haven't worked since June, instructions for old UIs and so on). My question is: what kind of flow is 'correct'/traditional for something like this?
In the code I was writing, I used elicitSlot a lot, which is nice because it lets me listen solely for the things I expect to hear (eg answerType "A", "B", "C"). But elicitSlot leads to you re-triggering the same intent. So would it be a matter of having each intent check if a slot is filled, and if not, speak a question and elicit that slot, and if so, set a session attribute and then forward to a different intent?
That seems sloppy. Maybe the solution is to define an askingRulesState, askingCategoryState, askingQuestionState, etc., each with only a single handler. But states with only a single handler seem... wrong?
If I'm going to ask the user a question like "What category would you like to play?", does that mean I need to create a SetCategoryIntent? And if so, how would I prevent the user from triggering that intent except when I want them to?
I realise this is a bit of a big vague question but it's really difficult for me to boil it down to something smaller and clearer, since my issue is that the flow in general is really disorienting to me. I'd appreciate even the smallest tip!
You might have a look here; this will handle a lot of the if-elses and elicit slots you wrote. For the questions and such, you will indeed probably have to make a state, so you can check whether you are in the question-asking state or just in the set-up state. This will help your skill determine what it has to do. (Don't forget to ALWAYS put this state back in, because Alexa is sometimes tricky if you do not do this.) You can find more info over here. This also looks like a pretty good example of what you are trying to make.
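To give a feel for the state idea without tying it to a particular SDK version, here is a plain JavaScript sketch. The state names, the `handleReply` function and the `session` object (standing in for session attributes) are all made up for illustration; the question text comes from the example flow in the question.

```javascript
// Hypothetical state table: each state handles the user's reply,
// updates the session, and returns what Alexa should say next.
const STATES = {
  ASKING_RULES: (reply, session) => {
    session.state = 'ASKING_CATEGORY';
    const rules = reply === 'yes' ? 'Answer with A, B or C to score. ' : '';
    return rules + 'What category would you like to play? Comedy, drama, or animation?';
  },
  ASKING_CATEGORY: (reply, session) => {
    session.category = reply;
    session.score = 0;
    session.state = 'ASKING_QUESTION';
    return 'Question 1. In what year was Star Wars released? A, 1970. B, 1977. C, 1980.';
  },
  ASKING_QUESTION: (reply, session) => {
    const correct = reply.toLowerCase() === 'b';
    if (correct) session.score += 1;
    return `${correct ? 'Correct' : 'Incorrect'}. Your score is ${session.score}.`;
  },
};

// Dispatches on the current state, so there is no nested if/else chain.
function handleReply(reply, session) {
  const handler = STATES[session.state];
  if (!handler) throw new Error(`Unknown state: ${session.state}`);
  return handler(reply, session);
}

const session = { state: 'ASKING_RULES' };
console.log(handleReply('no', session));
console.log(handleReply('comedy', session));
console.log(handleReply('B', session));
```

Because a reply is only ever interpreted by the handler for the current state, a category answer cannot be processed while a question is pending, which also addresses the worry about users triggering an intent at the wrong time.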
Hope this helps you forward a bit.
I am developing an Alexa skill where I have a slot for names of fruits. However, if I say something like "What is apple's cost", where the slot value has an apostrophe, Alexa does not seem to recognize the apostrophe. A workaround is to say something like "What is the cost of an apple", but that would not be the best customer experience.
How can I make Alexa understand slot value with apostrophes? Any help is appreciated.
I think this is what you are looking for.
Create Intents, Utterances, and Slots (Rules for Sample Utterances)
If the word for a slot value may have apostrophes indicating the
possessive, or any other similar punctuation (such as periods or
hyphens) include those within the brackets defining the slot. Do not
add 's after the closing bracket. For example: ...
My friend, whether the apostrophe gets parsed depends on the voice recognition system internally, but it will never hear an apostrophe as such in real time.
Good news though: you don't need the apostrophe. Think about it: it only recognizes what the customer says, without capital letters and special characters. Meaning, if the customer says "What is apple's cost", Alexa would recognize it as "what is apples cost". This is a problem that should be handled server-side, because you only need to understand what the customer meant. You could implement a server-side string matching function using the Levenshtein distance algorithm.
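A minimal sketch of that kind of server-side matching in JavaScript. The `FRUITS` list, the `closestValue` helper and the `maxDistance` cutoff of 2 are assumptions chosen for illustration; tune them to your own slot values.

```javascript
// Classic dynamic-programming Levenshtein (edit) distance,
// keeping only the previous row to save memory.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // deletion
        curr[j - 1] + 1, // insertion
        prev[j - 1] + cost // substitution
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Returns the closest known value, or null if nothing is within maxDistance.
function closestValue(heard, knownValues, maxDistance = 2) {
  let best = null;
  let bestDistance = Infinity;
  for (const candidate of knownValues) {
    const d = levenshtein(heard.toLowerCase(), candidate.toLowerCase());
    if (d < bestDistance) {
      best = candidate;
      bestDistance = d;
    }
  }
  return bestDistance <= maxDistance ? best : null;
}

const FRUITS = ['apple', 'banana', 'cherry']; // hypothetical slot values
console.log(closestValue('apples', FRUITS)); // apple
```

The cutoff matters: without it, any unrelated word would still be mapped to some fruit.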
Alexa just doesn't understand the word 'postpaid' and I've tried it a million times in my skill. I also tried "Alexa, Simon says postpaid" but it repeats something else other than postpaid, I don't know why. My sample utterance is like this "what is the {type} sales" and the type has custom slot values "postpaid",etc.
I've looked at AMAZON.LITERAL but didn't quite understand whether it will help in my case. So any workaround will be helpful, and thanks in advance.
What does Alexa think you said? Maybe you can use that in your intent also. Your code can check for whatever that is and replace it with "postpaid".
This is a bit of a hack, but may work for you until Amazon provides us with a way to fine tune input.
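Roughly, the hack could look like this in JavaScript. The entries in `CORRECTIONS` are guesses for illustration only; fill the table with whatever your logs show Alexa actually heard.

```javascript
// Hypothetical mishearings mapped to the value the skill expects.
// Replace these entries with what shows up in your own logs.
const CORRECTIONS = new Map([
  ['post paid', 'postpaid'],
  ['postpay', 'postpaid'],
  ['post pay', 'postpaid'],
]);

// Normalizes the heard value and swaps in the expected one when known.
function correctSlotValue(heard) {
  const key = (heard || '').trim().toLowerCase();
  return CORRECTIONS.get(key) || key;
}

console.log(correctSlotValue('Post Paid'));
console.log(correctSlotValue('prepaid'));
```

Values that are not in the table pass through unchanged (lower-cased and trimmed), so correctly heard words are not affected.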
Alexa will not always restrict the transcription of a slot to the given values, especially if you have a large list of possible values. Whether you use a list or AMAZON.LITERAL, your best bet may be to check whether the identified value is in fact one of the values in your list and, if so, use it; otherwise, you can use a phonetic matching/similarity algorithm to select the closest value.
Hit me up if you need example code (in Python in my case)
This feels simplistic but have you tried breaking postpaid into two words?
{type} == "post paid"
Slots can contain multi word utterances. Perhaps Alexa will recognize the two distinct morphemes.