I see from links like this that I can get Alexa to say arbitrary things in response to a keyword. The problem is that I am limited to 250 characters and have no control over her timing: she just rushes everything together.
Is there a development interface that would allow me to do something like:
Say text1.
Wait 20 seconds.
Say text2.
Wait 20 seconds.
Etc?
I imagine this is the developer interface commonly used for things like meditation or workout apps. I really just need to know what to google so I can find what this interface is called and read the docs. My searches for "Alexa Routines" and "Alexa say things on a timer" return results about the built-in features called "routines" and "timers", not how to develop something like them.
It's called SSML: Speech Synthesis Markup Language
And you can do much more than just waiting.
<speak>
There is a three second pause here <break time="3s"/>
then the speech continues.
<break time="3s"/>
something else
</speak>
However, there is a rule that the total break time cannot exceed 10 seconds:
Break tag silence can't exceed 10 seconds, including scenarios with
consecutive break tags. SSML with more than 10 seconds of silence
isn't rendered to the user.
If you need to wait more than 10 seconds, you can return your own audio file instead.
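For example, you can splice in a hosted audio clip of silence with the SSML audio tag. Below is a minimal sketch using the ASK SDK for Python; the handler shape and the silence-file URL are illustrative, and Alexa requires the clip to be an HTTPS-hosted MP3 (check the audio tag docs for the exact format limits).

# Minimal sketch (ASK SDK for Python): pause longer than the 10-second
# break limit by playing a hosted audio file of silence between prompts.
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.utils import is_request_type

SILENCE_20S = "https://example.com/audio/silence-20s.mp3"  # hypothetical URL

class LaunchRequestHandler(AbstractRequestHandler):
    def can_handle(self, handler_input):
        return is_request_type("LaunchRequest")(handler_input)

    def handle(self, handler_input):
        # The SDK wraps this text in <speak> tags for you.
        ssml = (
            "Begin the first exercise."
            f'<audio src="{SILENCE_20S}"/>'  # 20 seconds of silence
            "Now the second exercise."
            f'<audio src="{SILENCE_20S}"/>'  # 20 more seconds
            "All done."
        )
        return handler_input.response_builder.speak(ssml).response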
I want to create a skill that's a simple game, where first the user launches the skill with its invocation name and then Alexa asks a question, "Shall I roll the dice?" If the user answers "Yes," it rolls the dice, and says the result. Then Alexa asks again, "Shall I roll the dice?" If "Yes," do the same thing. This is the main loop I'm talking about, and it'll continue until the user answers "No" or "Quit" to this question.
I just can't figure out how to add the loop, or where it should go. I've looked at tutorials and videos, and nothing I've found mentions a loop, which I find really odd. But I'm a noob at this.
Any help would be awesome. I've been wanting to do this skill for so long but just am stuck on this loop thing.
I recommend taking some time to understand how a skill works, and then developing a quiz skill from this doc.
You will then have a better understanding of how a request is made to the Alexa service and how a response is returned, the logic behind intents, how slots work, and so on.
An Alexa skill is like a card game. The player can select any card at any time. Each card has its own function and is triggered by a voice utterance.
So when the skill first asks the user Shall I roll the dice?, the user will say either yes or no.
If the user says yes, it will then go to your AMAZON.YesIntent,
If the user says no, it will then go to your AMAZON.NoIntent.
But you also need to make sure that the user can also say:
Stop > AMAZON.StopIntent
Anything else, such as cheese > AMAZON.FallbackIntent
By doing the quiz skill cited above, you will understand how to build your interaction model effectively.
The loop is straightforward. If the user replies yes, then in your intent handler for AMAZON.YesIntent you trigger the same function, which injects the prompt Shall I roll the dice? into the response builder.
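Here is a minimal sketch of that loop using the ASK SDK for Python; handler and variable names are illustrative. The yes handler speaks the result and then asks the same question again, which is what keeps the session, and therefore the loop, alive.

import random
from ask_sdk_core.skill_builder import SkillBuilder
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.utils import is_request_type, is_intent_name

PROMPT = "Shall I roll the dice?"

class LaunchRequestHandler(AbstractRequestHandler):
    def can_handle(self, handler_input):
        return is_request_type("LaunchRequest")(handler_input)

    def handle(self, handler_input):
        # Ask the question and keep the session open for the answer.
        return (handler_input.response_builder
                .speak("Welcome. " + PROMPT)
                .ask(PROMPT)
                .response)

class YesIntentHandler(AbstractRequestHandler):
    def can_handle(self, handler_input):
        return is_intent_name("AMAZON.YesIntent")(handler_input)

    def handle(self, handler_input):
        # The "loop": roll, speak the result, then ask the same question again.
        roll = random.randint(1, 6)
        return (handler_input.response_builder
                .speak(f"You rolled a {roll}. {PROMPT}")
                .ask(PROMPT)
                .response)

class NoIntentHandler(AbstractRequestHandler):
    def can_handle(self, handler_input):
        return is_intent_name("AMAZON.NoIntent")(handler_input)

    def handle(self, handler_input):
        # Ending the session is what breaks the loop.
        return (handler_input.response_builder
                .speak("Goodbye!")
                .set_should_end_session(True)
                .response)

sb = SkillBuilder()
sb.add_request_handler(LaunchRequestHandler())
sb.add_request_handler(YesIntentHandler())
sb.add_request_handler(NoIntentHandler())
handler = sb.lambda_handler()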
Keep in mind that a user can also ask to repeat. Think of a skill as a personal assistant, not a voice mail. There are many other ways to phrase Shall I roll the dice? so it doesn't sound like a robot; try implementing several possible response variations for a great customer experience overall.
I have a skill that elicits a U.S. state and county from the user and then retrieves some data. The backend is working fine, but I am concerned about how to structure the conversation. So far, I have created an intent called GetInfoIntent, which has two custom slots, state_name and county_name.
There are about 3,000 U.S. counties, with many duplicate names. It seems silly to me to ask for a county without first narrowing down by state. Another way I can think of to structure the conversation is to have 50 intents: GetNewHampshireInfo, GetCaliforniaInfo, etc. If I did it this way, I'd need a custom slot type for each state, like nh_counties, ca_counties, etc.
This must be a pretty generic problem. Is there a standard approach, or best practice, I can use?
My (not necessarily best practice) practice tips:
Single slot per data type. Meaning, only have one slot for a four-digit number, even if you use it in more than one place for two different things in the skill.
As few intents as you need, no more, no less. You certainly can and should break up the back-end code with helper functions, but try not to break the intents into too many small pieces; that can make it harder for Alexa to choose the intended intent.
Keep it voice focused: how would you ask in a conversation? Voice-first development is always the way to go.
For the slot filling, I think it is fine to ask for both state and county. If the match is not correct, ask for confirmation.
Another option is to skip auto-filling within the Alexa skill and use the dialog interface: ask for the county first, and only when it matches more than one state and is ambiguous, continue the dialog to fill the state.
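As a sketch of that second option using the ASK SDK for Python (the two backend lookup helpers are hypothetical), you can elicit the state slot only when the county is ambiguous:

from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.utils import is_intent_name
from ask_sdk_model.dialog import ElicitSlotDirective

class GetInfoIntentHandler(AbstractRequestHandler):
    def can_handle(self, handler_input):
        return is_intent_name("GetInfoIntent")(handler_input)

    def handle(self, handler_input):
        slots = handler_input.request_envelope.request.intent.slots
        county = slots["county_name"].value
        state = slots["state_name"].value
        if not state:
            states = lookup_states_for_county(county)  # hypothetical backend helper
            if len(states) > 1:
                # Ambiguous county: keep the dialog open and ask for the state.
                prompt = (f"There is a {county} county in more than one state. "
                          "Which state do you mean?")
                return (handler_input.response_builder
                        .speak(prompt)
                        .ask(prompt)
                        .add_directive(ElicitSlotDirective(slot_to_elicit="state_name"))
                        .response)
            state = states[0]
        return (handler_input.response_builder
                .speak(get_county_info(state, county))  # hypothetical backend helper
                .response)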
Even if you did have 50 separate intents, you really never want two slots that can be filled by the same word. For example, having mo_counties and ky_counties slots that Clark satisfies is ambiguous and can cause unneeded difficulty.
So, for someone looking for the "best practice": I have learned that there isn't one yet (and maybe there never will be). Do what makes sense for the conversation, and keep the back end as simple as it needs to be and no simpler.
I also find it helpful to find a non-developer to test your conversation flow.
This wasn't really technical and is all opinion, but that is a lot of what Alexa development is. I would suggest the Tuesday Alexa office hours at https://www.twitch.tv/amazonalexa; they are very helpful, and you can ask questions about things like this.
Whenever I go inside the skill and say one completely random word, the FallbackIntent is not triggered. The Echo will just emit a sound, and the Alexa simulator shows nothing. But I know for a fact that I am still inside the skill and the session has not ended, since if I say an utterance that is mapped to a certain intent, without including the word Alexa, it responds correctly. BUT, if I say TWO completely random words, the FallbackIntent is triggered. For example (already inside the skill), if I say the word "pizza" it just responds with that weird noise and stays in the current session. But if I say the words "pizza pie" it maps to the FallbackIntent.
I have observed this behavior in a skill that has many custom intents, each with many utterances configured. But when I tried inputting the word "pizza" into a skill with only 3 custom intents, the FallbackIntent worked fine.
If, when you say the out-of-domain word, you get a reprompt and then an end of session, it means that Alexa assigned a very low confidence level to the mapping of that utterance to an intent. And this also applies to the fallback intent!
Every time you build your model, an out-of-domain model for fallback is built in parallel. That model is supposed to catch out-of-domain utterances, but it's not perfect. Only utterances that match the fallback model with high confidence will be routed to the fallback intent. This is by design (for the current version), meaning that not all utterances will trigger fallback even when fallback is the best candidate. So what you're seeing here is an utterance that generates a low confidence for fallback: fallback is the chosen candidate, but the confidence is too low. As fallback gets better, it will become more effective at capturing these cases. A rather awkward workaround (which defeats the purpose of fallback, I guess) would be to extend fallback with one-word sample utterances similar to the ones you're trying. Hope this helps...
Update: FallbackIntent sensitivity tuning was added recently, so if you set it to high in the voice interaction model, it will work as you expected!
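If I remember the interaction model schema correctly, the sensitivity is set in the skill's JSON editor with a fragment roughly like this (treat the exact path as an assumption and check the current docs):

{
  "interactionModel": {
    "languageModel": {
      "modelConfiguration": {
        "fallbackIntentSensitivity": {
          "level": "HIGH"
        }
      }
    }
  }
}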
I think I just found the solution...
If any intent has single-word sample utterances, Alexa (by mistake, obviously) tends to match a single-word user utterance to one of those intents.
From what I can tell, Alexa uses the terms and the term count to calculate the statistical match with the user's utterance... wow!
Hope it helps you guys!
Let's say I have a skill with 2 custom intents, 'FirstIntent' and 'SecondIntent'. SecondIntent also has a required slot, 'reqSlot'.
Now I would like to sequence my intents. After my skill sends the FirstIntent response, I would like Alexa to send a request with SecondIntent and a directive to elicit reqSlot, without the user invoking it.
The docs say here, about the parameter 'updatedIntent':
"Note that you cannot change intents when returning a Dialog directive, so the intent name and set of slots must match the intent sent to your skill."
Is this generally possible, or did anyone figure out a workaround for this scenario?
Thanks :)
There are ways to handle this.
You can try:
When you send your first response, it must set the shouldEndSession flag to false.
The end of your first response's output speech should lead the user into invoking the second intent. For example: 'Say telephone number followed by your number'.
This way the user doesn't need to explicitly invoke your skill to get to the next intent.
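A minimal sketch of that first response using the ASK SDK for Python (handler names are illustrative; .ask() sets a reprompt and leaves shouldEndSession false):

from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.utils import is_intent_name

class FirstIntentHandler(AbstractRequestHandler):
    def can_handle(self, handler_input):
        return is_intent_name("FirstIntent")(handler_input)

    def handle(self, handler_input):
        # Lead the user toward the utterance that maps to SecondIntent,
        # and keep the session open so they don't have to re-invoke the skill.
        speech = "Okay. Say telephone number followed by your number."
        return (handler_input.response_builder
                .speak(speech)
                .ask(speech)  # reprompt; keeps shouldEndSession false
                .response)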
It is not currently possible to cause Alexa to start speaking without a user first having spoken to it.
So for example, I cannot create a skill that will announce to my wife that "Ron is on his way home" whenever I leave work.
I can however, create a skill that my wife can ask "Is Ron on his way home", and it can respond with the answer.
Also, the new notifications feature allows a skill to post a notification, but this just causes the Alexa device to light up its ring to indicate that a notification is waiting. A user must still ask Alexa to read it. In the example I cite above, that might be OK.
A lot of us would love for Alexa to be able to spontaneously start talking, but Amazon has not enabled that. Can you just imagine the opportunity for advertising abuse that functionality might enable? Nothing like sitting down watching TV and having Alexa start talking, "Hey, wouldn't some Wonder Popcorn taste great about now? We're having a sale..."
This isn't really a programming question, more of an ideas question. Bear with me.
My sister gave me a well-used Nokia N95. I don't really need it, but I wanted to do some programming for it. It supports a few languages, of which I can do Python.
My question is this: what should I do with it? If I think about it, it has a lot to offer: I can program the GPS, motion sensor, wireless internet, and sound and image capture; it has a lot of storage, it plays sound and video, and so on.
The combinations seem limitless. The way I see it, it is a device that is easily always on me, has access to a huge data repository (the internet, and my personal data in it), and can be aware of whether I'm sitting at home, at work, or moving about somewhere. It could basically read my Google Calendar to check if I should be somewhere I'm not, and perhaps give me the bus schedule to get where I should be. It could check if it's close to my home, and therefore to my home PC's Bluetooth/WiFi, and maybe grab my recent work documents from my desktop computer, along with the latest Daily Show, for the bus journey to work. It could check my library account to see if any of my books are due and remind me to take them with me in the morning, or set up an alarm clock based on what shift I have marked in my Google Calendar.
Basically, I have a device that can analyze my movements in time (calendars with my data, etc.) and space (GPS, carrier cell IDs). By proxy, it could identify contextual situations: I can store my local grocery store's GPS coordinates or cell mast IDs, and it could remind me to bring coffee.
Like I said, the possibilities seem limitless, and therefore baffling. Does anyone else have these pseudofantastical yearnings to program something like this? Or any similar ideas? How could this kind of device integrate into -- and help -- your life?
I'm hoping we could do some brainstorming.
"Gotta Leave" - A reminder that figures out the bus time, how far you are from a stop on your bus and shows a countdown till you "Could" leave (green), "Should" Leave (yellow), "Must" leave (orange), and "Gotta Run to get there" (red).
As input it needs the bus number you want to ride. You turn it on, it finds you, finds your closest few bus stops, estimates your walking speed at 2 mph, and calculates when you need to leave where you are to get to the bus with 5 minutes of waiting or less.
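A back-of-the-envelope sketch of that calculation in Python; the 2 mph speed and 5-minute buffer are the assumptions above, and the 2-minute "should" slack is my own:

from datetime import datetime, timedelta

WALK_SPEED_MPH = 2.0                 # assumed average walking speed
WAIT_BUFFER = timedelta(minutes=5)   # at most 5 minutes waiting at the stop
SLACK = timedelta(minutes=2)         # assumed gap between "should" and "must"

def leave_thresholds(bus_time: datetime, distance_miles: float) -> dict:
    """Compute the could/should/must-leave times for one stop."""
    walk = timedelta(hours=distance_miles / WALK_SPEED_MPH)
    must = bus_time - walk           # leave later than this and you must run
    return {
        "could": must - WAIT_BUFFER, # green: you'd wait 5 minutes at most
        "should": must - SLACK,      # yellow: a little slack left
        "must": must,                # orange: leave right now (red after this)
    }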
You should just pick any one and implement it.
It doesn't matter where you start, more that you actually do start. Don't concentrate on the destination, take a step and see what the journey holds.
Do it for a laugh to start, and your expectations will be set right both for when you do find your killer app and for when you don't.
"Phone home" - an interface to report home if you send a message to your phone that it is lost / stolen. Must be a silent operation from the phone holder's perspective
Options:
Self destruct mode to save your data from prying eyes
Keep calling with its location every 10 minutes until an unlock message is sent indicating the phone has been found.
This is the same problem I face with Android (albeit Java instead of Python). The potential is paralyzing :)
I'd recommend checking out what libraries have already been written for doing cool stuff on that phone, and then building off of them. That approach provides inspiration, direction, and a good head start. For instance, on the Android side, I'm fooling around with "zxing", a library that lets you read barcodes via the cellphone's camera. That's its own sub-universe of possibilities, but at least it gives me a direction to go: "do cool things with information about products physically nearby".
"Late for Work" - Determines if you are not at work, buzzes you with a reminder and preps the phone to call into the sick line. Could be used if you are going to be late as well.
Inputs: your sick-line number, the time you should be at work, where your home is, and where your work is.
Optional:
Send a text message
Post to an online in/out board
If you are still at home, sound an alarm
If you are still at home, call in sick; if you are not at home, send an "I'm going to be late" message
Comedy Option:
If you don't respond to ten alarms, dial 911
To add on to what others have said, come up with some kind of office GPS (via WiFi maybe? Does it have WiFi?) that tells you when you need to go to a meeting.