How to get the actual phrase the user said for a failed utterance? - alexa

The Alexa Analytics dashboard provides a view of how many utterances succeeded and how many failed.
But I can't drill down to see the exact phrase a customer used for a failed utterance.
In addition, Alexa does not provide any feature to programmatically get the exact phrase a customer used while invoking the skill.
So, as a developer, this is a big limitation when trying to improve the skill's ability to recognize various types of input. :-(
Isn't there any way to get the utterance the user used?

"I can't drill down to see the exact phrase a customer used for a failed utterance"
I believe "Failed Utterances" means that the customer invoked an intent and that intent resulted in a failure, not that the utterance couldn't be matched to an intent. If you take a look at the "Intents" tab in Analytics, you should see a corresponding failure on both the "Failed Utterances Per Intent" and the "Failed Intent" graphs. This could help you find what's causing the failure.
"Isn't there any way to get the utterance the user used?"
While you can't, to my knowledge, match the failed intent with a specific phrase (and it may not even be relevant to do so), it might be possible to look at the general phrases spoken by users in Intent Request History. I say might because the "skill must have at least 10 unique users per locale in a day, in order for data to be available for that locale for that day". Even then, it shows only the most frequent utterances spoken, but it could nevertheless be worth taking a look at.

Related

Maximum duration of user speech input in seconds for Amazon Alexa skill

I am creating an Amazon Alexa skill and would like to know the maximum duration of user input in seconds that a slot can hold. More specifically the AMAZON.SearchQuery type slot.
I'm not sure there is an official answer to this anywhere in the public docs, but I don't think you'll be able to capture more than a few seconds (roughly 8 at most?) of input. Also, if Alexa detects that the user is done speaking, she will stop listening and process the utterance. Even a slight pause could be interpreted as the end of speech input.
I don't know your particular use case, but given all that, I would not recommend that slot type as a reliable way to capture long transcriptions. I don't believe there is currently a good way for skills to do that at all.
This Amazon Lex blog is from 2017. I don't know if it will still work, but you can give it a shot.
Capturing Voice Input in a Browser and sending it to Amazon Lex

Logging requests into database

Should I log request info (client IP, request status code, execution time, etc.) from my web app into the database to analyse user behaviour and any errors that arise? And what info should I log for a better experience?
It's often tempting to log lots of information. However, when I come to use it to answer a question, I usually find that the wrong piece of information has been recorded, or only part of it. Or it has been recorded but not stored in a usable way, and it takes further programming to turn the log into meaningful information.
So I would start with the question of what you want to see or find, and log accordingly. Logging capability can generally be expanded later as new issues arise or new insights are required.
Remember, every time you log something you are slowing your application down. You are also using more disk space; no one is going to thank you for buying more disk or running longer backups just because you have logged everything on every action.
I guess I would follow a train of thought a bit like:
1) What are you trying to find? If it's an error you can predict, then why not cater for it in your code to start with? If it's usability, what format does the data need to be in, and at what points should it be recorded?
2) How long do you need it for, be sure to purge the logs after a period to conserve disk space.
3) Every element stored is a performance hit, might be small but for high number of transactions it adds up.
4) Be wary of privacy rules, an IP address may be considered as identifiable data, in which case you need to publish a data privacy policy (see point 2).
5) Consider using a flag to control logging on or off. Then you can use it at times of a known issue but not record everything always when not needed.
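As a rough illustration of points 1, 4 and 5, here is a minimal Python sketch. The flag name LOG_REQUESTS, the logger name and the choice of fields are all assumptions made for the example, not a recommendation for any particular framework. It records only the fields needed to answer "which requests are slow or failing", leaves the client IP out on purpose, and does nothing when the flag is off.

```python
import logging

# Hypothetical flag (point 5): flip to False to stop recording entirely.
LOG_REQUESTS = True

logger = logging.getLogger("request_log")  # logger name is an assumption


def build_request_record(method, path, status, started, finished,
                         enabled=LOG_REQUESTS):
    """Return the fields worth keeping, or None when logging is switched off.

    Only fields that answer a known question are kept (point 1). The client
    IP is deliberately left out because it may count as identifiable data
    (point 4).
    """
    if not enabled:
        return None
    return {
        "method": method,
        "path": path,
        "status": status,
        "duration_ms": round((finished - started) * 1000, 1),
    }


def log_request(method, path, status, started, finished, enabled=LOG_REQUESTS):
    """Write one structured log line per request when logging is enabled."""
    record = build_request_record(method, path, status, started, finished,
                                  enabled)
    if record is not None:
        logger.info("request %s", record)
    return record
```

Keeping the record as a small dictionary with fixed keys makes it easier to load into a table later, which addresses the complaint above about logs that need further programming before they become useful.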

See when my skill reprompts users

Is there a way to see if my skill has reprompted a user? I know this happens if a user doesn't respond within an allotted amount of time and it would be informative because I'd like to rephrase some of my responses if this tends to happen more frequently for some utterances than others.
No. At this point, this isn't possible. You might be able to use a timer in your skill service to guesstimate however.
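One way to sketch the timer idea in Python: store the time of the last response in the session attributes and, on the next request, treat an unusually long gap as a probable reprompt. The attribute name last_response_at and the 8-second threshold are assumptions chosen for illustration; the gap also includes the time Alexa spends speaking the response, so this is only a guess and not a signal Alexa provides.

```python
# Assumed threshold: gaps longer than this are treated as "probably reprompted".
REPROMPT_THRESHOLD_SECONDS = 8.0


def mark_response_sent(session_attributes, now):
    """Return a copy of the session attributes with the send time recorded."""
    updated = dict(session_attributes)
    updated["last_response_at"] = now  # hypothetical attribute name
    return updated


def likely_reprompted(session_attributes, now,
                      threshold=REPROMPT_THRESHOLD_SECONDS):
    """Guess whether the user was reprompted before this request arrived."""
    last = session_attributes.get("last_response_at")
    if last is None:
        # First request of the session: nothing to compare against.
        return False
    return (now - last) > threshold
```

Calling mark_response_sent just before returning each response, and likely_reprompted at the start of each handler, would let you count probable reprompts per intent in your own logs.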

Best implementation of turn-based access on App Engine?

I am trying to implement a 2-player turn-based game with a GAE backend. The first thing this game requires is a very simple match making system that operates like this:
User A asks the backend for a match. The backend tells him to come back later.
User B asks the backend for a match. He will be matched with A.
User C asks the backend for a match. The backend tells him to come back later.
User D asks the backend for a match. He will be matched with C.
and so on...
(edit: my assumption is that if I can figure this one out, most other operations in a turn-based game can use the same implementation)
This can be done quite easily in Apple Gamecenter and Xbox Live, however I would rather implement this on an open and platform independent backend like GAE. After some research, I have found the following options for a GAE implementation:
Use memcache. However, there is no guarantee that memcache is synchronized across different instances. I did some tests and could actually see match requests disappearing due to memcache mis-synchronization.
Harden memcache with sharding counters. This does not always solve the multiple-instance problem and may result in high memcache quota usage.
Use memcache with Compare and Set. This does not solve the multiple-instance problem when used as a mutex.
Task queues. I have no idea how to use these, but someone mentioned them as a possible solution. However, I am afraid that queues will eat my GAE quota very quickly.
Push queues. Same as above.
Transactions. Same as above. Also probably very expensive.
Channels. Same as above. Also probably very expensive.
Given that the match making is a very basic operation in online games, I cannot be the first one encountering this. Hence my questions:
Do you know of any safe mechanism for match making?
If multiple solutions exist, which is the cheapest (in terms of GAE quota usage) solution?
You could accomplish this using a cron task in a scheme like this:
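from google.appengine.ext import db
# One entity per player waiting for a match; opponent stays '' until paired.
class MatchRequest(db.Model):
    requestor = db.StringProperty()
    opponent = db.StringProperty(default='')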
User A asks for a match, a MatchRequest entity is created with A as the requestor and the opponent blank.
User A polls to see when the opponent field has been filled.
User B asks for a match, a MatchRequest entity is created with B as the requestor.
User B polls to see when the opponent field has been filled.
A cron job that runs every 20 seconds or so does the following:
Grab all MatchRequest where opponent == ''
Make all appropriate matches
Put all the MatchRequests as a transaction
Now when A and B poll next, they will see that they have an opponent.
According to the GAE docs on cron, free apps can have up to 20 free cron tasks. The computation required by these cron jobs should be small for a small number of users.
This would be a safe way but I'm not sure if it is the cheapest way. It's also pretty easy to implement.
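The pairing step the cron job performs can be sketched in plain Python. This is only an illustration of the logic: plain dictionaries stand in for MatchRequest entities, the function name pair_pending is made up for the example, and the datastore query and transaction are left out.

```python
def pair_pending(requests):
    """Pair up unmatched requests two at a time, oldest first.

    `requests` is a list of dicts with 'requestor' and 'opponent' keys,
    standing in for MatchRequest entities. Entries are updated in place;
    with an odd number of waiting players, the last one stays unmatched
    until the next run.
    """
    pending = [r for r in requests if r["opponent"] == ""]
    # Walk the waiting list in steps of two: (0, 1), (2, 3), ...
    for first, second in zip(pending[0::2], pending[1::2]):
        first["opponent"] = second["requestor"]
        second["opponent"] = first["requestor"]
    return requests
```

For example, with waiting players A, B and C, one run pairs A with B and leaves C waiting, which matches the flow described in the question. Because only the cron job writes the opponent field, two requests cannot both claim the same opponent.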

What is the best way to do basic View tracking on a web page?

I have a web facing, anonymously accessible, blog directory and blogs and I would like to track the number of views each of the blog posts receives.
I want to keep this as simple as possible; accuracy need only be an approximation. This is not for analytics (we have Google for that), and I don't want to do any log analysis to pull out the stats, as running background tasks in this environment is tricky and I want the numbers to be as fresh as possible.
My current solution is as follows:
A web control that simply records a view in a table for each GET.
Excludes a list of known web crawlers using a regex and UserAgent string
Provides for the exclusion of certain IP Addresses (known spammers)
Provides for locking down some posts (when the spammers come for it)
This actually seems to do a pretty good job, but a couple of things annoy me. The spammers still hit some posts, thereby skewing the views. I still have to manually monitor the views and update my list of "bad" IP addresses.
Does anyone have some better suggestions for me? Anyone know how the views on StackOverflow questions are tracked?
It sounds like your current solution is actually quite good.
We implemented one where the server code which delivered the view content also updated a database table which stored the URL (actually a special ID code for the URL since the URL could change over time) and the view count.
This was actually for a system with user-written posts that others could comment on but it applies equally to the situation where you're the only user creating the posts (if I understand your description correctly).
We had to do the following to minimise (not eliminate, unfortunately) skew.
For logged-in users, each user could only add one view point to a post. EVER. NO exceptions.
For anonymous users, each IP address could only add one view point to a post each month. This was slightly less reliable as IP addresses could be 'shared' (NAT and so on) from our point of view. The reason we relaxed the "EVER" requirement above was for this sharing reason.
The posts themselves were limited to having one view point added per time period (the period started low (say, 10 seconds) and gradually increased (to, say, 5 minutes) so new posts were allowed to accrue views faster, due to their novelty). This took care of most spam-bots, since we found that they tend to attack long after the post has been created.
Removal of a spam comment on a post, or a failed attempt to bypass CAPTCHA (see below), automatically added that IP to the blacklist and reduced the view count for that post.
If a blacklisted IP hadn't tried to leave a comment in N days (configurable), it was removed from the blacklist. This rule, and the previous rule, minimised the manual intervention in maintaining the blacklist, we only had to monitor responses for spam content.
CAPTCHA. This solved a lot of our spam problems, especially since we didn't just rely on OCR-type things (like "what's this word -> 'optionally'"); we actually asked questions (like "what's 2 multiplied by half of 8?") that break the dumb character recognition bots. It won't beat the hordes of cheap labour CAPTCHA breakers (unless their maths is really bad :-) but the improvements over no CAPTCHA were impressive.
Logged-in users weren't subject to CAPTCHA but spam got the account immediately deleted, IP blacklisted and their view subtracted from the post.
I'm ashamed to admit we didn't actually discount the web crawlers (I hope the client isn't reading this :-). To be honest, they're probably only adding a minimal number of view points each month due to our IP address rule (unless they're swarming us with multiple IP addresses).
So basically, I'm suggesting the following as possible improvements. You should, of course, always monitor how they go to see if they're working or not.
CAPTCHA.
Automatic blacklist updates based on user behaviour.
Limiting view count increases from identical IP addresses.
Limiting view count increases to a certain rate.
No scheme you choose will be perfect (e.g., our one month rule) but, as long as all posts are following the same rule set, you still get a good comparative value. As you said, accuracy need only be an approximation.
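The two rate limits suggested above (one view per IP address per post per period, and a minimum gap between counted views on the same post) could look roughly like this in Python. It is an in-memory sketch under stated assumptions: the class name ViewCounter, the 30-day IP window and the fixed 10-second post interval are illustrative, and the growing interval described earlier is simplified to a constant.

```python
class ViewCounter:
    """In-memory sketch of rate-limited view counting (not persistent)."""

    def __init__(self, ip_window=30 * 24 * 3600, post_interval=10):
        self.ip_window = ip_window          # seconds before an IP counts again
        self.post_interval = post_interval  # minimum seconds between counted views
        self.counts = {}                    # post_id -> counted views
        self.last_ip_view = {}              # (post_id, ip) -> time of last counted view
        self.last_post_view = {}            # post_id -> time of last counted view

    def record(self, post_id, ip, now):
        """Count the view if both limits allow it; return True when counted."""
        key = (post_id, ip)
        last_ip = self.last_ip_view.get(key)
        if last_ip is not None and now - last_ip < self.ip_window:
            return False  # this IP already added a view within the window
        last_post = self.last_post_view.get(post_id)
        if last_post is not None and now - last_post < self.post_interval:
            return False  # the post is accruing views too quickly
        self.last_ip_view[key] = now
        self.last_post_view[post_id] = now
        self.counts[post_id] = self.counts.get(post_id, 0) + 1
        return True
```

Since every post goes through the same rules, the resulting numbers stay comparable between posts even though each individual count is only approximate.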
Suggestions:
Move the hit count logic from a user control into a base Page class.
Redesign the exclusions list to be dynamically updatable (i.e. store it in a database or even in an xml file)
Record all hits. On a regular interval, have a cron job run through the new hits and determine whether they are included or excluded. If you do the exclusion for each hit, each user has to wait for the matching logic to take place.
Come up with some algorithm to automatically detect spammers/bots and add them to your blacklist. And/Or subscribe to a 3rd party blacklist.
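The "record all hits, filter later" suggestion might be sketched like this in Python. The hit fields (post_id, ip, user_agent), the bot pattern and the function name are assumptions for the example; a real exclusion list would be loaded from the database or XML file mentioned above.

```python
import re

# Hypothetical crawler pattern; a real list would be maintained separately.
BOT_PATTERN = re.compile(r"bot|crawler|spider", re.IGNORECASE)


def count_valid_hits(hits, blocked_ips, bot_pattern=BOT_PATTERN):
    """Tally raw hits per post, skipping blocked IPs and known crawlers.

    Intended to run on a schedule over the hits recorded since the last
    run, so no visitor waits on the matching logic during a page view.
    """
    counts = {}
    for hit in hits:
        if hit["ip"] in blocked_ips:
            continue
        if bot_pattern.search(hit["user_agent"]):
            continue
        counts[hit["post_id"]] = counts.get(hit["post_id"], 0) + 1
    return counts
```

The trade-off is freshness: view counts lag by up to one job interval, which may or may not be acceptable given the wish for numbers that are as fresh as possible.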

Resources