What is '<eps>' in Watson speech to text output? - ibm-watson

Our Watson speech to text output includes many instances of the token <eps>. I cannot find any information in the service documentation describing what this is. What is it?

Related

How do I modify speech contexts in Dialogflow fulfillment

An issue I am seeing is that when I ask in Dialogflow for the user to spell out their user ID, like joesmith2014, there are a large number of errors. The following post suggests that I can fix this by using a speech context to tell the speech-to-text engine that the user will be spelling out alphanumerics.
https://stackoverflow.com/questions/62048288/dialogflow-regex-alphanumeric-speech
I can't figure out how you would do this while using the actions-on-google library, or can this not be done in the fulfillment webhook?
Thanks.
As an example, I created an entity called "alphanumeric" because it will accept any alphanumeric value I send, following the next steps:
Check the Regexp entity box
Add a single entry, ^[a-zA-Z0-9]{3}[a-zA-Z0-9]*$
Then save it
Your agent should look something like this:
Please note that the regexp entity I added is strict in that it is looking only for a string of alphanumerics, without any spaces or dashes. This is important for two reasons:
This regexp follows the auto speech adaptation requirements for enabling the "spelled-out sequence" recognizer mode.
By not looking for spaces and only looking for entire phrases (^...$), you allow end-users to easily exit the sequence recognition. For example, when you prompt "what's your order number" and an end-user replies "no I want to place an order", the regexp will reject and Dialogflow will know to look for another intent that might match that phrase.
If you are only interested in numeric values, you can create a more tailored entity like [0-9]{3}[0-9]*, or even just use the built-in #sys.number-sequence entity.
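As a quick sanity check outside Dialogflow, the same pattern can be exercised directly. A Python sketch (the sample utterances are hypothetical):

```python
import re

# Mirrors the regexp entity above: full-string alphanumerics,
# at least three characters, no spaces or dashes.
ORDER_ID = re.compile(r"^[a-zA-Z0-9]{3}[a-zA-Z0-9]*$")

print(bool(ORDER_ID.match("joesmith2014")))                 # spelled-out ID: accepted
print(bool(ORDER_ID.match("no I want to place an order")))  # contains spaces: rejected
```

Because the pattern is anchored with `^...$`, a conversational reply falls through to normal intent matching instead of being forced into the sequence entity.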
Dialogflow ES fulfillment cannot affect speech recognition quality, because speech-to-text processing happens before the request is sent to Dialogflow fulfillment. Check the diagram in the Dialogflow ES Basics documentation.
You can improve speech recognition quality either by enabling auto speech adaptation in the agent settings or by sending speech contexts in the Dialogflow API requests. Note that speech contexts sent via API override implicit speech context hints generated by auto speech adaptation.
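For the API route, the speech context rides along in the audio config of the detectIntent request. A minimal sketch of the relevant JSON body (field names follow the Dialogflow ES v2 REST API's InputAudioConfig as I understand it; the class token and boost value are illustrative, so verify against the current reference):

```python
import json

# Sketch of a Dialogflow ES v2 detectIntent request body carrying an
# explicit speech context. The phrase list and boost are illustrative.
detect_intent_body = {
    "queryInput": {
        "audioConfig": {
            "audioEncoding": "AUDIO_ENCODING_LINEAR_16",
            "sampleRateHertz": 16000,
            "languageCode": "en-US",
            # Explicit speech contexts override the implicit hints
            # generated by auto speech adaptation.
            "speechContexts": [
                {"phrases": ["$OOV_CLASS_ALPHANUMERIC_SEQUENCE"], "boost": 15.0}
            ],
        }
    }
}

print(json.dumps(detect_intent_body, indent=2))
```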
If you use regexp entities, make sure that your agent design meets all the requirements listed in this speech adaptation with regexp entities document. Here is an example of what an intent that collects an employee ID and satisfies these requirements may look like:
When testing the agent, make sure that you test it consistently via voice, including the inputs preceding the utterance expected to match a regexp entity.
This tutorial for iterative confirmation of spoken sequences may also help with the agent design.

IBM Watson, how to input data of entire books

I'm using the IBM Watson Analytics trial; it says it only takes data as CSV, Excel, and a few others. How can I convert books or bodies of text into an acceptable format? Thank you.
It seems like the architecture of WCA (Watson Content Analytics) does not support PDF itself. Please refer to the following images from IBM Link
I think it would be better to convert the PDF to text with a converter such as CONVERTER and push it into a database or other store.
Then, you can crawl the text data from it.
FYI, the document has to have a KEY column (i.e. the name of the book).
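To illustrate the KEY-column requirement, a minimal sketch in Python (the book titles and text are placeholder values, and the exact column names your import expects may differ):

```python
import csv

# After converting each book's PDF to plain text, write a CSV with a
# KEY column (the book's name) so the import can identify each document.
rows = [
    {"KEY": "Moby Dick", "TEXT": "Call me Ishmael. ..."},
    {"KEY": "Dracula",   "TEXT": "3 May. Bistritz. ..."},
]

with open("books.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["KEY", "TEXT"])
    writer.writeheader()
    writer.writerows(rows)
```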
Even if you do convert your book into an acceptable text format (.csv, .xls, .xlsx, .sav), Watson Analytics isn't optimized for text analytics. It sounds like Watson Explorer is the offering that'd best suit your needs.
Hope this helps.
Even though CSV or XLS is an acceptable file format, datasets need to be in a specific structure: you need headers for all the tables, with the data following them. I am not sure how the data of a book can fit into that format.
I have recently published this blog post on how to structure and refine data before importing into Watson Analytics to get the best results.
For your specific requirement, you can look into Watson Explorer as suggested by Brennan above, or even better you can learn to use IBM Content Analytics here.

What is the data format required for input in the IBM-Watson cloud product?

I'm having a hard time figuring out what type of data Watson accepts: RDF triples, relational, delimited text, etc.
There's really no documentation anywhere.
Does anyone know?
Watson currently eats unstructured English prose in HTML, Word documents, plain text, and certain formats of PDF.
Some API documentation can be found here: https://www.ibmdw.net/watson/wp-content/uploads/sites/19/2013/11/An-Ecosystem-Of-Innovation-Creating-Cognitive-Applications-PoweredByWatson.pdf
You can also get a bit more if you go to the bottom of the mobile developer challenge page that's here: ibmwatson.com (see 'Helpful Hints about Watson')
If there's other documentation you're looking for, specific feedback would be helpful so it can be passed on.

Mirror API - Accessing the actual speech binary rather than translated text

Is it possible to access the actual speech recording, rather than the transcribed speech text, from the Mirror API?
This is not currently available in the Mirror API.
If this is something you need, file an enhancement request and include a description of what you plan to use it for.

silverlight text to speech?

Is any Silverlight text-to-speech engine available now? I am looking for a very simple text-to-speech engine which only needs to read out numbers.
I don't want to rely on any web service. In the worst case I will record some voices for the numbers and stitch them together.
Any pointers are highly appreciated. My application need not work on Mac or Linux.
There is another option, which doesn't involve ActiveX or Silverlight 4 COM interop. You simply have your Silverlight application send the text to a WCF service which will convert the text to a WAV stream and then decode the stream returned by the service and place it in a MediaStreamSource for playback in Silverlight. I wrote a blog post on it and it includes sample code.
http://www.brianlagunas.com/index.php/2010/03/07/text-to-speech-in-silverlight-using-wcf
Converting text to speech using the Speech SDK consists of a few simple steps. The following code shows the important pieces in performing text to speech.
dynamic textToSpeech = AutomationFactory.CreateObject("Sapi.SpVoice"); // SAPI via COM; requires an elevated-trust Silverlight 4 app
textToSpeech.Volume = book.Volume;       // 0 to 100
textToSpeech.Rate = book.SpeekSpeed;     // -10 (slowest) to 10 (fastest)
textToSpeech.Voice = book.speeker;       // an installed SAPI voice token
textToSpeech.Speak(book.Content);        // speaks the given string
SpVoice is the class used for text-to-speech conversion. Its Speak method takes in the string that needs to be spoken.
code sample: http://funducodes.blogspot.com/p/silver-light.html
You will probably have to build your own for a truly cross-compatible application.
Silverlight 3: use ActiveX to call the Microsoft Speech SDK (not recommended at all).
Silverlight 4: use COM interop to call the Microsoft Speech SDK.
Both of these work only on Windows.
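The "record voices for numbers and stitch them together" fallback mentioned in the question is straightforward as long as every clip shares the same sample rate and format. A sketch in Python for brevity (the same concatenation logic would port to a WCF service feeding Silverlight, as in the answer above; the digit clips here are hypothetical silent stand-ins):

```python
import io
import wave

def tone_clip(n_frames, framerate=8000):
    """Hypothetical stand-in for a recorded digit clip: mono, 16-bit silence."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(framerate)
        w.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()

def stitch_wavs(clips):
    """Concatenate WAV clips (as bytes) that share one format into one WAV."""
    out = io.BytesIO()
    with wave.open(out, "wb") as writer:
        for i, clip in enumerate(clips):
            with wave.open(io.BytesIO(clip), "rb") as reader:
                if i == 0:
                    writer.setparams(reader.getparams())  # copy format from first clip
                writer.writeframes(reader.readframes(reader.getnframes()))
    return out.getvalue()

# e.g. speak "42" by concatenating pre-recorded "4" and "2" clips
digit_clips = {"4": tone_clip(800), "2": tone_clip(600)}
spoken = stitch_wavs([digit_clips[d] for d in "42"])
```

Pre-recorded clips avoid the rendering-engine quality problem entirely, at the cost of only handling a fixed vocabulary such as digits.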
Of course, with all these suggestions, the underlying flaw is in the speech rendering engine itself: every one of these samples results in a nasty clicking at the start of the speech. I'm thinking this is garbage collection on the stream.
Would be nice to finally have something cross platform that can create realistic speech.
I am not holding my breath.