Connect IBM Speech to Text service to IBM Watson Assistant - ibm-watson

I'm using the IBM Speech to Text (STT) service and I want to connect it to IBM Watson Assistant (WA, Plus plan) so users can ask questions by speech instead of text only.
What I want is a microphone icon in the chat window: after clicking it, a user can talk and ask a question.
I searched the documentation for how to connect STT to WA, but the only thing I found is how to connect STT to WA through a voice telephone line.
Any help, please?
Thanks

You can connect the Watson Assistant web chat to both TTS and STT services.
For TTS, the short explanation is to use the receive event that is fired whenever web chat receives a message. You can send the message to your TTS service to speak the desired text.
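For illustration, here is a minimal sketch of that receive handler, assuming the standard web chat onLoad setup; the browser's built-in speechSynthesis stands in for a real TTS service such as Watson Text to Speech:

```typescript
// Inside the web chat onLoad callback.
function onLoad(instance: any) {
  instance.on({
    type: 'receive',
    handler: (event: any) => {
      // Collect the text items from the assistant's response.
      const items = event.data.output.generic || [];
      const text = items
        .filter((item: any) => item.response_type === 'text')
        .map((item: any) => item.text)
        .join(' ');
      if (text) {
        // Swap this for a call to your own TTS backend if preferred.
        window.speechSynthesis.speak(new SpeechSynthesisUtterance(text));
      }
    },
  });
  instance.render();
}
```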
For STT, you'll need to add a button of some sort to the UI. You are a little limited here - you won't be able to put a microphone icon inside the input field, but you can put one directly above the input field using one of the writeableElements (beforeInputElement being the most appropriate). Once the button is clicked, you'll make a call to your STT service. When it returns the appropriate text, you can use the send method to send the text to WA.
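A rough sketch of that flow, with the browser's SpeechRecognition API standing in for a call to your own STT service (the button and its placement are just one way to do it):

```typescript
// Also inside the onLoad callback.
function addMicButton(instance: any) {
  const button = document.createElement('button');
  button.textContent = 'Speak';
  // beforeInputElement renders directly above the input field.
  instance.writeableElements.beforeInputElement.appendChild(button);

  button.addEventListener('click', () => {
    const Recognition =
      (window as any).SpeechRecognition || (window as any).webkitSpeechRecognition;
    const recognition = new Recognition();
    recognition.onresult = (event: any) => {
      const text = event.results[0][0].transcript;
      instance.send({ input: { text } }); // forward the transcript to WA
    };
    recognition.start();
  });
}
```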
We even have a complete tutorial showing you how to get all the pieces working together: https://github.com/watson-developer-cloud/assistant-toolkit/tree/master/integrations/webchat/examples/speech-and-text
And links to the relevant documentation:
https://web-chat.global.assistant.watson.cloud.ibm.com/docs.html?to=api-instance-methods#writeableelements
https://web-chat.global.assistant.watson.cloud.ibm.com/docs.html?to=api-events#receive
https://web-chat.global.assistant.watson.cloud.ibm.com/docs.html?to=api-instance-methods#send

Related

Creating a survey in Amazon Alexa with open-ended questions

I am trying to create a simple survey skill in Amazon Alexa where Alexa asks the user a question and they respond in any manner they feel like (open-ended). For example, if Alexa asks "Do you cook?", a user may respond in many ways such as "Yes I do cook", "My son does the cooking" etc.
The issue I am running into is that the questions can have similar responses, so when I create utterances in the Alexa dev console they overlap (utterance conflicts) and I get redirected to the error handler. (Note: each question has its own intent.)
Is there any other way I can go about creating a survey without using intents?
Can I capture the full user response to a slot?
The reason being I want to store the user's response in a database.
Unfortunately, Alexa Skills aren't designed to do Speech To Text.
When a user talks to a device, the request goes through multiple steps:
Automated Speech Recognition (ASR): it does speech to text internally.
Natural Language Understanding (NLU): using machine learning, it works out what the user wants to do (stop a skill, play music, switch on the light, ...).
Depending on the context, if the NLU understands that the user is trying to respond to your skill (the interaction model matches what the user is saying), it sends a POST request to your skill. But it will not send you the speech-to-text transcript.
Documentation
That said, the AMAZON.SearchQuery slot type will do the job, but you will have to use a carrier phrase such as "My answer is {query}" rather than {query} alone, because otherwise all requests would be redirected to this intent. It won't feel like a good, smooth user experience.
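For reference, a sketch of the handler side using the ASK SDK for Node.js; the intent name AnswerIntent and the slot name query are hypothetical, with "my answer is {query}" as the intent's sample utterance:

```typescript
import * as Alexa from 'ask-sdk-core';

// Hypothetical intent with an AMAZON.SearchQuery slot named "query"
// and the sample utterance "my answer is {query}".
const AnswerIntentHandler: Alexa.RequestHandler = {
  canHandle(handlerInput) {
    return (
      Alexa.getRequestType(handlerInput.requestEnvelope) === 'IntentRequest' &&
      Alexa.getIntentName(handlerInput.requestEnvelope) === 'AnswerIntent'
    );
  },
  handle(handlerInput) {
    // The full free-form answer captured by the AMAZON.SearchQuery slot.
    const answer = Alexa.getSlotValue(handlerInput.requestEnvelope, 'query');
    // Persist the answer to your database here before moving on.
    return handlerInput.responseBuilder
      .speak('Thanks, I saved your answer. Next question...')
      .getResponse();
  },
};
```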

Create a dialog node that allows for document upload in Watson Assistant

I have created a chat-bot using IBM Watson Assistant and I am trying to find a way to allow the end user to upload documents through the Watson API. Has anyone else tried to achieve this before?
The Watson service only takes text and then tries to classify and respond to it. Your application layer will have to either process the document into some form of JSON string, or just collect it and do whatever else you want with it, and then send some kind of indicator to Watson to move the conversation along.
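As a sketch of that application layer, assuming an Express server with multer for the upload; the /upload route, the form field name, and the sendToAssistant helper are all hypothetical:

```typescript
import express from 'express';
import multer from 'multer';

const app = express();
const upload = multer({ dest: 'uploads/' }); // uploaded files land on disk

// Placeholder for however you already call the Watson Assistant API.
async function sendToAssistant(text: string): Promise<unknown> {
  return {}; // e.g. assistant.message({ ..., input: { text } })
}

app.post('/upload', upload.single('document'), async (req, res) => {
  // Do whatever you want with the file: store it, extract text, etc.
  const savedPath = req.file?.path;

  // The indicator text lets a dialog node recognize the upload happened.
  const watsonReply = await sendToAssistant('document_uploaded');
  res.json({ savedPath, watsonReply });
});

app.listen(3000);
```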

How to send user data to developer? Xcode/Swift

I'm creating my first application that requires me to update the app based on user input, and I've been searching for the best way to send that input to me. For example, I have a button that, when pressed, should send me the information the user has added to a text field. Being new to this, I thought it could be done by simply sending the information to a specified email address, but from what I've researched I will need some sort of database. Looking through the Apple Developer documentation, I don't even know which topic to look at to figure this out, so any help or direction would be very helpful!
You need to set up a server (exposing an API) to receive the information.
Usually you will use a webservice to receive the info from the app, although there are other ways to do it.
Sending an email through iOS would require the user to confirm the email before it is sent, so that doesn't look like a good idea.
Take a look at some of the options available for creating webservices (Django REST Framework or Flask). Google's Firebase can also be handy in this situation, since it only involves integrating it with your app and storing the data you want, with easy integration for authentication and user tracking.
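To make the shape of such a webservice concrete, here is a minimal sketch; the answer names Django REST Framework and Flask, but Express is used here purely for illustration, and the /feedback route is hypothetical. The app would POST to it with URLSession:

```typescript
import express from 'express';

const app = express();
app.use(express.json());

app.post('/feedback', (req, res) => {
  const { text } = req.body; // the value from the app's text field
  // Store it in a database (or Firebase) rather than emailing it.
  console.log('received from app:', text);
  res.status(201).json({ ok: true });
});

app.listen(3000);
```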

Employee calls in and gives trip information to be saved in a database

I would like to code something up where my employees can call in and Watson asks them the important questions; they just tell Watson the information, and Watson then outputs it in CSV, XLS, or a similar format, possibly even into a database.
It seems like this should be possible, given the way Watson can converse with people through Messenger, etc.
I know it is probably a three-pronged approach.
Ideas?
@Florentino DeLaguna, in this case you can use the Conversation, Text to Speech, and Speech to Text APIs from IBM Watson. Here are the options you can use:
In theory, you would have to build an application that integrates with an IVR (using Asterisk, for example), convert the speech to text, send that text to the Conversation service, and then transform the Conversation response back into voice and send it to the IVR (see the sketch below for the basic orchestration). In practice, there are some conversational problems, especially on the Speech to Text side. For the return voice you can apply effects using IBM Watson Text to Speech (faster and slower voices, control of pauses, adding expressiveness, ...).
Note: IVR audio is narrowband (8 kHz), and most Speech to Text services only accept broadband (16 kHz).
Note II: your app (like Asterisk) needs to be able to consume a REST API and/or use WebSockets in order to invoke the Watson Speech to Text service.
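A sketch of that orchestration using the current ibm-watson Node SDK (what this answer calls the Conversation service lives on as Watson Assistant); all credentials, URLs, and IDs are placeholders:

```typescript
import SpeechToTextV1 from 'ibm-watson/speech-to-text/v1';
import TextToSpeechV1 from 'ibm-watson/text-to-speech/v1';
import AssistantV2 from 'ibm-watson/assistant/v2';
import { IamAuthenticator } from 'ibm-watson/auth';
import * as fs from 'fs';

const auth = { authenticator: new IamAuthenticator({ apikey: 'YOUR_API_KEY' }) };
const stt = new SpeechToTextV1({ ...auth, serviceUrl: 'YOUR_STT_URL' });
const tts = new TextToSpeechV1({ ...auth, serviceUrl: 'YOUR_TTS_URL' });
const assistant = new AssistantV2({ version: '2021-06-14', ...auth, serviceUrl: 'YOUR_WA_URL' });

async function handleCallAudio(wavPath: string): Promise<NodeJS.ReadableStream> {
  // 1. Speech to Text: transcribe the caller's audio.
  const sttRes = await stt.recognize({
    audio: fs.createReadStream(wavPath),
    contentType: 'audio/wav',
  });
  const text = sttRes.result.results?.[0]?.alternatives?.[0]?.transcript ?? '';

  // 2. Conversation/Assistant: get the dialog response for that text.
  const waRes = await assistant.messageStateless({
    assistantId: 'YOUR_ASSISTANT_ID',
    input: { message_type: 'text', text },
  });
  const reply = (waRes.result.output.generic?.[0] as any)?.text ?? '';

  // 3. Text to Speech: synthesize the reply to play back to the caller.
  const ttsRes = await tts.synthesize({
    text: reply,
    accept: 'audio/wav',
    voice: 'en-US_MichaelV3Voice',
  });
  return ttsRes.result as NodeJS.ReadableStream;
}
```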
Another option is to route a call out of Asterisk to the new IBM Voice Gateway, which is a SIP endpoint that fronts a Watson self-service agent by orchestrating Speech to Text, Text to Speech, and the Watson Conversation service. You can think of IBM Voice Gateway as a standalone, cognitive IVR system. Go here for more details.
Another potential option is to use MRCP. IBM has a services solution that will allow you to reach the Watson STT and TTS engines using MRCP. Not sure if Asterisk supports MRCP but that is typically how traditional IVRs integrate with ASRs.
Important: options 2 and 3 were answered by another person; see the official answer.
See more about these API's:
Speech to Text
Text to Speech
Conversation
Have a look at the Voximal solution: it integrates the SpeechToText cloud APIs (and TextToSpeech) as an Asterisk application through a standard VoiceXML browser.
Everything is integrated in the VoiceXML interpreter: you get the full text result of the transcription and can push it to a chatbot to detect the user's intent and extract dynamic parameters like dates, numbers, cities, and more, for example by using api.ai.
Voximal supports STT from Google, Microsoft, and IBM/Watson (and soon Amazon).
The three APIs listed by Sayuri are embedded in the solution.

Send data from Silverlight application to a particular user in MVC application

I have an MVC application where you have to be registered to log in, so you have your profile page, etc. As part of the application I have a chat for all users (a Silverlight page). You enter the chat from your profile page. There you have a list of online users (those who logged in to the site, not necessarily those who entered the chat page), but you can only exchange messages with users who did enter the chat page; otherwise they won't see them. Here is my problem.
I want to realize the following functionality: if a user who didn't enter the chat page is chosen by another user to start a conversation in the chat (Silverlight), he should get a pop-up message on HIS profile page (MVC) with an invitation to the chat. I understand how to send data from Silverlight to MVC through WebClient and JSON. What I don't understand is how to reach a particular user's profile page. All I can think of is some database table with these invitations and a timer on the profile page to poll for them.
Please express your ideas, thoughts or opinions in this matter. I would appreciate any help. Thanks in advance.
Maybe you should check out SignalR.
As you mentioned, you need some way to signal the other client about the message, rather than checking periodically. SignalR does this for you, and in modern browsers it uses WebSockets to make it even better.
It is very easy to set up, and you can build a chat application in a few lines of code, without the need to save messages in a database (when you don't want to).
Check out this blog post for a sample chat application:
http://www.hanselman.com/blog/AsynchronousScalableWebApplicationsWithRealtimePersistentLongrunningConnectionsWithSignalR.aspx
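For a feel of the pattern, here is a browser-side sketch using the modern @microsoft/signalr client (the blog post covers the original library, but the idea is the same); the hub URL and the method names ReceiveInvite and InviteToChat are placeholders for whatever your server hub defines:

```typescript
import * as signalR from '@microsoft/signalr';

const connection = new signalR.HubConnectionBuilder()
  .withUrl('/chatHub')
  .withAutomaticReconnect()
  .build();

// The profile page listens for chat invitations pushed by the server;
// no polling timer or invitation table required.
connection.on('ReceiveInvite', (fromUser: string) => {
  alert(`${fromUser} invited you to the chat`);
});

async function inviteUser(targetUser: string): Promise<void> {
  // The chat page asks the server to notify the target user's profile page.
  await connection.invoke('InviteToChat', targetUser);
}

connection.start();
```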
