How to start the actual "Speech to text"? - ibm-watson

I am a freelance author and have gathered many hours of interview material that needs to be transcribed.
While browsing the Internet I came across IBM Watson "Speech to text" which should be the ideal solution to handle that huge amount of spoken word.
After registering, I am struggling even to open it, since I have very little programming experience.
Can someone provide an example with steps that I can follow to achieve my task?

Which platform do you want to use the Speech to Text service on?

If you are not a coder, then the best starting point for you will be Node-RED. Take a look at this tutorial that creates a translator - https://developer.ibm.com/tutorials/build-universal-translator-nodered-watson-ai-services/?cm_mmc=IBMDev--Digest--ENews2019-_-email&spMailingID=39408813&spUserID=MzYzODEwODAwNzk4S0&spJobID=1500992192&spReportId=MTUwMDk5MjE5MgS2
It uses Speech to Text, Translation, and Text to Speech. You will only need the Speech to Text bit. Once you get it working with a microphone, you can use the file inject node to push your own audio files through the service.
For larger files you will need to use an HTTP POST with multipart data. When you get to that point, raise a new question, tag it with node-red, and someone will post a sample flow for you.

You do not need any programming knowledge to use Watson Speech to Text. You can just send your files to the service using the curl tool, which is free and easy to install on your computer.
Then you can send a file to the service running the following command:
curl -X POST -u "apikey:{apikey}" --header "Content-Type: audio/flac" --data-binary @audio-file2.flac "https://stream.watsonplatform.net/speech-to-text/api/v1/recognize"
You just need an apikey to run that command, which you can get by following these steps: https://cloud.ibm.com/docs/services/watson?topic=watson-iam
Then just replace the .flac file in that command with the file you want to process, and pass the right value in the Content-Type: header. For FLAC files it is audio/flac; for other audio formats, the list is here: https://cloud.ibm.com/apidocs/speech-to-text
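If you would rather script the upload than type the curl command for every file, here is a minimal Python sketch of the same request using only the standard library. It assumes the same endpoint URL and apikey-based basic authentication as the curl command above; the function names are made up for illustration, and your service instance may use a different URL.

```python
import base64
import urllib.request

# Endpoint copied from the curl command above; your instance may use another URL.
RECOGNIZE_URL = "https://stream.watsonplatform.net/speech-to-text/api/v1/recognize"


def build_recognize_request(api_key, audio_bytes, content_type="audio/flac",
                            url=RECOGNIZE_URL):
    """Build (but do not send) the POST request that the curl command issues."""
    # curl's -u "apikey:{apikey}" is HTTP basic auth with the literal user "apikey".
    token = base64.b64encode(f"apikey:{api_key}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=audio_bytes,  # plays the role of --data-binary @file
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": content_type,
        },
        method="POST",
    )


def transcribe_file(api_key, path, content_type="audio/flac"):
    """Send one audio file and return the service's JSON reply as text."""
    with open(path, "rb") as audio_file:
        request = build_recognize_request(api_key, audio_file.read(), content_type)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling transcribe_file("your-apikey", "audio-file2.flac") would then do what the curl command does; for an MP3 you would pass "audio/mp3" as the content type.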

Related

Terrible Watson transcription

I'm trying to use Watson to transcribe an audio file in Brazilian Portuguese. I made the call to the API and the result returned successfully. But the transcription is beyond terrible. It's absolutely useless, with no word being recognized correctly.
I used the following command:
curl -X POST -u "apikey:<key>" --header "Content-Type: audio/mp3" --data-binary @./file.mp3 "https://api.us-south.speech-to-text.watson.cloud.ibm.com/instances/<code>/v1/recognize?model=pt-BR_BroadbandModel"
The test audio is a 9-minute excerpt of a 90-minute recording. It's an interview between a researcher and a dockworker, recorded with a cell phone. I have uploaded it here for examination: https://drive.google.com/file/d/1Xuibxksudp55uwaz6oSOccTZ3pP7Dya9/view?usp=sharing
Watson's transcription can't really be this bad. What am I missing? Do I have to set some parameter or preprocess the audio first?
I also tried the narrowband model, and FLAC as well.
The IBM Watson API does not seem to be well designed for end users; its design seems overcomplicated for transcription.
I believe it has a bug that their team has not been able to track down.
I would instead advise working with Google and the SpeechRecognition module:
pip install --upgrade SpeechRecognition (Linux, Unix systems)
or C:\path_to_python.exe -m pip install --upgrade SpeechRecognition (Windows)
This one module has built-in support for the different API providers such as IBM, Google, Microsoft, etc. Just use:
import speech_recognition as sr

r = sr.Recognizer()
with sr.AudioFile("path to audio file") as source:
    # r.adjust_for_ambient_noise(source)  # optional, if you have background noise
    audio = r.record(source)  # read the whole file into memory
Then recognize the audio, where xxx is the API provider from a list, say google, ibm, azure or bing (with Microsoft):
t = r.recognize_xxx(audio, credentials, ...)
Read up on the module to be more precise; this is only a rough guide.

How do I know what parameters nl80211 commands require?

My main reference is http://lxr.free-electrons.com/source/include/uapi/linux/nl80211.h
Let's say I want to call NL80211_CMD_TRIGGER_SCAN. The documentation says: "trigger a new scan with the given parameters. NL80211_ATTR_TX_NO_CCK_RATE is used to decide whether to send the probe requests at CCK rate or not."
I am looking at some source I found online, but it does not work and I would
So how do I know what to put into the message?
I am using libnl to communicate with the kernel.
I found some answers online that put a little light on this, but it's still a dark alley to me. Here are some:
Using nl80211.h to scan access points
how to use the libnl library to trigger nl80211 commands?
I ran into the same issues working from a Python perspective. From personal experience, the iw source code sucks. You'd be better off doing
strace -e trace=network -f -x -s 4096 iw ...
I built a simple parser and, by copying and pasting the strace output into it, I was able to figure out which nl80211 command and attributes, along with their values, were being sent, and then see what the response was.
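As a rough illustration of what such a parser might look like (this is not the answerer's parser, which is not shown), the Python sketch below splits one generic netlink message, already converted to raw bytes, into its command number and attribute list. It assumes the standard nlmsghdr / genlmsghdr / nlattr layouts in host byte order; the function name is made up, and mapping the numbers back to NL80211_CMD_* and NL80211_ATTR_* names still means looking them up in nl80211.h.

```python
import struct

NLMSG_HDR = struct.Struct("=IHHII")  # nlmsghdr: len, type, flags, seq, pid
GENL_HDR = struct.Struct("=BBH")     # genlmsghdr: cmd, version, reserved
NLA_HDR = struct.Struct("=HH")       # nlattr: len (header included), type


def parse_genl_message(data):
    """Split one generic netlink message into command and top-level attributes."""
    length, msg_type, flags, seq, pid = NLMSG_HDR.unpack_from(data, 0)
    cmd, version, _reserved = GENL_HDR.unpack_from(data, NLMSG_HDR.size)

    attrs = []
    offset = NLMSG_HDR.size + GENL_HDR.size
    end = min(length, len(data))
    while offset + NLA_HDR.size <= end:
        nla_len, nla_type = NLA_HDR.unpack_from(data, offset)
        if nla_len < NLA_HDR.size:
            break  # malformed attribute, stop rather than loop forever
        payload = data[offset + NLA_HDR.size:offset + nla_len]
        # The top two bits of nla_type are flags (nested / network byte order).
        attrs.append((nla_type & 0x3FFF, payload))
        offset += (nla_len + 3) & ~3  # attributes are padded to 4 bytes

    return {
        "family_id": msg_type,  # the nl80211 family id resolved at runtime
        "flags": flags,
        "seq": seq,
        "pid": pid,
        "cmd": cmd,
        "version": version,
        "attrs": attrs,
    }
```

Nested attributes come back as raw payload bytes here; a fuller parser would run the same attribute loop over those payloads.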

Using Curl To Send Info

I have to use the curl library to send a string to a Morse code translator (http://mattfedder.com/cgi-bin/morse.pl).
Then I have to take back the result and extract the translated code.
My prof didn't explain curl very well at all and I cannot find any clear examples.
I am not by any means asking people to code it; I just need pointers to examples that may help. I apologize if these are blatantly easy to find. I have put time into searching, but nothing seemed relevant.
Curl works with webpages/webservices etc.
It's a library you can use to interact with web apps without writing all the code yourself.
Read this page:
http://curl.haxx.se/libcurl/c/libcurl-tutorial.html
(Could not comment as I don't have 50 rep, sorry.)

Check if file exists on FTPS site using cURL

I am using the cURL app to download multiple csv files. I want to find a way to check if the file exists on the ftps site before kicking off the download. If it doesn't exist I would like to find a way for cURL to check again at regular intervals.
I am trying to stick to cURL commands for this, as I am really not good at .NET programming. Any help would be appreciated.
$ curl ftp://[host]/[path] --ssl --head
(you might also need -k)
--ssl: Try to use SSL/TLS for the connection
--head: When used on an FTP or FILE file, curl displays the file size and last modification time only
It will return an error if the file doesn't exist. It will only check once, so you need to do the repeated checking yourself using a scheduler, cron, a script or similar.
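The repeated checking could be scripted in many ways; here is one minimal Python sketch that shells out to the same curl command and retries at a fixed interval. The function names, the interval and the attempt limit are illustrative assumptions, and it relies on curl exiting with a non-zero status when the file is missing.

```python
import subprocess
import time


def curl_file_exists(url, run=subprocess.run):
    """Return True if `curl --ssl --head <url>` exits with status 0."""
    result = run(
        ["curl", "--silent", "--ssl", "--head", url],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def wait_for_file(url, interval_seconds=60, max_attempts=10,
                  exists=curl_file_exists, sleep=time.sleep):
    """Check for the file up to max_attempts times, pausing between checks."""
    for attempt in range(1, max_attempts + 1):
        if exists(url):
            return True
        if attempt < max_attempts:
            sleep(interval_seconds)  # wait before the next check
    return False
```

Once wait_for_file returns True, the download itself can be started with the usual curl command. The exists and sleep parameters are only there so the loop can be tried out without a real server.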

Connect to a website via HTTP in C

I have some C code that parses a file and generates another file of processed data. I now need to post these files to a website on a web server. I guess there is a way to do an HTTP POST, but I have never done this in C (using GCC on Ubuntu). Does anyone know how to do this? I need a starting point, as I have no clue how to do this in C. I also need to be able to authenticate with the website.
libcurl is probably a good place to start.
I think Hank Gay's suggestion of using a library to handle the details is the best one, but if you want to "do it yourself", you need to open a socket to the web server and then send your data in the HTTP POST format which is described here. Authentication can mean a variety of different things, so you need to be more specific.
Unfortunately, all of the above three jobs involve a fair bit of complexity, so you need to break the question down into stages and come back and ask about each bit separately.
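To make the "do it yourself" route a little more concrete, here is a short Python sketch of what a minimal HTTP POST looks like on the wire; a C program would write the same bytes to its socket. It is only an illustration under simplifying assumptions: plain HTTP without TLS, no authentication, and made-up function names. For real use, libcurl remains the easier option.

```python
import socket


def build_http_post(host, path, body, content_type="application/octet-stream"):
    """Return the raw bytes of a minimal HTTP/1.1 POST request."""
    header_lines = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        f"Content-Type: {content_type}",
        f"Content-Length: {len(body)}",
        "Connection: close",
    ]
    # Header lines end with CRLF; an empty line separates headers from the body.
    head = "\r\n".join(header_lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def send_http_post(host, port, path, body, content_type="application/octet-stream"):
    """Open a TCP socket, send the request, and return the raw response bytes."""
    request = build_http_post(host, path, body, content_type)
    with socket.create_connection((host, port)) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

The response comes back as status line, headers and body in one byte string, so the caller still has to parse it, which is part of the complexity a library handles for you.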
