TTS concatenation based on user input - vxml

Greetings, Stack Overflow community,
Is it possible to take what a user says or enters (like the digits 1-9) and, instead of having the text-to-speech engine read the numbers back to the user, play a prerecorded audio clip so it sounds like our voiceover person instead of the robot?
Can you do this dynamically based on what the user inputs?
All I'm really asking for is a prod in the right direction on how to start figuring this out.

You can. I've written logic, a long time ago, that takes the desired phrase and a list of available clips to find the largest segments (clips often had multiple phrases) that could be used to assemble the audio. It tends to sound very choppy, but it is possible if you have enough prerecorded audio. In my case the content was in a niche and could be accomplished with 95% coverage with only a couple thousand recordings.
In the end, it was just basic search logic to find clips. If you do this at the word level, you could name each clip after its word, split the input into words, and generate one audio tag per word:
<audio src='the.wav'/><audio src='quick.wav'/><audio src='brown.wav'/><audio src='fox.wav'/>...
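A minimal sketch of that search logic, in Python for brevity. It assumes a hypothetical `clips` dictionary that maps a recorded phrase to its file name, tries the longest available clip first, and leaves any word without a recording as plain text for the TTS engine to speak. The function name and the fallback behaviour are illustrative assumptions, not part of any VoiceXML platform.

```python
from xml.sax.saxutils import escape, quoteattr


def build_prompt(phrase, clips):
    """Return prompt markup for `phrase`, preferring the longest recorded clips.

    `clips` is a hypothetical mapping such as {"the quick": "the_quick.wav"}.
    """
    words = phrase.lower().split()
    # Longest clip, measured in words, so we never search beyond it.
    longest = max((len(key.split()) for key in clips), default=0)
    parts = []
    i = 0
    while i < len(words):
        # Try the longest candidate segment first, then shrink.
        for n in range(min(longest, len(words) - i), 0, -1):
            key = " ".join(words[i:i + n])
            if key in clips:
                parts.append(f"<audio src={quoteattr(clips[key])}/>")
                i += n
                break
        else:
            # No recording covers this word: leave it as text for TTS.
            parts.append(escape(words[i]))
            i += 1
    return " ".join(parts)


if __name__ == "__main__":
    demo = {"the quick": "the_quick.wav", "the": "the.wav", "fox": "fox.wav"}
    print(build_prompt("The quick brown fox", demo))
```

Matching the longest segment first reduces the number of joins between clips, which is where the choppiness tends to come from.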


Realtime voice processing in discord?

G'day guys,
My friends and I want to play bingo based on what another friend of ours says. Basically, every time we play together he rages and says some hilarious stuff.
The initial plan was to write the words down in a 3x3 grid. When he says something you wrote down, you cross it off. You win once you have crossed off a full line of words diagonally, vertically, or horizontally.
But as a programmer I do not want to use pen and paper (it's boring). So why not write a game for it?
So I wonder: is there any known API for listening to Discord audio channels? There are bots that record the audio for podcasts. How do they access the audio? Do you have any ideas on how to approach this?

Processing and Kinect: Change speed of video according to distance from the sensor

I'm asking a question for a school project. I am trying to play a video in a loop and change its speed according to the distance from the Kinect 1414. When you are far the video plays at normal speed, which then increases as you get closer.
I tried several approaches, but sometimes the video doesn't loop, and sometimes it doesn't show at all and I can only hear the audio. Other solutions don't let me change the speed of the video. Do you know of any way to have a person's distance affect the video?
Thanks.
Stack Overflow isn't designed for general "how do I do this" type questions like this. It's for specific "I tried X, expected Y, but got Z instead" type questions. But I'll try to help in a general sense.
You need to break your problem down into smaller pieces and then take on those pieces one at a time. For example, can you start with a very basic sketch that just displays the distance from the Kinect? Don't worry about the video yet, just display the distance on the screen.
Separately from that, can you create another sketch that just shows a video playing in a loop? Work your way up from there: can you make it so its speed is based on a hard-coded value? How about on a value like mouseX?
Get those two basic sketches working perfectly by themselves before you start thinking about combining them. Then, if you get stuck on one of those steps, you can post an MCVE along with a specific question (in a new question post), and we'll go from there. Good luck.
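Once both sketches work, the piece that connects them is a small mapping from distance to playback speed. Here is a rough sketch of that mapping, written in Python only to keep it short. The near and far distances and the speed range are made-up values you would tune for your setup. In Processing the same arithmetic can be done with `map()` and `constrain()`.

```python
def speed_for_distance(distance_mm, near=800.0, far=4000.0,
                       min_speed=1.0, max_speed=4.0):
    """Map a sensor distance to a playback speed.

    All default values are assumptions for illustration:
    at `far` or beyond the video plays at `min_speed` (normal speed),
    at `near` or closer it plays at `max_speed`.
    """
    # Clamp so readings outside the expected range do not overshoot.
    clamped = max(near, min(far, distance_mm))
    # t is 0.0 when far away and 1.0 when close to the sensor.
    t = (far - clamped) / (far - near)
    return min_speed + t * (max_speed - min_speed)


if __name__ == "__main__":
    for d in (5000, 4000, 2400, 800, 300):
        print(d, speed_for_distance(d))
```

Test it first with a hard-coded distance or a value like mouseX, as suggested above, before feeding it real Kinect readings.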

Trying to create a Dice HUD on Second Life that pulls information for different rolls from a notecard

So I'm working on something that's going to be a rather large undertaking. I've figured out how to do a "bare-bones" kind of dice hud that just rolls a basic 2-20. However now I need to go to the next step.
I want to make a roleplaying system dice HUD for my sim. When you click the HUD, you should get a menu that lists all the skills in my system. When you click a skill, it references a notecard in the HUD to do some minor math before displaying the result. For example:
There's a normal 2d6, 2d8, 2d10, 2d12, 2d16, 2d20 ((Whatever basic configuration that always rolls a standard die))
Though I want it to look into a note card to add in a character's "STATS" and "SKILL LEVEL"
So say they want to hit someone with a sword?
I want the Hud to generate a random value between 2 and 12, then add in the character's Strength, speed, perception stats as well as their sword skill level.
If I could see the basics of HOW to start this I can then move forward from there.
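You cannot write into a notecard using an LSL (or any other) script.
If you want to roll a die, simply use llFrand, which returns a float from 0.0 up to (but not including) its argument. For example, integer x = (integer)llFrand(20.0) + 1; gives a value from 1 to 20.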
You could use a webserver to save information like that. Just google free web space, you're gonna find a lot. SL's HTTP communication is, let's say, ok.
This is kind of a big project. If you don't know where/how to start you should hire a professional. Just look for groups in-world. You're going to find a lot of people willing to help you :)
In LSL you cannot write in a notecard, however if you try OpenSim you can use that function:
osMakeNotecard(string notecardName, list contents);
But that is only available in OpenSim, see osMakeNotecard.
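To show the shape of the roll described in the question, here is a language-neutral sketch in Python rather than LSL. It assumes a hypothetical notecard layout of one `name = value` pair per line. In LSL the lines would arrive one at a time through llGetNotecardLine and the dataserver event, but the arithmetic is the same. All function names and stat names are illustrative.

```python
import random


def parse_notecard(text):
    """Parse 'name = value' lines into a dict; blank lines and # comments are skipped."""
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        stats[name.strip().lower()] = int(value)
    return stats


def roll(stats, modifiers, dice=2, sides=6, rng=random):
    """Roll `dice` dice with `sides` sides and add the named stats and skills."""
    base = sum(rng.randint(1, sides) for _ in range(dice))
    bonus = sum(stats.get(name, 0) for name in modifiers)
    return base + bonus


if __name__ == "__main__":
    card = "# character sheet\nstrength = 3\nspeed = 2\nperception = 1\nsword = 4"
    sheet = parse_notecard(card)
    print(roll(sheet, ["strength", "speed", "perception", "sword"]))
```

With 2d6 the base roll is between 2 and 12, so this example character's sword attack lands between 12 and 22.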

How can I store a video with proper indexing

How can I store a video (either in a database or on the file system) so that, instead of streaming from the beginning, I can start streaming from any fixed index?
The main aim: I have a long video of the roads of New York, from one end to the other, and a corresponding map of New York saved on a central server. A user opens the website and selects two points on the map, and the video of the road between those two points starts streaming, not from the beginning but from the first point to the second point chosen by the user.
So the main requirement is to store a video with its indexes such that I can start streaming from any index.
Edit:
Actually, I am planning how to store video of a complete city so I can show it to the user whenever they select it on the map.
So the main question in my mind now is whether I can merge the video for all roads into one video, like several linked lists (roads). For example, if there are two turns at a particular point, then instead of storing two videos from that point for the different paths, can I store them in a single video, so that which part is played depends on the start and end points selected by the user and the shortest path between them? In short, can I store video of all roads as a single video?
How can I do this? Does it depend on the streaming mechanism or on the storage?
Thanks,
GG
I guess that this all depends on the capability of your playing/streaming mechanism. I would find out about these before determining how to store the file and/or "index" points. Ask some specific questions about your streaming technology, such as:
can you fast forward to a certain point?
can you stop at a certain point?
can you play one stream after one ends?
are there any other playback capabilities that may help solve this?
If you can trigger the playing of your video to fast forward to a certain point, you can store the amount of time or frames to fast forward from the beginning and associate these with your map start. You would also need to "abort" the stream at a certain point, that matches your map end point.
However, if you cannot fast forward your stream, you may need to break your video file into smaller segments and start at the proper one based on the map point selected. You would then need to play multiple files until you reach the end point.
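Both options can be sketched in a few lines of Python. This assumes a hypothetical index that maps each map point to its time offset (in seconds) in the single long video, and, for the second option, that the video has been cut into fixed-length segment files numbered from 0. The names and the fixed segment length are assumptions for illustration only.

```python
import math


def clip_range(index, start_point, end_point):
    """Option 1: return (seek_to, stop_at) offsets in seconds for a seekable stream."""
    start, end = index[start_point], index[end_point]
    if start > end:
        raise ValueError("end point comes before start point in the video")
    return start, end


def segments_for(start, end, segment_seconds):
    """Option 2: return the numbers of the fixed-length segments covering start..end."""
    first = int(start // segment_seconds)
    last = math.ceil(end / segment_seconds) - 1
    return list(range(first, max(first, last) + 1))


if __name__ == "__main__":
    # Hypothetical index: map point -> offset into the video, in seconds.
    index = {"A": 0.0, "B": 95.0, "C": 260.0}
    seek_to, stop_at = clip_range(index, "B", "C")
    print(seek_to, stop_at)
    print(segments_for(seek_to, stop_at, 60))
```

With segments, playback starts at the beginning of the first segment rather than exactly at the selected point, so shorter segments give a more precise start at the cost of more files.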

Using Silverlight 2 for short audio caching

I'm attempting to use a large number of short sound samples in a game I'm creating in Silverlight 2. The samples are less than 2 seconds long.
I would prefer to load all the audio samples onto the canvas during initialization. I have been adding each media element to the canvas and to a generic list to manage it. So far, it appears to work.
When I play a sample the first time, it plays perfectly. If it has finished playing and I want to re-use the same element, it cuts off the first part of the sound. To play the sample again, I stop and then play the media element.
Is there another way I should handle the samples so that the audio is not clipped and performance stays good?
Also, it's probably a good idea to make sure that all of your audio samples are brought down to the client side initially. Depending on how you set it up, it's possible that the MediaElements are using their progressive download functionality to get the media files from the server. While there's nothing wrong with this per se (browser caching should be helping you out after the initial download), it does mean that you have to deal with the browser cache, and there are some potential issues there.
Possible steps to try:
Mark your audio files as "Content". This will get them balled up in the .xap.
Load your audio files into MemoryStreams (see Application.GetResourceStream method) and call MediaElement.SetSource().
HTH,
Erik
Some comments:
From MSDN:
Try to limit the number of MediaElement objects you have in your application at once. If you have over one hundred MediaElement objects in your application tree, regardless of whether they are playing concurrently or not, MediaFailed events may be raised. The way to work around this is to add MediaElement objects to the tree as they are needed and remove them when they are not.
You could try to seek to the start of the sample to reset the point currently being played before re-using it with:
mediaelement.Position = new TimeSpan();
See also MSDN's MediaElement.Position.
One technique you can use, although I'm not sure how well it will work in Silverlight, is to create one large file with all of your samples joined together (probably with a half-second or so of silence between each). Figure out the timecode for each sample, seek the media element to that position, and play. You'll only need as many media elements as simultaneous sounds you want to play.
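Working out the timecodes is simple bookkeeping. Here is a small sketch in Python (not Silverlight code) that assumes you know each sample's duration in seconds and the order in which the samples were joined. The sample names and the half-second gap are illustrative.

```python
def build_timecodes(durations, gap=0.5):
    """Return {name: (start, end)} in seconds for samples joined in dict order.

    `durations` maps a sample name to its length in seconds; `gap` is the
    silence inserted between consecutive samples in the combined file.
    """
    timecodes = {}
    position = 0.0
    for name, length in durations.items():
        timecodes[name] = (position, position + length)
        position += length + gap
    return timecodes


if __name__ == "__main__":
    table = build_timecodes({"jump": 0.8, "hit": 1.2, "coin": 0.4})
    for name, (start, end) in table.items():
        print(f"{name}: seek to {start:.2f}s, stop at {end:.2f}s")
```

The start value is what you would assign to the media element's Position before playing, and the end value tells you when to stop it so the next sample is not heard.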
