Motion detect from voice - artificial-intelligence

Motion detect from voice - artificial-intelligence

Does anyone know a mature AI model either paid or open source which can detect the motions from a recorded voice? There are a few good ones for text but haven't seen acceptable ones for voice yet.

Related

Is there a way to assign dB value to a specific audio files for reproduction in speakers?

Most speakers would have a volume knob, but is there a way to modify it for each audio file? My biggest concern is the speaker integrity as to not send too loud of a sound. I can't find a way to get the speakers data so it can be decided if it's safe to play it or not, as I intend to make it work in every speaker. It's a broad question but I dont even know if it's possible so I dont know where to start.

Audio files should ideally be at the maximum level that does not distort. I could see engaging in preprocessing all your recordings so that they're at a consistent sound level. That might be sufficient to account for the concern you are raising about client speaker integrity.
If you want to have the database store a number that reflects how "hot" a particular audio file is, then it seems to me that the amount of effort required to get that number might be comparable or even greater than the effort to preprocess the sounds to a consistent max-with-no-overflow level.
Once you have that number, how to make use of it? The HTML <audio> tag does not have an amplitude property. So, you would have to make use of some library in your client, such as the Web Audio API, or one of the many libraries that are built on top of it. With Web Audio API, the relevant point to change the volume is the gain node.

Differences in volume of audio content

We are having a Skill built for us which plays podcasts and audio snippets of videos. We currently serve all of this content through other traditional platforms like native mobile apps and a website.
One problem we have is that the volume of all this audio content is much lower than the rest of the sounds outputted from the Alexa device. We don't notice any similar discrepancies on the other aforementioned platforms, and the developers building the Skill say that there's no API which allows you to boost or manipulate the output volume (not the system volume).
Has anyone else had experiences with this sort of issue? We are reluctant to pump up the volume of our source files as it will affect all the other places they are listened to.

Short answer - yes it's tricky and you're not alone.
As detailed in this BBC report* on designing a voice application - "All of our experts have experienced problems in audio levels when mixing and mastering audio for smart speakers."
Official guidence from Amazon is:
Program loudness for Alexa should average -14 dB LUFS/LKFS
The true-peak value should not exceed -2 dBFS
It then goes so far as to say that,
Your skill may be rejected if program loudness:
Is lower than -19 dB LUFS
Is higher than -9dB LUFS
However, this mainly seems to apply to audio played either as part of a Flash Briefing or using the AudioPlayer Interface, and you may get away with deviating from this if using it as part of SSML output.
That said, in practice developers tend to bump the levels up on audio until it sounds good. Sticking strictly to the recommendations usually renders the volume way too low.
So, if you didn't want to increase the loudness of clips as it'd affect other platforms, your best options might be to create an Alexa-version of the audio - mixed slightly louder. Though, I recognize that this would be tedious.
*In the interest of transparency, I wanted to declare that I fed into this report.

Libraries for Face Detection

I need to develop a mobile app (primarily for Android, iOS, and Windows Mobile) for face detection. Obviously, OpenCV is the most well known. However, I'm unsure about the compatibility among the different OS'es. Besies OpenCV, are there other options? 2 key requirements:
-Open source/commercial libraries but must run locally/natively in devices without internet connection so Player Service API would not work
-Capable of tracking multiple faces in motion
Anyone can share their experiences/knowledge in this area? Any pointers greatly appreciated!

You are really pushing the margin a whole lot.
Face detection generally consists of three different areas.
1) Recognizing a face as a face (there is a mouth, a nose, eyes) This is useful for focusing a snapshot.
2) Recognizing facial features, looking for emotion (mouth in a smile) or eye tracking.
3) Facial recognition. Using the system to perform identification by attaching a name to a face.
You want to use a face recognition tool to perform tracking and count people entering a particular place, using a mobile phone.
First tracking is pretty difficult. Its one thing to perform simple face identity in a single frame snap shot. That's pretty easy. The problem is, you may find your frame rates so poor that you can only accommodate 1 frame every three or even every five seconds. That will make it nearly impossible to track and count faces. Counting faces is easy, but what's hard is to determine if that face in the screen was counted previously or is a new person entering the screen.
OpenCV has a whole lot of tools and examples out there for facial recognition, image tracking, etc. I'd strongly recommend playing with OpenCV and test its capabilities. I'd recommend the C/C++ versions (unless you are already a Python programmer) Here's a place to start, a blog entry I wrote a few months ago.
I really like the tutorials from Kyle Hounslow... Look him up on youtube. His videos are well thought out, they are interesting and he provides example code for all his work. Go ahead and watch all of those videos, and repeat all of those examples. Get a feel for what is available in frame rates using a laptop.
The next part of your task is porting stuff from OpenCV to Android/iOS. That's no easy task. I'm sure folks have tried, and I'm sure helpful hints are out there.
I don't mean to dissuade you from an awesome investigation but do note what you want to do is mighty difficult. You will have to invest some time to even determine where all the difficult areas are. And unfortunately you won't know effective frame rates and performance until you build some stuff and try it.
Good luck with the journey.

PlayerPrefs v. XML File for Saving Data in Unity3D?

I searched all around Google and YouTube and I could not find a definite answer to my question.
Should I use PlayerPrefs or XML files to save the all the data that is in Version 1 of my game? Or is their any other format of saving that you prefer?
My game does not have much data that will be saved onto it, but it is not very minimal either. The things that I will save in my game are: Options Menu (Sound FX on or off, game music on or off, and the 10 second Story Animation on or off), Scores for the distance you go in my game (around 10 scores, kinda like a billboard), and a very simple store where you can unlock two more characters after you have achieved a certain distance in the game. Their is not much right now but I am going to be releasing updates in my game and it will continue to grow and expand, so I would like the savings option that can be expanded upon the best. I tried my best to explain my unique scenario, if you need more information just let me know and I will add it. Thank you! :)

I searched all around Google and YouTube and I could not find a
definite answer to my question.
That's because both of them will accomplish the work. It doesn't really matter which one you use. They both can save your game data.
Now, In my own opinion, you should use XML for portability reasons.
When you save your game data with PlayerPrefs, you will be stuck with it on that platform. For example, PlayerPrefs is saved as .xml on Android and .plist on iOS and Mac OS, .dat on Windows App Store and as a registry key on HKCU\Software\[company name]\[product name] key on Windows. These are not interchangeable.
Assuming you want to allow players to transfer all player settings from one platform to another, that would be a problem. By using xml, your game will be able to use it on any platform you want. So a player can take their settings from their Android phone and use it on their PC or Mac without problems.
When you want to change your game engine from Unity to another such as UDK, you will have a hard time using the player's old data. I've seen a question about transferring player's data from Unity to another game engine. If this person used xml, that wouldn't be a problem at all. You will get negative feedback if your players lose their data due to game engine change. If you think that you never have to change game engine, remember that you are not the founder or the CEO of Unity Technologies SF. They make their own decisions and any other competitor can buy them anytime and shut them down. This happens all the time. Stick with xml if you really care about PlayerPrefs vs Xml.

Common web problems where Neural Networks could help

I was wondering if you creative minds out there could think of some situations or applications in the web environment where Neural Networks would be suitable or an interesting spin.
Edit: Some great ideas here. I was thinking more web centric. Maybe bot detectors or AI in games.

To name a few:
Any type of recommendation system (whether it's movies, books, or targeted advertisement)
Systems where you want to adapt behaviour to user preferences (spam detection, for example)
Recognition tasks (intrusion detection)
Computer Vision oriented tasks (image classification for search engines and indexers, specific objects detection)
Natural Language Processing tasks (document/article classification, again search engines and the like)

The game located at 20q.net is one of my favorite web-based neural networks. You could adapt this idea to create a learning system that knows how to play a simple game and slowly learns how to beat humans at it. As it plays human opponents, it records data on game situations, the actions taken, and whether or not the NN won the game. Every time it plays, win or lose, it gets a little better. (Note: don't try this with too simple of a game like checkers, an overly simple game can have every possible game/combination of moves pre-computed which defeats the purpose of using the NN).
Any sort of classification system based on multiple criteria might be worth looking at. I have heard of some company developing a NN that looks at employee records and determines which ones are the least satisfied or the most likely to quit.
Neural networks are also good for doing certain types of language processing, including OCR or converting text to speech. Try creating a system that can decipher capchas, either from the graphical representation or the audio representation.

If you screen scrap or accept other sites item sales info for price comparison, NN can be used to flag possible errors in the item description for a human to then eyeball.
Often, as one example, computer hardware descriptions are wrong in what capacity, speed, features that are portrayed. Your NN will learn that generally a Video card should not contain a "Raid 10" string. If there is a trend to add Raid to GPUs then your NN will learn this over time by the eyeball-er accepting an advert to teach the NN this is now a new class of hardware.
This hardware example can be extended to other industries.

Web advertising based on consumer choice prediction
Forecasting of user's Web browsing direction in micro-scale and very short term (current session). This idea is quite similar, a generalisation, to the first one. A user browsing Web could be proposed with suggestions with other potentially interesting websites. The suggestions could be relevance-ranked according to prediction calculated in real-time during user's activity. For instance, a list of proposed links or categories or tags could be displayed in form of a cloud and font size indicates rank score. Each and every click a user makes is an input to the forecasting system, so the forecast is being constantly refined to provide user with as much accurate suggestions, in terms of match against user's interest, as possible.

Ignoring the "Common web problems" angle request but rather "interesting spin" view.
One of the many ways that a NN can be viewed/configured, is as a giant self adjusting, multi-input, multi-output kind of case flow control.
So when you want to offer match ups that are fuzzy, (not to be confused directly with fuzzy logic per se, which is another area of maths/computing) NN may offer a usable alternative.
So to save energy, you offer a lift club site, one-offs or regular trips. People enter where they are, where they want to go and at what time. Sort by city and display in browse control.
Using a NN you could, over time, offer transport owners to transport seekers by watching what owners and seekers link up. As a owner may not live in the same suburb that a seeker resides. The NN learns over time what variances in owners, seekers physical location difference appear to be acceptable. So it can then expand its search area when offering a seekers potential owners.
An idea.

Search! Recognize! Classify! Basically everything search engines do nowadays could benefit from a dose of neural networks and fuzzy logic. This applies in particular to multimedia content (e.g. content-indexing images and videos) since that's where current search technologies are lagging behind.

One thing that always amazes me is that we still don't have any pseudo-intelligent firewalling technology. Something that says "hey his range of urls is making too much requests when they are not supposed to", blocks them, and sends a report to an administrator. That could be done with a neural network.
On the nasty part of things, some virus makers could find lucrative uses to neural networks. Adaptative trojans that "recognized" credit card numbers on a hard drive (instead of looking for certain cookies) or that "learn" how to mask themselves from detectors automatically.

I've been having fun trying to implement a bot based on a neural net for the Diplomacy board game, interacting via DAIDE protocols. It turns out to be extremely tricky, so I've turned to XCS to simplify the problem.

Suppose EBay used neural nets to predict how likely a particular item was to sell; predict what the best day to list items of that type would be, suggest a starting price or "buy it now price"; or grade your description based on how likely it was to attract buyers? All of those could be useful features, if they worked well enough.

Neural net applications are great for representing discrete choices and the whole behavior of how an individual acts (or how groups of individuals act) when mucking around on the web.
Take news reading for instance:
Back in the olden days, you picked up usually one newspaper (a choice), picked a section (a choice), scanned a page and chose an article (a choice), and read the basics or the entire article (another choice).
Now you choose which news site to visit and continue as above, but now you can drop one paper, pick up another, click on ads, change sections, and keep going with few limits.
The whole use of the web and the choices people make based on their demographics, interests, experience, politics, time of day, location, etc. is a very rich area for NN application. This is especially relevant to news organizations, web page design, ad revenue, and may even be an under explored area.
Of course, it's very hard to predict what one person will do, but put 10,000 of them that are the same age, income, gender, time of day, etc. together and you might be able to predict behavior that will lead to better designs. Imagine a newspaper (or even a game) that could be scaled to people's needs based on demographics. An ad man's dream !

How about connecting users to the closest DNS, and making sure there are as few bounces as possible between the request and the destination?

Friend recommendation in social apps (Linkedin,facebook,etc)

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight