I'd like to know if it's possible to use the Visual Recognition API physically, for example in an industrial conveyor that differentiates objects in order to sort them, or in a robot.
Your camera needs enough resolution to allow a trained Visual Recognition instance to differentiate between objects in different physical orientations. The Visual Recognition instance needs to be trained with both "good" and "not good" images. If you are looking at objects on a conveyor, you need to train with "good" images of the object in question in a variety of orientations (on its side, on its back, etc.), and a variety of "not good" images (again in different orientations). Once your instance is trained, you simply call the API and pass in the image that you want it to "look" at, and it returns what it "thinks" the object is, based on the training that you did.
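As a rough illustration, here is a minimal sketch of that classification call using the ibm-watson Python SDK; the classifier ID, API key, and service URL are placeholders, and parameter names may differ between SDK versions:

    # Minimal sketch: classify one conveyor frame against a custom-trained
    # classifier. "good_vs_not_good_1" and the credentials are placeholders.
    from ibm_watson import VisualRecognitionV3
    from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

    vr = VisualRecognitionV3(
        version="2018-03-19",
        authenticator=IAMAuthenticator("YOUR_API_KEY"),
    )
    vr.set_service_url("YOUR_SERVICE_URL")

    with open("conveyor_frame.jpg", "rb") as image:
        result = vr.classify(
            images_file=image,
            classifier_ids=["good_vs_not_good_1"],
        ).get_result()

    # Print each class the classifier "thinks" it sees, with its confidence
    for cls in result["images"][0]["classifiers"][0]["classes"]:
        print(cls["class"], cls["score"])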
I am trying to make a chatbot. All the chatbots I have looked at are built on structured data; I looked at Rasa, IBM Watson, and other well-known bots. Is there any way to convert unstructured data into some sort of structure that can be used for bot training? Consider the paragraphs below:
Packaging unit
A packaging unit is used to combine a certain quantity of identical items to form a group. The quantity specified here is then used when printing the item labels so that you do not have to label items individually when the items are not managed by serial number or by batch. You can also specify the dimensions of the packaging unit here and enable and disable them separately for each item.
It is possible to store several EAN numbers per packaging unit since these numbers may differ for each packaging unit even when the packaging units are identical. These settings can be found on the Miscellaneous tab:
There are also two more settings in the system settings that are relevant to mobile data entry:
When creating a new item, the item label should be printed automatically. For this reason, we have added the option ‘Print item label when creating new storage locations’ to the settings. When using mobile data entry devices, every item should be assigned to a storage location, where an item label is subsequently printed that should be applied to the shelf in the warehouse to help identify the item faster.
How can I make a bot from such data? Any lead would be highly appreciated. Thanks!
Would the idea in the attached picture work? (Just a thought.)
The data you are showing seems to be a good candidate for passage search. Basically, you would like to answer a user question with the most relevant paragraph found in your training data. This use case is handled by the Watson Discovery service, which can analyze unstructured data like yours; you can then query the service with input text and it answers with the closest passage found in the data.
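As a rough illustration, a minimal sketch of a passage query with the ibm-watson Python SDK; the environment/collection IDs and credentials are placeholders, and parameter names may vary between Discovery versions:

    # Minimal sketch: ask Discovery for the passages closest to a question
    from ibm_watson import DiscoveryV1
    from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

    discovery = DiscoveryV1(
        version="2019-04-30",
        authenticator=IAMAuthenticator("YOUR_API_KEY"),
    )
    discovery.set_service_url("YOUR_SERVICE_URL")

    response = discovery.query(
        environment_id="YOUR_ENV_ID",
        collection_id="YOUR_COLLECTION_ID",
        natural_language_query="How many EAN numbers can a packaging unit have?",
        passages=True,
    ).get_result()

    for passage in response.get("passages", []):
        print(passage["passage_text"])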
In my experience you can also get good results by implementing your own custom TF/IDF algorithm tailored for your use case (TF/IDF gives you a nice similarity search that handles, e.g., the stopwords for you).
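For instance, a minimal sketch using scikit-learn's off-the-shelf TF/IDF, assuming each documentation paragraph is one passage (the passages below are abbreviated from the question):

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    # One entry per documentation paragraph (abbreviated here)
    passages = [
        "A packaging unit is used to combine a certain quantity of identical items ...",
        "It is possible to store several EAN numbers per packaging unit ...",
        "When creating a new item, the item label should be printed automatically ...",
    ]

    vectorizer = TfidfVectorizer(stop_words="english")  # stopword handling for free
    matrix = vectorizer.fit_transform(passages)

    def best_passage(question):
        # Return the passage most similar to the question in TF/IDF space
        scores = cosine_similarity(vectorizer.transform([question]), matrix)
        return passages[scores.argmax()]

    print(best_passage("How many EAN numbers can a packaging unit have?"))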
Now, if your goal is to bootstrap a rule-based chatbot from this kind of data, the data is not ideal. For a rule-based chatbot the best data would be actual conversations between users asking questions about the target domain and answers from a subject matter expert. With such data you might at least be able to do some analysis to pinpoint the relevant topics and domains the chatbot should handle; however, I think you will have a hard time using documentation like this to bootstrap a set of intents (questions the users will ask) for a rule-based chatbot.
TLDR
If I were to use a Watson service, I would start with Watson Discovery. Alternatively, I would implement my own search algorithm, starting with TF/IDF (which maps rather nicely to your proposed solution).
Can we use the newly launched Microsoft Cognitive Services for crowd analysis and audience measurement? I need to create an application that can detect faces in live video and provide characteristics like gender, age, and mood.
The Face API is designed for image processing; if you want to use it on a camera stream, you need to handle the input yourself, e.g. by picking several frames and sending each image to the Face API cloud service. You can use [1] as a reference (though the code might be a little bit old).
[1] https://github.com/Microsoft/Cognitive-Samples-VideoFrameAnalysis
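For instance, a minimal sketch of that frame-sampling approach with OpenCV and the REST endpoint; the region, key, and sampling rate are placeholders, and the available face attributes depend on your API version:

    import cv2
    import requests

    ENDPOINT = "https://YOUR_REGION.api.cognitive.microsoft.com/face/v1.0/detect"
    HEADERS = {
        "Ocp-Apim-Subscription-Key": "YOUR_KEY",
        "Content-Type": "application/octet-stream",
    }
    PARAMS = {"returnFaceAttributes": "age,gender,emotion"}

    cap = cv2.VideoCapture(0)           # default camera
    frame_index = 0
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        frame_index += 1
        if frame_index % 30:            # sample roughly one frame per second at 30 fps
            continue
        ok, jpeg = cv2.imencode(".jpg", frame)
        if not ok:
            continue
        faces = requests.post(ENDPOINT, headers=HEADERS, params=PARAMS,
                              data=jpeg.tobytes()).json()
        for face in faces:
            print(face.get("faceAttributes"))
    cap.release()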
I'm planning to make an Android app that needs to recognize buildings in the city.
I need help choosing the most important unique features of the buildings, such that the size of the features stays small, as the application will not be practical if the database becomes large. Is running the application offline possible, or should I send the features to a remote server to compute the similarity between the pictures?
Actually, you could choose some simple but effective features (the building's logo, the foreground of the building) for recognizing offline.
To make the result more accurate, you could also send the GPS information back to the server (see the sketch below).
Don't recognize the building only by analyzing the picture; sometimes you can get more information from the Android device itself.
Good luck.
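As a rough sketch of the server side (assuming OpenCV is available; the radius and database layout are hypothetical): first narrow the candidates by GPS distance, then compare image features.

    import math
    import cv2

    def distance_m(lat1, lon1, lat2, lon2):
        # Haversine distance in metres between two GPS fixes
        r = 6371000
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dp = math.radians(lat2 - lat1)
        dl = math.radians(lon2 - lon1)
        a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
        return 2 * r * math.asin(math.sqrt(a))

    orb = cv2.ORB_create()                                  # compact binary features
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

    def match_score(query_img, candidate_img):
        # Number of ORB descriptor matches as a crude similarity score
        _, q = orb.detectAndCompute(query_img, None)
        _, c = orb.detectAndCompute(candidate_img, None)
        if q is None or c is None:
            return 0
        return len(matcher.match(q, c))

    def recognize(photo_path, lat, lon, candidates, radius_m=500):
        # candidates: list of (name, lat, lon, image_path) rows from your database
        query = cv2.imread(photo_path, cv2.IMREAD_GRAYSCALE)
        nearby = [c for c in candidates if distance_m(lat, lon, c[1], c[2]) < radius_m]
        if not nearby:
            return None
        best = max(nearby, key=lambda c: match_score(
            query, cv2.imread(c[3], cv2.IMREAD_GRAYSCALE)))
        return best[0]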
I want to build an automatic speech recognizer with a trained platform, i.e. voice models.
For example:
I have two words that sound very similar, so the system must listen to the complete word, in any dialect, verify it, and give the output.
How do I do this?
I have searched, but I'm completely blank on this point.
Which technology do you want to use? There are different frameworks available out there, e.g. the Dragonfly framework (https://code.google.com/p/dragonfly) or the System.Speech.Recognition namespace for .NET projects. For mobile devices you could take a closer look at the speech recognition API offered by Google.
From this point of view, fine-tuning the Android speech recognition API is not possible; you may need to start from scratch to do this.
If you want to keep using the Google speech recognition API, then you need to do post-processing. This is called NLU (Natural Language Understanding) or NLP (Natural Language Processing).
The simple concept is: whatever STT (speech-to-text) result comes back from the Google API, you need to group the variants into one final output, so that different accents or intonations map to the same result. This processing is also valuable when the bot needs to understand content and take some action, e.g. "What is the weather in Seoul?"
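A minimal sketch of that grouping step, using only the Python standard library (the phrase list is hypothetical):

    import difflib

    # Canonical phrases the bot understands (hypothetical)
    EXPECTED = [
        "what is the weather in seoul",
        "turn on the light",
        "play some music",
    ]

    def normalize(transcript):
        # Map a noisy STT transcript onto the closest expected phrase
        matches = difflib.get_close_matches(transcript.lower(), EXPECTED,
                                            n=1, cutoff=0.6)
        return matches[0] if matches else None

    print(normalize("what's the weather in soul"))  # -> "what is the weather in seoul"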
Going back to your question: fine-tuning to differentiate similarly pronounced words requires an AM (Acoustic Model) and an LM (Language Model) that were trained on that kind of word set. So you need to train a model from scratch, or adapt an existing model via acoustic model adaptation, which also works.
A good starting point in open source is HTK or Sphinx. If you have the budget to buy, AT&T's WATSON is the best tool in the speech recognition area so far.
I think you should take a different approach that is simpler than trying to get Sphinx to work.
Use a phonetic matching algorithm like Soundex to find whether the user is more likely to have said one word or the other. I would modify the Soundex algorithm to make it easier to match strings. If your words are different enough, it should do a good job.
Here is some code to do it.
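The code the original answer linked to isn't reproduced here; as a stand-in, here is a minimal Soundex sketch in Python (the classic American Soundex rules, which you could loosen as suggested above):

    def soundex(word):
        # Classic American Soundex: first letter plus three digits
        word = word.upper()
        groups = {"BFPV": "1", "CGJKQSXZ": "2", "DT": "3",
                  "L": "4", "MN": "5", "R": "6"}

        def code(ch):
            for letters, digit in groups.items():
                if ch in letters:
                    return digit
            return ""  # vowels and Y carry no code

        out = [word[0]]
        prev = code(word[0])
        for ch in word[1:]:
            if ch in "HW":
                continue           # H and W do not break runs of equal digits
            d = code(ch)
            if d and d != prev:
                out.append(d)
            prev = d               # vowels reset the run
        return ("".join(out) + "000")[:4]

    # Similar-sounding words land on the same code
    print(soundex("Robert"), soundex("Rupert"))   # R163 R163
    print(soundex("their"), soundex("there"))     # T600 T600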
If I want to implement a CMS for mobile devices, what kinds of points should I take into account?
For example: make the page size smaller, use optimized (small) pictures. Any other ideas?
Also, what kinds of rules can be applied when converting web pages that were designed for desktop browsers into ones that display easily in mobile browsers?
I know that mobile devices vary widely in their capabilities and properties, but I'm still trying to list out some rules.
Any other ideas, suggestions, questions, and advice on this topic are welcome.
Thanks for your opinions and answers.
A short foreword: all the things I'm listing below are things the main product of the company I work for already does or has worked out a solution for; the whole goal of this answer is to give you pointers.
Identifying the phone
When dealing with mobile as a web context, it's absolutely imperative that you identify the phone correctly; that should be the highest priority. Here are a couple of issues with identifying phones and their features:
Do not use a userAgent.contains("iPhone") detection scheme (see the sketch after this list). There are already loads of web bots and other applications that contain "iPhone" in their user agent string, so you would identify them incorrectly.
Not all phones even send User-Agent headers. However, some of those send UAProf URLs, which describe all the phone's features in RDF format. Unfortunately this introduces the next two problems:
Obviously you won't have access to every single device's data out there, and you're bound to use public data repositories such as WURFL. These databases are incomplete, lag slightly behind, or don't contain the data you'd like to have. They are your best bet for an initial data set, though.
UAProfs lie. Yes, they contain false information, lots of it! Partly this is because the manufacturers forget to update the XML files, and partly because the UAProf files are written during the development of the phone, and as we know, features do change during development.
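As a minimal sketch of the detection pitfall from the first point (the device-database dict below is an illustrative stand-in for a real repository like WURFL):

    # Googlebot's smartphone crawler carries "iPhone" in its UA string,
    # so a naive substring check misfires:
    BOT_UA = ("Mozilla/5.0 (iPhone; CPU iPhone OS 8_3 like Mac OS X) "
              "AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12F70 "
              "Safari/600.1.4 (compatible; Googlebot/2.1; "
              "+http://www.google.com/bot.html)")
    print("iPhone" in BOT_UA)       # True, yet this is a crawler, not a phone

    # Safer direction: exact lookup of the full UA in a device repository
    # (the single entry below is illustrative only)
    DEVICE_DB = {
        "Mozilla/5.0 (iPhone; CPU iPhone OS 8_3 like Mac OS X) AppleWebKit/600.1.4 "
        "(KHTML, like Gecko) Version/8.0 Mobile/12F70 Safari/600.1.4":
            {"device": "iPhone", "touch": True},
    }

    def detect(user_agent):
        return DEVICE_DB.get(user_agent)  # None for anything unrecognized

    print(detect(BOT_UA))           # None: correctly treated as unknown, not an iPhone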
When relying on a feature, make sure you're not relying on a specific version of a specific phone. For example, BlackBerry has a feature called Tile, which is basically a really fancy bookmark, but you can't just serve it to all BlackBerry phones; you have to identify the operating system version of the actual phone to serve the right variation of the Tile. The same goes for touch screens: the iPhone wasn't the first with a touch screen and most certainly isn't the only one. Also, don't expect the device to have only one form of input; for example, the Nokia N900 has a touch screen, a physical keyboard, and even a stylus.
Creating the actual pages
Thankfully this is something people have agreed upon: when creating the pages, you're supposed to use XHTML-MP. But oh, how one would wish things were this easy...
All phones have differing levels of XHTML-MP/CSS support. As an example, if I remember correctly, some older BlackBerries don't support background-color for block elements, or header tags. We've also seen incorrect ordering of span elements when there are several in a row. And for some reason tables are really hard. Basically, you have to keep the markup and styling tricks to a minimum.
You can't test the existence of a feature by using the feature itself. If you want to detect JavaScript support, you might think that adding a bit of JavaScript to the page for that purpose alone would work, right? Nope; that crashes a significant percentage of the mobile phones visiting your site. Sure, new phones don't crash, but not everyone has bought their phone in the last 12 months. Also, mobile-specific JavaScript APIs differ per manufacturer; as yet another example, there are currently at least three different APIs for JavaScript-based geolocation detection, none of them interoperable with the others.
Add all these on top of normal CMS features (security, content management and transformation, caching, modularity, visitor tracking and whatnot) and you should have some sort of picture of how everything affects everything and how you really should consider the cost of making your own.
In fact even though this is sort of against the general spirit of SO, I'd strongly suggest for you to get a readily made solution such as ours and use that instead for your site building needs. After all, our product has seven years worth of specialized development under its hood.
A couple that we used ...
A CMS targeted at mobile devices should be able to detect the device type and detect (or have a database of) screen resolutions so that content, particularly images, can be scaled appropriately (see the sketch at the end of this answer).
The rendering engine should also be able to determine if the device can handle HTML or WAP and switch markup languages appropriately.
Paging capability on the output, as opposed to rendering very large pages (if content pages are large), is also helpful.
Clean integration with the corresponding web site's CMS (so content doesn't need to be produced twice) is also helpful if there is, in fact, a corresponding full-size web site.
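For the image-scaling point above, a minimal sketch with Pillow (assumed available); the target width would come from the device-database entry matched for the requesting phone, and the value here is a placeholder:

    from PIL import Image

    def scale_for_device(src_path, dst_path, max_width=240):
        # Downscale to the device's screen width, preserving aspect ratio
        img = Image.open(src_path)
        if img.width > max_width:
            ratio = max_width / float(img.width)
            img = img.resize((max_width, int(img.height * ratio)))
        img = img.convert("RGB")  # JPEG cannot carry an alpha channel
        img.save(dst_path, "JPEG", quality=70, optimize=True)  # small files for slow links

    scale_for_device("hero.png", "hero_mobile.jpg")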