Do you know of any good sets of training images for my test neural network? Preferably a tagged set of images of numbers or letters, or simple symbols. Faces or real images might be too complex at this stage. (I am trying to implement a Boltzmann machine.)
The UCI Machine Learning Repository has a bunch of different sets of training data, including handwritten digits, for example the Optical Recognition of Handwritten Digits Data Set.
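If you want something you can load and test against in a couple of lines, scikit-learn ships a copy of that UCI optical digits set, and it also includes a Bernoulli RBM you could use as a sanity check for your own Boltzmann machine. A minimal sketch (the binarization threshold is my own choice):

    # Load the UCI "Optical Recognition of Handwritten Digits" data and fit an RBM.
    from sklearn.datasets import load_digits
    from sklearn.neural_network import BernoulliRBM

    digits = load_digits()                        # 1797 labelled 8x8 images, pixel values 0..16
    X = (digits.data / 16.0 > 0.5).astype(float)  # binarize for Bernoulli (0/1) units

    rbm = BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=20, random_state=0)
    rbm.fit(X)                                    # persistent contrastive divergence under the hood
    print(rbm.components_.shape)                  # (64, 64): one weight row per hidden unit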
Another large repository of datasets, organized by application domain (classification, regression, segmentation, ...), is MLcomp. It also allows you to compare the performance of your algorithm with many other standard methods.
For my master's thesis, I have to run inference with a pre-built, pre-trained (with TensorFlow) deep neural network model. I received it in two different formats (hdf5/h5 and a frozen graph, .pb). The inference shall be done on a cluster; so far we only have a GPU version running (with TensorRT and a UFF model). So my first job seems to be to run inference on one CPU before making use on the cluster possible.
We are using the model within computational fluid dynamics (CFD) simulations. That is also my academic background, so as you can imagine I have only a little knowledge about deep learning. Anyway, it is not my job to change or train the model but just to use it for inference. Our CFD code is written in C++, which is the only programming language I use at an advanced level (it is obviously no problem to use C, but I have no idea of Python).
After many Google searches I realized that I do not really know how to start. I had thought it would be possible to skip all the training and TensorFlow stuff. I know how neural networks work and how they calculate their output values from their input values, and I have the most important theoretical knowledge, but no programming experience in this field. Is it somehow possible to take the model they gave me (either hdf5/h5 or the frozen graph) and build inference code using exclusively C or C++? I already found the C API and installed it within a Docker container (where I also have TensorFlow), but I am really not sure what the next step is. What can I do with the C API? How would you write C/C++ code for inference with a DNN model that has been prepared for it?
OpenCV provides tools to run deep learning models, but they are limited to the computer vision field. See here.
You can perform classification, object detection, face detection, text detection, segmentation, and so on by using the API provided by OpenCV. The examples are fairly straightforward.
Both Python and C++ versions are available.
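As a rough sketch of what that looks like (shown in Python; the C++ cv::dnn calls mirror it almost name-for-name, and the file names, input size and scaling here are placeholders for whatever your model actually expects):

    import cv2

    # Load the frozen TensorFlow graph (.pb) with OpenCV's dnn module.
    net = cv2.dnn.readNetFromTensorflow("frozen_model.pb")

    img = cv2.imread("input.jpg")
    # Pack the image into a 4D blob (NCHW); size and scale must match training.
    blob = cv2.dnn.blobFromImage(img, scalefactor=1.0 / 255, size=(224, 224))
    net.setInput(blob)
    out = net.forward()    # runs inference on the CPU by default
    print(out.shape)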
I wish to train a robot on an indoor environment with multiple obstacles. Is there any way in Verilog to save the data produced by this training? The robot should use this training data when deployed, together with dynamic data, to move from one point to another in the trained indoor environment. What would be the best way to implement this in Verilog?
Verilog is a hardware design language, so you should think in terms of 'hardware'. There is no such thing as a "hardware database".
The nearest and simplest thing I can think of would be to build an SD-card interface and write raw data sectors to the SD card.
You will then need a special program to extract the data, as computers expect to see a FAT32 data structure.
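That "special program" on the PC side can be very small; a sketch in Python, where the device path, start sector and length are assumptions for illustration:

    # Read back raw sectors the FPGA wrote to the SD card, bypassing any filesystem.
    SECTOR_SIZE = 512
    START_SECTOR = 2048      # wherever your design started writing (assumption)
    NUM_SECTORS = 256

    with open("/dev/sdb", "rb") as dev:            # raw block device; needs root
        dev.seek(START_SECTOR * SECTOR_SIZE)
        raw = dev.read(NUM_SECTORS * SECTOR_SIZE)

    with open("training_data.bin", "wb") as out:   # dump for later decoding
        out.write(raw)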
By the way: the option to read or write files only exists in a simulation environment. I don't know of any synthesis tool that supports file I/O.
I recently got a position working as a data scientist on ML. My question is as follows: is it possible to train an algorithm directly from a MySQL database, and is it similar to the way you train from a CSV file? Moreover, I would like to know about working with a very unbalanced dataset: when you use, for instance, 20 percent of the data for testing, are the negative and positive cases divided between the training and test sets in equal proportion? Can anyone suggest a good tutorial or documentation?
Sure you can train your model directly from the database; this is what happens all around in production systems. Your software should be designed so that it does not matter whether your data source is SQL, CSV or whatever. As you don't mention the programming language it is hard to say how to do it, but in Python you can take a look here: How do I connect to a MySQL Database in Python?
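To make that concrete (the connection string, table and column names below are placeholders): once the query result is in a DataFrame, training is identical to the CSV case.

    import pandas as pd
    from sqlalchemy import create_engine
    from sklearn.ensemble import RandomForestClassifier

    # Placeholder credentials; needs a MySQL driver such as PyMySQL installed.
    engine = create_engine("mysql+pymysql://user:password@localhost/mydb")

    # From here on it is the same as having read a CSV into a DataFrame.
    df = pd.read_sql("SELECT * FROM training_samples", engine)
    X, y = df.drop(columns=["label"]), df["label"]

    clf = RandomForestClassifier().fit(X, y)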
If your data set is unbalanced, as it often is in reality, you can use class weights to make your classifier aware of that, e.g. in Keras/scikit-learn you can just pass the class_weight parameter. Be aware that if your data set is too small, you can run into problems with default measures like accuracy. Better to take a look at the confusion matrix or at other metrics like the Matthews correlation coefficient.
Another good reference:
How does the class_weight parameter in scikit-learn work?
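One point from your question the above does not spell out: scikit-learn's train_test_split does not preserve the class proportions by default, you have to ask for that with stratify. A minimal sketch combining both ideas (X and y stand for whatever you loaded from the database):

    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import confusion_matrix, matthews_corrcoef

    # stratify=y keeps the positive/negative ratio equal in train and test sets.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42)

    # class_weight="balanced" reweights classes inversely to their frequency.
    clf = LogisticRegression(class_weight="balanced").fit(X_train, y_train)

    y_pred = clf.predict(X_test)
    print(confusion_matrix(y_test, y_pred))
    print(matthews_corrcoef(y_test, y_pred))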
I have only a little background knowledge about neural networks (NN).
However, so far I have learned that training the network is the really expensive part, and that processing data with an already-trained network is ultimately much cheaper/faster.
Still, I'm not entirely sure what the expensive parts are within the processing chain. As far as I know, it's mostly matrix multiplication for standard layers: not the cheapest operation, but definitely doable. On top of that there are other layers, like max-pooling, and activation functions at each node, which might have higher complexity. Are those the bottlenecks?
Now, I wonder whether the "simple" hardware in smartphones, or even cheap stand-alone hardware like a Raspberry Pi, is capable of running a (convolutional) neural network to do, for example, image processing such as object detection. Of course, I mean doing the calculations on the device itself, not transmitting the data to a second, powerful machine or even a cloud that does the calculations before sending the results back to the smartphone.
If so, roughly how large can such a network be (e.g., how many layers and how many neurons per layer)? And finally, are there any good projects or libraries that use NNs on such reduced, simpler hardware?
Current neural networks use convolutional layers, which perform a convolution on the input image. Also, the high number of parameters and dimensions is a real problem for low-budget hardware. But there are approaches that work on Android for newer smartphones, like SqueezeNet. Much of the work is actually done on GPUs nowadays, so I am not sure whether it would work on a Raspberry Pi.
A better description of the topic than I could ever write can be found here: https://hackernoon.com/how-hbos-silicon-valley-built-not-hotdog-with-mobile-tensorflow-keras-react-native-ef03260747f3?gi=adb83ae18a85, where they actually built a neural network for a mobile phone. You can download the app and try it on your mobile phone if you have Android or iOS.
There is a lot of research going on in this area. Roughly two lines of work deal with this:
Efficient network architectures
Some post-processing for an already trained model
You can have a look at ICNet Figure 1, where some architectures for fast inference for semantic segmentation are shown. Many of these models can be tweaked to do classification or other image processing tasks in real time. These models all have a low number of parameters compared to other networks and can be evaluated on embedded platforms.
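To get a feeling for the sizes involved, you can instantiate such an efficient architecture and simply count its parameters; Keras ships MobileNet with a width multiplier (alpha) precisely for shrinking it onto weak hardware. A sketch (parameter counts are approximate):

    from tensorflow.keras.applications import MobileNet

    # alpha < 1 thins every layer; 0.25 is the smallest stock configuration.
    small = MobileNet(alpha=0.25, weights=None)
    full = MobileNet(alpha=1.0, weights=None)

    print(small.count_params())   # roughly 0.5 million parameters
    print(full.count_params())    # roughly 4.2 million parameters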
For "post-hoc" optimizations you can look at TensorFlows Graph Transform Tool that do many of such for you: Graph Transform Tool
or maybe look into the paper by Song Han Deep Compression where many of these ideas are described. Song Han has also given many great lectures in this area and you can e.g. find one in the CS231n class at Stanford.
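The Graph Transform Tool is usually run as a command-line binary, but it is also exposed in Python under TF 1.x; a sketch, where the graph file and the input/output tensor names are placeholders:

    import tensorflow as tf
    from tensorflow.tools.graph_transforms import TransformGraph

    # Load a frozen graph (file name is a placeholder).
    graph_def = tf.GraphDef()
    with tf.gfile.GFile("frozen_model.pb", "rb") as f:
        graph_def.ParseFromString(f.read())

    # Strip training-only nodes and quantize weights to shrink the model.
    optimized = TransformGraph(
        graph_def, ["input"], ["output"],
        ["strip_unused_nodes", "fold_constants(ignore_errors=true)",
         "fold_batch_norms", "quantize_weights"])

    with tf.gfile.GFile("optimized_model.pb", "wb") as f:
        f.write(optimized.SerializeToString())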
The speed of the inference phase depends on a lot of things other than the number of parameters or neurons, so I don't think there is a rule of thumb for the maximum number of neurons.
What do you use?
Fast Artificial Neural Network Library (FANN) is a free open-source neural network library which implements multilayer artificial neural networks in C, with support for both fully connected and sparsely connected networks. Cross-platform execution in both fixed and floating point is supported. It includes a framework for easy handling of training data sets. It is easy to use, versatile, well documented, and fast. PHP, C++, .NET, Ada, Python, Delphi, Octave, Ruby, Prolog, Pure Data and Mathematica bindings are available.
FannTool, a graphical user interface for the library, is also available.
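For a flavour of the API, this is roughly what FANN code looks like via the Python bindings; the calls mirror the C functions (fann_create_standard, fann_train_on_file, ...), but treat the exact binding and file names as assumptions:

    from fann2 import libfann   # binding package name is an assumption

    ann = libfann.neural_net()
    ann.create_standard_array((2, 3, 1))   # 2 inputs, 3 hidden, 1 output
    ann.set_activation_function_hidden(libfann.SIGMOID_SYMMETRIC)
    ann.set_activation_function_output(libfann.SIGMOID_SYMMETRIC)

    # Trains on FANN's plain-text data format, e.g. an XOR truth table.
    ann.train_on_file("xor.data", 1000, 100, 0.001)
    ann.save("xor.net")

    print(ann.run([1.0, -1.0]))   # run the trained net on one input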
There are a lot of different network simulators, depending on how detailed you want your simulation to be and what kind of network you want to simulate.
NEURON and GENESIS are good if you want to simulate full biological networks (which I'm guessing you probably don't), even down to the behaviour of dendrites etc.
NEST and SPLIT and some others are good for doing population simulations, where you create the population on a node-by-node basis and see what the whole population does. This is pretty much the 'industry' standard approach, and is used a lot in research and commercial applications, so they are worth looking into. I know that IBM uses SPLIT for some of their research.
MIIND is good if you want to use differential equations to model what a population would do, but this approach is relatively new and computationally expensive (if very cool).
Not sure if that is exactly what you wanted!
(N.B. if you google any of the names in caps along with the word "simulator" you will end up at the relevant web page =)
Whenever I've wanted to play around with a data mining algorithm quickly, I just load up Weka. It's pretty complex, but it implements a lot of algorithms (including neural networks) with plenty of customizability. Plus, it has some visualizations for NNs.
It is old, but I have always used NeuroShell 2 when not using my own code. Unfortunately, it is not free. I think the newer NeuroShells are designed only for predicting stocks.
If you're looking to experiment with deep learning, you should look into
Theano
Pylearn2 (which is based on Theano)
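To give a flavour of Theano's style: you describe the computation symbolically, compile it into a callable function, and gradients come for free. A minimal sketch:

    import theano
    import theano.tensor as T

    x = T.dmatrix("x")                    # symbolic matrix of doubles
    s = 1 / (1 + T.exp(-x))               # elementwise logistic sigmoid, still symbolic
    logistic = theano.function([x], s)    # compile into a callable
    print(logistic([[0, 1], [-1, -2]]))

    # Gradients are derived symbolically, which is what training boils down to.
    y = T.dscalar("y")
    square_grad = theano.function([y], T.grad(y ** 2, y))
    print(square_grad(4.0))               # -> 8.0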