How can I handle sparse features with a deep neural network for classification?

I am designing a neural network to classify data with sparse features and imbalanced classes. It's a binary classification problem, and I used SMOTE to resolve the class imbalance. How should I deal with the sparse features in a neural network?
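One practical route, shown here as a minimal sketch of my own (not from the question; the layer sizes, batch size, and synthetic data are all illustrative), is to keep the features in a scipy CSR matrix, apply SMOTE directly on the sparse training data, and densify only one mini-batch at a time so memory stays bounded:

    import numpy as np
    import tensorflow as tf
    from scipy import sparse
    from imblearn.over_sampling import SMOTE

    # Synthetic stand-in for the real data: sparse features, ~10% positives.
    X = sparse.random(10_000, 500, density=0.01, format="csr", dtype=np.float32)
    y = (np.random.rand(10_000) < 0.1).astype(np.int32)

    # SMOTE accepts CSR input and returns a resampled sparse matrix.
    X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)

    def batches(X_csr, y_arr, batch_size=256):
        for i in range(0, X_csr.shape[0], batch_size):
            # Densify one small batch at a time instead of the whole matrix.
            yield X_csr[i:i + batch_size].toarray(), y_arr[i:i + batch_size]

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(X.shape[1],)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")

    for epoch in range(3):
        for xb, yb in batches(X_res, y_res):
            model.train_on_batch(xb, yb)

If the features are sparse because they encode high-cardinality categorical variables, an embedding layer over the non-zero indices is usually a better fit than densifying.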

Related

Is there a concept of neural networks that edit neural networks?

I think that neural networks could edit their own networks, and that if we combined this with evolutionary algorithms, we could build a strong artificial intelligence.
Is there a concept of a neural network editing itself?
There's really no point to this: gradient descent is already a very good optimizer. There is something like this called AutoML, though, which basically uses another model to tune a neural network's hyperparameters...
If a neural network's only input was its own weights, then how would it know that the error of the network it is editing is supposed to go down?
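For what it's worth, the closest concrete construction I know of is a "hypernetwork", where one trainable network emits the weights of another. A toy sketch (entirely illustrative, not from this thread; all shapes and data are made up):

    import tensorflow as tf

    code = tf.Variable(tf.random.normal([1, 8]))   # trainable embedding
    hyper = tf.keras.layers.Dense(4 * 3)           # emits a 4x3 weight matrix
    hyper.build((1, 8))                            # create its variables now

    def target_forward(x):
        w = tf.reshape(hyper(code), (4, 3))        # weights generated by 'hyper'
        return x @ w                               # target net has no weights of its own

    x = tf.random.normal([16, 4])
    y = tf.random.normal([16, 3])
    opt = tf.keras.optimizers.Adam(0.01)
    train_vars = [code] + hyper.trainable_variables

    for _ in range(200):
        with tf.GradientTape() as tape:
            loss = tf.reduce_mean((target_forward(x) - y) ** 2)
        opt.apply_gradients(zip(tape.gradient(loss, train_vars), train_vars))

Note that gradient descent still drives the "editing" here, which echoes the point in the answer above.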

How computational expensive is running data through a Neural Network? Can Smartphones or Raspberry PIs do it?

I have only a little background knowledge about neural networks (NNs).
However, from what I have learnt so far, training the network is the actually expensive part, while processing data with an already trained network is much cheaper and faster.
Still, I'm not entirely sure what the expensive parts within the processing chain are. As far as I know, it's mostly matrix multiplication for standard layers. Not the cheapest operation, but definitely doable. On top of that, there are other layers, like max-pooling, or activation functions at each node, which might have higher complexity. Are those the bottlenecks?
Now, I wonder whether the "simple" hardware in smartphones, or even cheap stand-alone hardware like a Raspberry Pi, is capable of using a (convolutional) neural network to do, for example, image processing such as object detection. Of course, I mean doing the calculations on the device itself, not transmitting the data to a second, powerful machine or a cloud that does the calculations before sending the results back to the smartphone.
If so, how large a network can such hardware handle (e.g. how many layers and how many neurons per layer), roughly estimated? And finally, are there any good projects or libraries using NNs on such reduced, simpler hardware?
Current neural networks use convolutional layers, which perform a convolution over the input image. The high number of parameters and dimensions is also a real problem for low-budget hardware. Still, there are approaches that work on newer Android smartphones, like SqueezeNet. Much of the work is actually done on GPUs nowadays, so I am not sure whether it works on a Raspberry Pi.
A better description of the topic than I could ever write can be found here: https://hackernoon.com/how-hbos-silicon-valley-built-not-hotdog-with-mobile-tensorflow-keras-react-native-ef03260747f3?gi=adb83ae18a85, where they actually built a neural network for a mobile phone. You can download the app and try it on your phone if you have Android or iOS.
There is a lot of research going on in this area. Roughly two lines of work deal with it:
Efficient network architectures
Some post-processing for an already trained model
You can have a look at Figure 1 of ICNet, where some architectures for fast semantic segmentation inference are shown. Many of these models can be tweaked to do classification or other image processing tasks in real time. They all have a low number of parameters compared to other networks and can be evaluated on embedded platforms.
For "post-hoc" optimizations you can look at TensorFlows Graph Transform Tool that do many of such for you: Graph Transform Tool
or maybe look into the paper by Song Han Deep Compression where many of these ideas are described. Song Han has also given many great lectures in this area and you can e.g. find one in the CS231n class at Stanford.
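To make the "post-hoc" route concrete, here is a minimal sketch (my example, not from the answer) using TensorFlow Lite's post-training quantization, a readily available form of this kind of compression for phones and boards like the Raspberry Pi:

    import tensorflow as tf

    # Stand-in for an already trained Keras model.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(100,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]  # quantize weights
    tflite_bytes = converter.convert()

    with open("model.tflite", "wb") as f:
        f.write(tflite_bytes)  # run on-device with the TFLite runtime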
The speed of the inference phase depends on a lot of things besides the number of parameters or neurons, so I don't think there is a rule of thumb for a maximum number of neurons.

Using Neural Networks Without Training Them

My task for my university assignment is to create an AI for a "MOBA"-style strategy game. I have looked into using neural networks for this, but I cannot see any need to train the network beforehand.
In other words, would it still be considered a neural network if I hard-code the weights and simply apply minor weight changes at runtime?
A neural network is one thing (a structure of neurons, synapses, etc.), and the learning algorithm is a completely different thing.
So if you ask whether a NN without a learning algorithm can still be called a NN, I think the answer is yes, it can.
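As a toy illustration of the question's premise (hand-coded weights, minor changes at runtime, no learning algorithm; every number here is made up):

    import numpy as np

    W1 = np.array([[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]])  # hand-picked weights
    W2 = np.array([[1.0], [-1.0]])

    def policy(state):                  # state: three game features
        h = np.tanh(state @ W1)         # hidden layer
        return float(h @ W2)            # scalar action score

    def nudge(W, scale=0.01):
        # "Minor weight changes at runtime" -- no gradients, no training data.
        return W + scale * np.random.randn(*W.shape)

    score = policy(np.array([0.2, 0.7, 0.1]))
    W1 = nudge(W1)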

Is a neural network a lazy or eager learning method?

Is a neural network a lazy or eager learning method? Different web pages say different things, so I want a solid answer with good literature to back it up. The most obvious book to look in would be Mitchell's famous Machine Learning book, but skimming through the whole thing I can't see the answer. Thanks :).
Looking at the definitions of the terms lazy and eager learning, and knowing how a neural network works, I believe it is clearly eager. A trained network is a generalisation function; all the weights and paths used to arrive at a classification are entirely determined by the training data, but the training data itself is not retained for the purposes of decision making.
An important distinction is that a lazy system stores its training data and uses it directly to determine a solution. An eager system determines a function from the training data, and thereafter the training data is no longer required; that is to say, you cannot determine what the training data was from an eager system's function. A neural network certainly fits that description. An eager system can therefore be very storage efficient, but conversely it is opaque, in the sense that it is not possible to determine how or why it arrived at a particular solution, so problems of poor or inappropriate training data may be difficult to deal with.
The eager learning article linked above even gives artificial neural networks as an example. You might of course prefer a cited text to Wikipedia, but the page has existed with that assertion since 2007 without contradictory edits, so I'd say it's pretty robust.
Some neural networks are eager learners, and some are lazy. Feedforward neural networks (as are commonly trained by some variant of backpropagation) are eager: they attempt to derive a representation of the underlying relationships in the data at the time of training. Radial basis function networks (such as probabilistic NN or generalized regression NN), on the other hand, are lazy learners (very much like k-nearest neighbors, the classic lazy learner).
A neural network is generally considered an "eager" learning method.
"Eager" learning methods build a general model from the training data up front: the model parameters are fitted during training as the algorithm iteratively processes the training examples, and once training is done the model answers queries without consulting the training data again.
"Lazy" learning methods, on the other hand, also known as instance-based or memory-based learning, defer generalization until a new example is presented. The model does not fit a general function during training; instead it memorizes the training data and computes predictions from it directly at query time. Lazy methods typically need little or no training time, but they require more computation per prediction than eager methods and must keep the training data around.
In general, neural networks are considered eager learning methods because their parameters are fitted during the training process.
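The contrast is easy to see with scikit-learn (an illustrative sketch; the dataset is synthetic): the eager learner fits its parameters up front and could discard the data, while the lazy learner just stores the data and does its work at query time.

    from sklearn.datasets import make_classification
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.neural_network import MLPClassifier

    X, y = make_classification(n_samples=500, random_state=0)

    eager = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                          random_state=0).fit(X, y)       # generalizes now
    lazy = KNeighborsClassifier(n_neighbors=5).fit(X, y)  # just stores X, y

    print(eager.predict(X[:3]), lazy.predict(X[:3]))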
Here are a few literature references:
"Eager Learning vs. Lazy Learning" by R. S. Michalski, J. G. Carbonell, and T. M. Mitchell. This paper provides a comprehensive overview of the distinction between eager and lazy learning, and discusses the strengths and weaknesses of each approach. It was published in Machine Learning, 1983.
"An overview of instance-based learning algorithms" by A. K. Jain and R. C. Dubes. This book chapter provides an overview of the main concepts and techniques used in instance-based or lazy learning, and compares them to other types of learning algorithms, such as decision trees and neural networks. It was published in "Algorithms for Clustering Data" by Prentice-Hall, Inc. in 1988.
" Machine Learning" by Tom Mitchell. This book provides a comprehensive introduction to the field of machine learning, including the concepts of eager and lazy learning. It covers a wide range of topics, from supervised and unsupervised learning to deep learning and reinforcement learning. It was published by McGraw-Hill Education in 1997.
"Introduction to Machine Learning" by Alpaydin, E. This book provides an introduction to the field of machine learning, including the concepts of eager and lazy learning, as well as a broad range of machine learning algorithms. It was published by MIT press in 2010
It's also worth noting that this classification into lazy and eager learning is not always clear-cut and can be somewhat subjective; some algorithms can belong to both categories, depending on the specific implementation.

What Artificial Neural Network or 'Biological' Neural Network library/software do you use?

What do you use?
Fast Artificial Neural Network Library (FANN) is a free open source neural network library, which implements multilayer artificial neural networks in C with support for both fully connected and sparsely connected networks. Cross-platform execution in both fixed and floating point is supported. It includes a framework for easy handling of training data sets. It is easy to use, versatile, well documented, and fast. PHP, C++, .NET, Ada, Python, Delphi, Octave, Ruby, Prolog, Pure Data and Mathematica bindings are available.
FannTool, a graphical user interface for the library, is also available.
There are a lot of different network simulators, depending on how detailed you want your simulation to be and what kind of network you want to simulate.
NEURON and GENESIS are good if you want to simulate full biological networks (which I'm guessing you probably don't), even down to the behaviour of dendrites etc.
NEST, SPLIT, and some others are good for doing population simulations, where you create the population on a node-by-node basis and see what the whole population does. This is pretty much the 'industry'-standard approach and is used a lot in research and commercial applications, so they are worth looking into. I know that IBM uses SPLIT for some of its research.
MIIND is good if you want to use differential equations to model what a population would do, but this approach is relatively new and computationally expensive (if very cool).
Not sure if that is exactly what you wanted!
(N.B. if you google any of the names in caps along with the word "simulator" you will end up at the relevant web page =)
Whenever I've wanted to play around with any data mining algorithm quickly, I just load up Weka. It's pretty complex but it implements a lot of algorithms (including neural networks) with a lot of customizability. Plus, it has some visualizations for NNs.
It is old, but I have always used NeuroShell 2 when not using my own code. Unfortunately, it is not free. I think the newer NeuroShells are designed only for predicting stocks.
If you're looking to experiment with deep learning, you should look into
Theano
Pylearn2 (which is based on Theano)