How can I do inference for a trained neural network using C/C++?

For my master's thesis, I have to run inference with a pre-built, pre-trained (with TensorFlow) deep neural network model. I received it in two different formats (HDF5/.h5 and a frozen graph, .pb). The inference will eventually run on a cluster; so far we only have a GPU version running (with TensorRT and a UFF model). So my first task seems to be to get inference working on a single CPU before making it usable on the cluster.
We are using the model within computational fluid dynamics (CFD) simulations – that is also my academic background, so as you can imagine I have only a little knowledge of deep learning. In any case, my job is not to change or train the model, but simply to use it for inference. Our CFD code is written in C++, which is the only programming language I use at an advanced level (using C is obviously no problem either, but I know no Python).
After many Google searches I realized that I have no real idea how to start. I thought it would be possible to skip all the training and TensorFlow machinery. I know how neural networks work and how they compute their output values from their input values, and I have the essential theoretical background, but no programming experience in this field. Is it possible to take the model I was given (either the HDF5/.h5 file or the frozen graph) and build inference code using only C or C++? I have already found the C API and installed it in a Docker container (where I also have TensorFlow), but I am really not sure what the next step is. What can I do with the C API? How would you write C/C++ code to run inference with a DNN model that is ready for it?

OpenCV provides tools to run deep learning models, but they are limited to the computer vision field.
You can perform classification, object detection, face detection, text detection, segmentation, and so on using the API provided by OpenCV. The examples are fairly straightforward.
Both Python and C++ versions are available.

Related

NLP libraries for simple POS tagging

I'm a student who's working on a summer project in NLP. I'm fairly new to the field, so I apologize if there's a really obvious solution. The project is in C, both due to my familiarity with it, and the computationally intensive nature of the project (my corpus is a plaintext dump of wikipedia).
I'm working on an approach to relationship extraction, exploiting the consistency principle to try to learn (to within some error threshold) a set of rules dictating which clusters of grammar objects imply a connection between those objects.
One of the first steps in the algorithm involves finding the set of all possible grammar objects a given word can refer to (POS disambiguation is done implicitly by the algorithm at a later step). I've looked at several parsers, but they all seem to do the disambiguation step themselves, which (from my end) is counterproductive. I'm looking for something off the shelf that (ideally) gives me a one-command way to turn up this information.
Does such a thing exist? If not, is there an existing dictionary containing this information that's trivially machine-parseable?
Thank you for your help.
Look at CMU Sphinx, an open source NLP project. I think it's in C++, but you can integrate it or at least get an idea of how to go about things.
What about calling an external POS tagger as a shell script, or wrapping it in an HTTP service if you feel frisky?
Java and Python have the vast majority of NLP libraries, so it makes sense to take advantage of that. If you can use NLTK in a script to tag text, then call that script from C, it makes things much easier.

What is the best approach of creating a talking bot?

When creating an AI talking bot, what kind of design should I use? Should it be one function or multiple modules? Should it have classes?
Understanding language is complicated, so the goal you need to determine first is what aspect of language you want to understand.
An AI must be able to understand what the person says to it, then relate it to what it already knows, and then generate a legitimate response.
These three steps can all be thought of as nearly independent, so you need to address each on its own.
The brain, the world's best language processor, uses a Neural Network, but that's not likely to work well for you.
A logic-based proof-solving system, in which new facts are derived from existing facts, would probably work best, and I know of at least one system that uses this approach fairly effectively.
I'd start with an existing AI program (like the famous Eliza) and run its output through a speech synthesizer.
Some source code for Eliza is available here. One open source speech synthesizer is FreeTTS.
If you're using a language other than Java, there are similar candidate AI bots and text-to-speech libraries out there.
I've started to do some work in this space using this open source project called Talkify:
https://github.com/manthanhd/talkify
It is a bot framework intended to help orchestrate the flow of information between bot providers like Microsoft (Skype) and Facebook (Messenger) and your backend services. The framework doesn't really provide implementations for the bot providers yet, but it does provide hooks into its natural language recognition engine.
The built-in natural language recognition library can be used to classify sentences into topics, which you can then map to skill functions.
Give it a try! I'd really like people's input to see how it can be improved.

How to generate sequence diagram for my Native (C, C++) code?

I would like to know how to generate a sequence diagram for my native (C, C++) code. I wrote my C code using the vim editor.
Thanks,
Sen
First of all, a sequence diagram is an object-oriented concept. It is meant to convey, at a glance, the message passing between objects in an object-oriented program in sequential fashion, which helps the reader understand the time-ordered interaction between objects. As such, it does not make much sense to talk about sequence diagrams in the context of a procedural language like C.
When it comes to C++, sequence diagrams are defined in the general sense by the UML specification, which is the same for all object-oriented languages. UML is considered a higher-level concept than source code, one that looks the same regardless of language, and the process of converting source code to UML is called code reverse engineering. There are tools that can convert Java, C++, and other source code into UML diagrams showing the relationships between classes, such as Enterprise Architect, Visual Paradigm, and IBM Rational Software Architect.
A sequence diagram, however, is a special kind of UML diagram, and it turns out that reverse engineering a sequence diagram is quite challenging. If you wanted to generate one through static analysis, one of the first questions you would have to answer is whether, given two objects and a message passed between them, a result is ever returned. That means analyzing a method's algorithm and deciding whether it loops forever or eventually returns; this is the halting problem, which has been proven undecidable in computer science. So to produce a sequence diagram through static analysis, you would have to sacrifice accuracy.
Dynamic analysis works by actually running the code and mapping the interactions between objects at run time. This presents its own challenges: first, you would have to instrument the code; then, filtering the interactions you are interested in out of the library calls, system calls, and other noise in the trace would not be doable without user intervention.
This is not to say that creating a tool that would produce usable sequence diagrams is not possible, but the market interest has apparently not been strong enough to justify the effort, and apart from a few research papers on the subject, like CPP2XMI, I'm not aware of any commercially available tools to reverse engineer C++ into sequence diagrams.
Compounding the problem is the fact that C++ is one of the most complex object oriented languages around, so even if somebody devised a good way of reverse engineering sequence diagrams, C++ would be the last language to receive the treatment. Case in point: Visual Paradigm offers rudimentary support for reversing Java code into sequence diagrams, but not for C++.
Even if such a tool existed for C++, the sad truth is that if your C++ code is complex enough that you would rather use a tool to make a sequence diagram for it instead of doing it manually, then it is most likely too complex for the tool to give you anything useful and you would have to fix it up yourself anyways.
You can try CppDepend which provides the Dependency graph and the dependency matrix to explore the dependencies between directories, files and functions.
Have you tried PlantUML? It works really well with Doxygen. I use it at work with the company template, and the syntax is really easy, though you have to write the call sequence yourself. There are plenty of examples on its page. If you are working on Linux you can install it with your native package manager, and the same applies to Doxygen (e.g. sudo apt-get install plantuml); if you are on Windows, you can use the installers from the official pages instead.
You'll have to do some configuration, but it's pretty straightforward. I'll leave you the links to each tool.
Download pages:
http://plantuml.com/download
http://www.doxygen.nl/download.html
Plantuml examples:
http://plantuml.com/sequence-diagram
You can find the documentation on each page. For PlantUML you use the Java executable (.jar), so you don't have to install anything; you just need to configure Doxygen to find the executable. The Doxygen documentation explains how:
http://www.doxygen.nl/manual/index.html
If you want to configure it without reading the documentation you could also watch this video:
https://www.youtube.com/watch?v=LZ5E4vEhsKs
I hope this helps, cheers.
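For reference, a hand-written PlantUML sequence diagram is only a few lines; the participants and messages below are illustrative placeholders for your own code:

```plantuml
@startuml
participant Main
participant Solver
Main -> Solver : solve(mesh)
Solver --> Main : result
@enduml
```

Placed inside a Doxygen comment between \startuml and \enduml markers, this renders directly into the generated documentation.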
You could explore trace2uml, which works with Doxygen.

What programming language is used to IMPLEMENT Google's algorithm?

It is known that Google has the best search and indexing algorithms.
They also have good relevancy, and they are quick to surface the latest results.
All that's fine.
What programming languages (C, C++, Java, etc.) and databases (Oracle, MySQL, etc.) have they used to achieve this, given that they have to manipulate large volumes of data quickly and effectively?
Though I'm not looking for their in-depth architecture (in case that violates their company policies), an overview of such things would be useful.
Anybody, please add your valuable suggestions and insight on this.
Google internally uses C++, Java and Python. See Rhino on Rails:
One of the (hundreds of) cool things about working for Google is that they let teams experiment, as long as it's done within certain broad and well-defined boundaries. One of the fences in this big playground is your choice of programming language. You have to play inside the fence defined by C++, Java, Python, and JavaScript.
Google's search algorithm is essentially MapReduce, which stems from functional programming techniques, implemented in C++.
Google has its own storage mechanism for this called the Google File System.
Mainly pigeons:
PigeonRank's success relies primarily on the superior trainability of the domestic pigeon (Columba livia) and its unique capacity to recognize objects regardless of spatial orientation. The common gray pigeon can easily distinguish among items displaying only the minutest differences, an ability that enables it to select relevant web sites from among thousands of similar pages.
Relevance of search results is governed by quality of information retrieval algorithms they use, not the programming language.
But C++ is what most of their backend code is written in (for most services).
They don't use any off-the-shelf RDBMS products for data storage. All of that is written in-house.
Check out Bigtable.

What Artificial Neural Network or 'Biological' Neural Network library/software do you use?

What do you use?
Fast Artificial Neural Network Library (FANN) is a free open source neural network library that implements multilayer artificial neural networks in C, with support for both fully connected and sparsely connected networks. Cross-platform execution in both fixed and floating point is supported. It includes a framework for easy handling of training data sets. It is easy to use, versatile, well documented, and fast. PHP, C++, .NET, Ada, Python, Delphi, Octave, Ruby, Prolog, Pure Data, and Mathematica bindings are available.
FannTool, a graphical user interface for the library, is also available.
There are a lot of different network simulators, depending on how detailed you want your simulation to be and what kind of network you want to simulate.
NEURON and GENESIS are good if you want to simulate full biological networks (which I'm guessing you probably don't), even down to the behaviour of dendrites, etc.
NEST and SPLIT and some others are good for population simulations, where you create the population on a node-by-node basis and see what the whole population does. This is pretty much the 'industry'-standard approach, used a lot in research and commercial applications, so they are worth looking into. I know that IBM uses SPLIT for some of its research.
MIIND is good if you want to use differential equations to model what a population would do, but this approach is relatively new and computationally expensive (if very cool).
Not sure if that is exactly what you wanted!
(N.B. if you google any of the names in caps along with the word "simulator" you will end up at the relevant web page =)
Whenever I've wanted to play around with any data mining algorithm quickly, I just load up Weka. It's pretty complex but it implements a lot of algorithms (including neural networks) with a lot of customizability. Plus, it has some visualizations for NNs.
It is old, but I have always used NeuroShell 2 when not using my own code. Unfortunately, it is not free. I think the newer NeuroShells are designed only for predicting stocks.
If you're looking to experiment with deep learning, you should look into
Theano
Pylearn2 (which is based on Theano)
