Are there open source expert systems with reasoning capabilities? - artificial-intelligence

For learning purposes I'd like to study an open source expert system, in particular one that can reason and explain its reasoning. Which ones do you know?

Some open source expert systems and expert system tools (tools you can use to write expert systems) include:
CLIPS (C Language Integrated Production System): an environment for building rule-based or object-based expert systems.
Pyke (Python Knowledge Engine): lets you use logic programming to build expert systems in Python.
OpenExpert: a PHP expert system tool focused mainly on legal expert systems.
d3web: a Java knowledge base system that uses XML.
jColibri: a reference platform for case-based reasoning programs in Java.
DTRules: a decision-table-based rules engine in Java.
Drools: a well-supported Java-based rule-processing engine.
Euler: an inference engine supporting logic-based proofs.
Infosapient: a Java business rules engine.
Jena: a Java framework that includes a rule-based inference engine, an ontology API and a query engine.
JEOPS: adds forward-chaining, first-order production rules to Java in order to facilitate expert system development using declarative programming.
JLisa: a CLIPS-like rule engine with a Common Lisp interface in Java.
mandarax: a derivation rule compiler for Java.
ofBiz: a Java-based business rules engine.
OpenCyc: the open source version of the Cyc technology, the world's largest and most complete general knowledge base and commonsense reasoning engine.
DEX: an interactive computer program for developing qualitative multi-attribute decision models and evaluating options.
Additional relevant resources can be found in the list here.
As for which expert system to look at for learning purposes, I would recommend OpenCyc. There is a very interesting Google Tech Talk, Computers vs Common Sense, about the Cyc technology.
Without additional information and clarification it is difficult to make further recommendations.
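To make "reason and explain its reasoning" concrete before diving into one of the systems above, here is a minimal sketch of forward chaining with an explanation trace. It is not taken from any of the listed tools; the `Rule` class, the rule names and the facts are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    conditions: frozenset  # facts that must all hold
    conclusion: str        # fact derived when they do


def forward_chain(facts, rules):
    """Derive new facts until nothing changes; remember which rule produced each."""
    known = set(facts)
    why = {}  # derived fact -> rule that produced it
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.conclusion not in known and rule.conditions <= known:
                known.add(rule.conclusion)
                why[rule.conclusion] = rule
                changed = True
    return known, why


def explain(fact, why, depth=0):
    """Return the derivation of a fact as indented lines."""
    pad = "  " * depth
    rule = why.get(fact)
    if rule is None:
        # Not derived by any rule, so it is treated as an input fact.
        return [f"{pad}{fact} (given)"]
    lines = [f"{pad}{fact} (by rule {rule.name})"]
    for cond in sorted(rule.conditions):
        lines.extend(explain(cond, why, depth + 1))
    return lines


# Hypothetical toy knowledge base.
RULES = [
    Rule("R1", frozenset({"has_feathers"}), "is_bird"),
    Rule("R2", frozenset({"is_bird", "cannot_fly", "swims"}), "is_penguin"),
]

known, why = forward_chain({"has_feathers", "cannot_fly", "swims"}, RULES)
print("\n".join(explain("is_penguin", why)))
```

Production rule engines such as CLIPS or Drools use far more efficient matching than this repeated scan over all rules, but the idea of recording which rule fired for each conclusion is what makes an explanation facility possible.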

Related

Find topics in Machine learning in gcloud

I'm new to machine learning. I see some AI-related services on Google Cloud Platform that I think are easy to use.
Here is what I need: I have around 20K paragraphs (3 or 4 lines each), and I need to find the paragraph that best matches a user's question. The user can ask any question or type any sentence, and I need to find the paragraph most similar to it. How can I do that? Which services do I need? I want to use Google Cloud Platform. Is this possible in gcloud, and if so, how?
I think you can start by looking through the GCP AI & ML product list. To narrow down your request and find the best match for your use case, I would advise looking into the GCP AutoML products, which offer a variety of complete solutions for generic machine learning models, such as the AutoML Natural Language model, which is designed specifically for document and text analysis tasks.
I would encourage you to start with the AutoML Natural Language beginner's guide to get more context, and to have a look at features and capabilities such as the classification, entity extraction and sentiment analysis training approaches.
From the developer's perspective, Cloud AutoML Natural Language supports client libraries for the most widely used programming languages and offers good REST API documentation.
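Before committing to a managed service, it can help to see what "most similar paragraph" means in the simplest case. The sketch below is not AutoML and uses no Google Cloud API; it is a local baseline that ranks paragraphs by TF-IDF cosine similarity using only the standard library. The `ParagraphIndex` class and the sample paragraphs are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ParagraphIndex:
    def __init__(self, paragraphs):
        self.paragraphs = list(paragraphs)
        token_lists = [tokenize(p) for p in self.paragraphs]
        n = len(self.paragraphs)
        # Document frequency: in how many paragraphs each term appears.
        df = Counter(term for tokens in token_lists for term in set(tokens))
        # Smoothed inverse document frequency: rarer terms weigh more.
        self.idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
        self.vectors = [self._vectorize(tokens) for tokens in token_lists]

    def _vectorize(self, tokens):
        """Unit-length TF-IDF vector as a dict; unknown terms are ignored."""
        counts = Counter(t for t in tokens if t in self.idf)
        vec = {t: c * self.idf[t] for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        return {t: w / norm for t, w in vec.items()} if norm else {}

    def best_match(self, query):
        """Return (paragraph, cosine similarity) for the closest paragraph."""
        q = self._vectorize(tokenize(query))
        scores = [sum(w * vec.get(t, 0.0) for t, w in q.items()) for vec in self.vectors]
        best = max(range(len(scores)), key=scores.__getitem__)
        return self.paragraphs[best], scores[best]


# Hypothetical sample data.
index = ParagraphIndex([
    "Our store is open from 9am to 6pm on weekdays.",
    "Refunds are issued within 14 days of returning the item.",
    "Shipping to Europe usually takes five business days.",
])
paragraph, score = index.best_match("How long does shipping take?")
print(paragraph, round(score, 3))
```

This only matches shared words, so a question phrased with different vocabulary will score poorly. That limitation is the main reason to look at embedding-based or managed NLP services for the real 20K-paragraph problem.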

Why would I need to add a scripting engine to my standalone applications?

I would like to know in which scenarios embedding a scripting language in my C projects would help me.
I have heard about Lua, which developers embed in their projects to extend their software applications. But why do developers prefer to extend their applications using a scripting engine rather than the primary language?
This mostly comes down to who ends up using your application and what they are using it for. If the application needs no per-user customization, then there's no need for scripting. New functionality can be added normally as part of the application.
However, as in game engines, if the users need to create or script custom behaviour in the application, there has to be some way for them to do so. You could try to make your users write their scripts in the language of the application. However, in the case of C and many other languages, this requires the application to be recompiled (not to mention that your users might not be programmers and could benefit from a higher-level scripting language).
By adding a scripting engine, you allow your users to add their own (limited) functionality to the application without needing to understand or recompile the entire codebase.
tl;dr Scripting engines make sense if your users routinely need to add custom behaviours to the application.
Regarding your "scenario" question: adding a script engine to your standalone (C or C++, for instance) application is the most direct way to combine the performance of a dedicated engine with the expertise of your power users.
Whatever the field of your standalone application, the ability to write scripts usually lets a power user get the most out of it by creating dedicated or project-centric workflows.
The script interface provides a safe and secure environment suited to users whose primary skill is generally not C/C++.
This leads to your second question: a script API (using Lua or Squirrel, for instance) makes perfect sense if the primary language requires low-level programming skills. Typically, an application written in C++ would otherwise require your power users to write plugins using a C++ SDK.
Conversely, if your standalone application is written in Python, the benefit of embedding Lua is far from obvious in my opinion.
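The pattern described in the answers above (a host application that hands user scripts a small API and loads them at run time, with no rebuild) can be sketched in a few lines. The sketch uses Python for both host and script purely for brevity; in a C application the same roles would be played by the C code and an embedded interpreter such as Lua. The `Host` class, its `say` and `on` functions and the event name are all hypothetical.

```python
class Host:
    """Stand-in for the compiled application (hypothetical)."""

    def __init__(self):
        self.log = []
        self.handlers = {}

    # The only two functions handed to scripts: this is the "script API".
    def say(self, message):
        self.log.append(str(message))

    def on(self, event, handler):
        self.handlers.setdefault(event, []).append(handler)

    def load_script(self, source):
        # NOTE: restricting globals like this is NOT a security sandbox in Python.
        # It only illustrates exposing a limited API to user-written code.
        env = {"__builtins__": {}, "say": self.say, "on": self.on}
        exec(source, env)

    def fire(self, event, *args):
        """Called by the host when something happens; runs the user's handlers."""
        for handler in self.handlers.get(event, []):
            handler(*args)


# A user-written script, loaded as text at run time.
USER_SCRIPT = """
def greet(player):
    say("Welcome, " + player + "!")

on("player_joined", greet)
"""

host = Host()
host.load_script(USER_SCRIPT)
host.fire("player_joined", "Ada")
print(host.log)
```

The user changes behaviour by editing the script text only. The host code, which in the C case would be the part needing a compiler, stays untouched.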

What factors to consider when choosing a Multi-model DBMS? (OrientDB vs ArangoDB)

I am looking to dip my toes into the world of multi-model DBMSs. I have no particular use case; I just want to start learning.
I find that there are two prominent ones, OrientDB and ArangoDB, but I was unable to find any meaningful, unopinionated comparison between them. Can someone shed some light on the differences in features between the two, and any caveats in using one over the other? If I learn one, would I be able to transition easily to the other?
(I tagged FoundationDB as well, but it is proprietary and I probably won't consider it)
This question asks for a general comparison between OrientDB vs ArangoDB for someone looking to learn about Multi-model DBMS, and not an opinionated answer about which is better.
Disclaimer: I would no longer recommend OrientDB, see my comments below.
I can provide a slightly less biased opinion, having used both ArangoDB and OrientDB. It's still biased as I'm the author of OrientDB's node.js driver - oriento but I don't have a vested interest in either company or product, I've just necessarily used OrientDB more.
ArangoDB and OrientDB are both targeting a similar market and have a lot of similarities:
Both are multi-model, you can use them to store documents, graphs and simple key / values.
Both have support for Gremlin, but it's firmly a second class citizen compared to their own preferred query languages.
Both support server-side "stored procedures" in JavaScript. In both systems this comes via a slightly less than idiomatic JavaScript API, although ArangoDB's is a lot better. This is getting fixed in a forthcoming version of OrientDB.
Both offer REST APIs, both aim to be usable as an "API Server" via JavaScript request handlers. This is a lot more practical in ArangoDB than OrientDB.
Both are distributed under a permissive license.
Both are ACID and have transaction support, but in both the transactions are server-side operations - they're more like atomic batches of commands rather than the kinds of transactions you might be used to in a traditional RDBMS.
However, there are a lot of differences:
ArangoDB has no concept of "links", which are a very useful feature in OrientDB. They allow unidirectional relationships (just like a hyperlink on the web), without the overhead of edges.
ArangoDB is written in C++ (and JavaScript), whereas OrientDB is written in Java. Both have their advantages:
Being written in C++ means ArangoDB uses V8, the same high performance JavaScript engine that powers node.js and Google Chrome, whereas being written in Java means OrientDB uses Nashorn, which is still fast but not the fastest. This means that ArangoDB can offer a greater level of compatibility with the node.js ecosystem compared to OrientDB.
Being written in Java means that OrientDB runs on more platforms, including e.g. Raspberry PI. It also means that OrientDB can leverage a lot of other technologies written in Java, e.g. OrientDB has superb full text / geospatial search support via Lucene, which is not available to ArangoDB.
OrientDB uses a dialect of SQL as its query language, whereas ArangoDB uses its own custom language called AQL. In theory, AQL is better because it's designed explicitly for the problem. In practice, though, it feels quite similar to SQL but with different keywords, and it is yet another language to learn, while OrientDB's implementation feels a lot more comfortable if you're used to SQL. SQL is declarative whereas AQL is imperative - YMMV here.
ArangoDB is a "mostly-memory" database, it works best when most of your data fits in RAM. This may or may not be suitable for your needs. OrientDB doesn't have this restriction (but also loves RAM).
OrientDB is fully object oriented - it supports classes with properties and inheritance. This is exceptionally useful because it means that your database structure can map 1-1 to your application structure, with no need for ugly hacks like ActiveRecord. ArangoDB supports something fairly similar via models in Foxx, but it's more like an optional addon rather than a core part of how the database works.
ArangoDB offers a lot of flexibility via Foxx, but it has not been designed by people with strong server-side JS backgrounds and reinvents the wheel a lot of the time. Rather than leveraging frameworks like express for their request handling, they created their own clone of Sinatra, which of course makes it almost the same as express (express is also a Sinatra clone), but subtly different, and means that none of express's middleware or plugins can be reused. Similarly, they embed V8, but not libuv, which means they do not offer the same non blocking APIs as node.js and therefore users cannot be sure about whether a given npm module will work there. This means that non trivial applications cannot use ArangoDB as a replacement for the backend, which negates a lot of the potential usefulness of Foxx.
OrientDB supports first class property level and database level indices. You can query and insert into specific indexes directly for maximum efficiency. I've not seen support for this in ArangoDB.
OrientDB is the more established option, with many high profile users. ArangoDB is newer, less well known, but growing fast.
ArangoDB's documentation is excellent, and they offer official drivers for many different programming languages. OrientDB's documentation is not quite as good, and while there are drivers for most platforms, they're community powered and therefore not always kept up to date with bleeding edge OrientDB features.
If you're using Java (or a Java bridge), you can embed OrientDB directly within your application, as a library. This use case is not possible in ArangoDB.
OrientDB has the concept of users and roles, as well as Record Level Security. This may be a killer feature for you, it is for me. It also supports token based authentication, so it's possible to use OrientDB as your primary means of authorizing/authenticating users. OrientDB also has LDAP integration. In contrast, ArangoDB supports only a very simple auth option.
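The distinction drawn above between "links" and edges is easier to see with data. The sketch below models both with plain Python dictionaries. It does not use either database's API or storage format, and the record IDs and field names are hypothetical.

```python
# Records keyed by ID (hypothetical data).
records = {
    "person/1": {"name": "Alice", "employer": "company/1"},  # "employer" is a link
    "person/2": {"name": "Bob"},
    "company/1": {"name": "Acme"},
}

# Edges are records of their own: they name both endpoints and can carry properties.
edges = [
    {"from": "person/1", "to": "person/2", "label": "knows", "since": 2015},
]


def follow_link(record_id, field):
    """One-way hop: read the target ID stored inside the source record."""
    return records[records[record_id][field]]


def neighbours(record_id, label):
    """Edges can be traversed from either endpoint."""
    outgoing = [e["to"] for e in edges if e["label"] == label and e["from"] == record_id]
    incoming = [e["from"] for e in edges if e["label"] == label and e["to"] == record_id]
    return outgoing + incoming


print(follow_link("person/1", "employer")["name"])
print(neighbours("person/2", "knows"))
```

The link costs nothing beyond one field on the source record, but the company record has no way back to Alice. The edge is a separate record, which is the overhead the answer refers to, and in exchange it can be followed from Bob's side as well.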
Both systems have their own advantages, so choosing between them comes down to your own situation:
If you're building a small application, and you're a web developer optimizing for developer productivity, it will probably be easier to get up and running quickly with ArangoDB.
If you're building a larger application, which could potentially store many gigabytes or terabytes of data, or have many thousands of concurrent users, or have "enterprise" use cases, or need fine grained security controls, OrientDB is the one for you.
If you're storing RDF or similarly structured linked data, choose OrientDB.
If you're using Java, just choose OrientDB.
Note: This is (my opinion of) the state of play today, things change quickly and I would not underestimate the ruthless efficiency of the awesome team behind ArangoDB, I just think that it's not quite there yet :)
Charles Pick (codemix.com)

Development Platforms for Financial modeling (What do the Quants use?)

Quantitative Analysts or "Quants" predict the behavior of markets to maximize profits. I am interested in the software that they use to accomplish this. Are there development platforms, libraries, languages or Data Mining suites specifically tailored to Financial Modeling?
Statistical Modeling:
First, there are statistical computing languages like R which is powerful and open-source, with lots of packages for analysis and plotting.
You will find some R packages that relate to finance:
http://www.quantmod.com/
https://www.rmetrics.org/
https://www.rmetrics.org/ebooks-tseries
Machine Learning and AI to train the system on past data:
Weka Data Mining: http://www.cs.waikato.ac.nz/ml/weka/
libsvm (data classifiers): http://www.csie.ntu.edu.tw/~cjlin/libsvm/
"Artificial Intelligence: A Modern Approach" book (code: http://aima.cs.berkeley.edu/code.html)
Backtesting the trading system on past data:
More often than not, broker trading platforms will provide facilities for trading automation, in the form of scripts and languages with which you can program the logic of the trading "strategy" (some use common languages like Java, some use proprietary ones). They will also provide some minimal support for testing the strategy on past data and getting a detailed report on the trades taken and their outcomes.
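As a rough idea of what such a backtest does, here is a minimal sketch that replays a price series through a moving-average crossover rule and reports the resulting trades. It is not any broker's scripting API; the strategy, the function names and the price series are hypothetical, and it ignores costs, slippage and position sizing.

```python
def moving_average(prices, window, end):
    """Mean of the `window` prices ending at index `end` (inclusive)."""
    return sum(prices[end - window + 1 : end + 1]) / window


def backtest(prices, fast=2, slow=3):
    """Long-only crossover: hold one unit while the fast MA is above the slow MA.

    Simplification: trades are filled at the same bar's price that produced the signal.
    """
    trades = []
    entry = None  # (index, price) of the open position, if any
    for i in range(slow - 1, len(prices)):
        bullish = moving_average(prices, fast, i) > moving_average(prices, slow, i)
        if bullish and entry is None:
            entry = (i, prices[i])
        elif not bullish and entry is not None:
            trades.append({"entry_index": entry[0], "exit_index": i,
                           "pnl": prices[i] - entry[1]})
            entry = None
    if entry is not None:
        # Close any position still open at the last available price.
        last = len(prices) - 1
        trades.append({"entry_index": entry[0], "exit_index": last,
                       "pnl": prices[last] - entry[1]})
    return trades


# Hypothetical past prices.
PRICES = [10, 11, 12, 13, 12, 11, 10, 11, 12]
report = backtest(PRICES)
for trade in report:
    print(trade)
print("total pnl:", sum(t["pnl"] for t in report))
```

A real backtest would also have to avoid look-ahead bias (acting on a bar's close before it is known), which is why the same-bar fill above is called out as a simplification.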
Connection to broker and System Testing:
Either you use some broker-proprietary trading API, or you go with the more standardized FIX.
Building a FIX server that plays back quotation ticks to your trading system (which in this case will be a FIX client) is also a very good way to validate the system. Most reputable ECNs will provide FIX access, so this is more portable than any other interface.
QuickFIX/J is a full featured messaging engine for the FIX protocol. It is a 100% Java open source implementation of the popular C++ QuickFIX engine.
http://www.quickfixj.org/
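To give a feel for what travels over a FIX connection, here is a minimal sketch of the tag=value wire format, including the BodyLength (tag 9) and CheckSum (tag 10) fields. It does not use QuickFIX or QuickFIX/J; the helper names are hypothetical, FIX 4.4 is assumed as the version, and the CompIDs and timestamp are made-up values.

```python
SOH = "\x01"  # FIX field delimiter


def build_fix_message(msg_type, fields, begin_string="FIX.4.4"):
    """Assemble a tag=value FIX message with BodyLength (9) and CheckSum (10)."""
    # Body: everything after the BodyLength field up to the CheckSum field.
    body = f"35={msg_type}{SOH}" + "".join(f"{tag}={value}{SOH}" for tag, value in fields)
    header = f"8={begin_string}{SOH}9={len(body.encode('ascii'))}{SOH}"
    # CheckSum: sum of all preceding bytes modulo 256, written as three digits.
    checksum = sum((header + body).encode("ascii")) % 256
    return f"{header}{body}10={checksum:03d}{SOH}"


def checksum_ok(message):
    """Recompute CheckSum over everything before the trailing 10= field."""
    head, sep, tail = message.rpartition(f"{SOH}10=")
    expected = sum((head + SOH).encode("ascii")) % 256
    return bool(sep) and tail == f"{expected:03d}{SOH}"


# A Heartbeat (MsgType 0) with hypothetical session values.
heartbeat = build_fix_message("0", [
    (49, "CLIENT"),             # SenderCompID
    (56, "SERVER"),             # TargetCompID
    (34, 1),                    # MsgSeqNum
    (52, "20240101-00:00:00"),  # SendingTime
])
print(heartbeat.replace(SOH, "|"))
print(checksum_ok(heartbeat))
```

A tick-playback server of the kind described above would send a stream of such messages (market data rather than heartbeats) and also handle logon, sequence numbers and resend requests, which is the part an engine like QuickFIX/J takes care of.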
There aren't any full-blown platforms or applications per se, since pretty much all software in this field is developed in-house, usually behind the firewall (for competitive advantage in a fiercely competitive industry).
A well-known library that includes a lot of algorithms and pricing models, and makes a suitable starting point for a framework or app, is QuantLib.
The Strata project from OpenGamma provides a comprehensive open source Java library for market risk, including all the basic elements a quant would need to manage things like holidays, trades, valuation and risk measures. Disclaimer, I am an author.

Which programming language is Google App Engine most likely to support next, and why?

Their roadmap says their next release will be in March 2009, and that they'll be adding a new 'runtime language'. I'm hoping it's either Java or PHP but I'm really not sure, and I would like to know which language is the most probable so I can plan accordingly for a project I plan on hosting with Google App Engine.
Any ideas?
I'd say Java, if only for the reason Android (or, at least, the SDK) is written in Java and they went to the trouble of writing their own interpreter/VM.
If not Java, then Ruby would be my guess. Not sure why, but it feels like a good fit.
I would say that you have to look at a few factors:
The language needs to:
be sandboxable
be controllable
be expandable
be different from Python
appeal to people who want to write massively scalable applications
be easy to run on developer computers
run on Linux
Sandboxable
The language must be safe to run on Google servers. Portions of the language/VM/modules|libraries must be able to be disabled and/or replaced.
Controllable
Notice how Google uses languages that are not controlled by companies?
Python's BDFL GvR works for Google.
Dunno about Javascript.
Java is open-sourced enough for their taste I suppose.
So the language evolution must allow Google's input at the very least.
Expandable
Google needs to be able to add stuff to the language, and that nearly implies an open-source language. I don't think they are interested in doing an internal fork of an existing language.
Different from Python
Python is mature, easy to learn, and powerful. The new language would have to differ significantly from Python; otherwise, why not just use Python? Maybe a very functional language?
Appeal to massive scalability
Execution time would not be necessarily critical, but the language must be able to support easy start and stop, easy provisioning to other servers, and appeal to the sort of people who are into writing massively scalable applications.
Developer computers
The language needs to be able to be easy to install, maintain, and develop for on Windows, Mac, and Linux. It has to be either fully manageable with text editors or already have rock solid tools for editing and managing on these platforms.
Linux
Google servers would run the programs, so these must be safely transferable to Google servers and runnable there, and must be controllable by the Google App Engine load-balancer, so they need to be unixy.
Brainstorming
I don't think it will be Java (too heavy, hard to modify VM), PHP (too leaky), Ruby (hard to modify VM), or C++ (can't be sandboxed, as far as I know). I don't think it would be JavaScript either, because it's hard to modularize, and it's not an easy language to learn. That rules out Lisp as well: the hard-to-learn part.
So something else.
Remember though that they want adoption of the tool, and they need a language that would be adoptable by a lot of people and a lot of businesses.
So I lean towards C# with Mono. I think that makes the most sense. I know it sounds scary, but lately the developers of the language have been looking at changing C# quite a bit, to incorporate Python-like dynamic typing, that sort of thing.
Conclusion
So that's what I think. And if they can pull that off, they will be able to leapfrog the competition. Mono is under MIT X11 license (as of April 2008), and I guess Miguel de Icaza can be hired by Google in the future, along with key team members.
So my prediction is C#.
Languages used for production code inside Google are limited to C++, Java, Python, and JavaScript.
App Engine already runs Python, so what's next?
It's most likely JavaScript. I recall Steve Yegge working on a Rails equivalent for JavaScript. See Stevey's Blog Rants: Rhino on Rails.
Java is less likely, but possible. Java servlet containers tend to be heavy-weight.
C++ is possible (Native Client and Chrome are two examples of sandboxed C++ code), but unlikely at this point.
I would say Java too, so they can support Ruby with JRuby, compatible with Python with Jython, Groovy and so on.
My guess is C# just to stick it to Microsoft.
Yup, JavaScript.
Why?
First, it fits. While there are obvious architectural differences (notably the OOP system) between Python and JavaScript, they are closer than they are far apart, so converting the GAE Python API to a JS API should not be a dramatic leap in design or implementation. In the end, the JS API will likely have much the same flavor as the Python API.
Second, safety. The JS runtime idiom is identical to the Python idiom in that effectively you're going to have JS processes running independently from each other for each request. That is, the classic Apache forking model.
As a hosting service, this model is extremely robust and much, much easier to control than something like Java. What you lose in efficiency compared with a threaded implementation, you gain by simply being Google with a gazillion machines. At Google's scale, administrative overhead trumps performance every day of the week. Simpler and more robust is better, and that's what the process model is.
Third, technology speed. JS is moving VERY quickly right now. Look at the large number of commercial enterprises writing JS interpreters, compilers and runtimes, as well as the advancements of the language itself. JS has rushed to the front with a vengeance.
Finally, popularity.
While not popular on the server side, JS is still likely the most deployed language in the world, and thereby the most accessible language in the world. Every hack web designer on the planet is becoming a JS programmer, whether they like it or not.
Now, I don't know how many web designers you've met, but most of the ones I have met are NOT programmers. So, adopting JS for them is going to be a cut and paste and painful experience for them, but it's pretty much a requirement for the modern web. Taking that skill to push back and do some lightweight processing on the back end, in the SAME LANGUAGE, will be a boon to these people. Do not discount the power of familiarity in a normally scary environment (and despite the advances, computers are still "scary" to the vast majority of the population).
JS, it's not a toy any more, it's a sleeping giant. Really.
JRuby on Rails.
It already works with Python. There have been rumors about PHP, which is a logical choice considering its popularity.
I'm going to throw in my 2 cents on Java as well. They have a heavy number of tools already written in Java (GWT anyone? etc. etc.)
Though, Javascript would be most intriguing.
I once heard that Google likes Python the most!
