What factors to consider when choosing a Multi-model DBMS? (OrientDB vs ArangoDB) - database

I am looking to dip my hands into the world of Multi-Model DBMS, I have no particular use cases, just want to start learning.
I find that there are two prominent ones - OrientDB vs ArangoDB, but was unable to find any meaningful comparison, unopinionated between them. Can someone shed some light on the difference in features between the two, and any caveats in using one over the other? If I learn one would I be able to easily transition to the other?
(I tagged FoundationDB as well, but it is proprietary and I probably won't consider it)
This question asks for a general comparison between OrientDB vs ArangoDB for someone looking to learn about Multi-model DBMS, and not an opinionated answer about which is better.

Disclaimer: I would no longer recommend OrientDB, see my comments below.
I can provide a slightly less biased opinion, having used both ArangoDB and OrientDB. It's still biased as I'm the author of OrientDB's node.js driver - oriento but I don't have a vested interest in either company or product, I've just necessarily used OrientDB more.
ArangoDB and OrientDB are both targeting a similar market and have a lot of similarities:
Both are multi-model, you can use them to store documents, graphs and simple key / values.
Both have support for Gremlin, but it's firmly a second class citizen compared to their own preferred query languages.
Both support server-side "stored procedures" in JavaScript. In both systems this comes via a slightly less than idiomatic JavaScript API, although ArangoDB's is a lot better. This is getting fixed in a forthcoming version of OrientDB.
Both offer REST APIs, both aim to be usable as an "API Server" via JavaScript request handlers. This is a lot more practical in ArangoDB than OrientDB.
Both are distributed under a permissive license.
Both are ACID and have transaction support, but in both the transactions are server-side operations - they're more like atomic batches of commands rather than the kinds of transactions you might be used to in a traditional RDBMS.
However, there are a lot of differences:
ArangoDB has no concept of "links", which are a very useful feature in OrientDB. They allow unidirectional relationships (just like a hyperlink on the web), without the overhead of edges.
ArangoDB is written in C++ (and JavaScript), whereas OrientDB is written in Java. Both have their advantages:
Being written in C++ means ArangoDB uses V8, the same high performance JavaScript engine that powers node.js and Google Chrome. Whereas being written in Java means OrientDB uses Nashorn, which is still fast but not the fastest. This means that ArangoDB can offer a greater level of compatibility with the node.js ecosystem compared to OrientDB.
Being written in Java means that OrientDB runs on more platforms, including e.g. Raspberry PI. It also means that OrientDB can leverage a lot of other technologies written in Java, e.g. OrientDB has superb full text / geospatial search support via Lucene, which is not available to ArangoDB.
OrientDB uses a dialect of SQL as its query language, whereas ArangoDB uses its own custom language called AQL. In theory, AQL is better because it's designed explicitly for the problem, in practise though it feels quite similar to SQL but with different keywords, and is yet another language to learn while OrientDB's implementation feels a lot more comfortable if you're used to SQL. SQL is declarative whereas AQL is imperative - YMMV here.
ArangoDB is a "mostly-memory" database, it works best when most of your data fits in RAM. This may or may not be suitable for your needs. OrientDB doesn't have this restriction (but also loves RAM).
OrientDB is fully object oriented - it supports classes with properties and inheritance. This is exceptionally useful because it means that your database structure can map 1-1 to your application structure, with no need for ugly hacks like ActiveRecord. ArangoDB supports something fairly similar via models in Foxx, but it's more like an optional addon rather than a core part of how the database works.
ArangoDB offers a lot of flexibility via Foxx, but it has not been designed by people with strong server-side JS backgrounds and reinvents the wheel a lot of the time. Rather than leveraging frameworks like express for their request handling, they created their own clone of Sinatra, which of course makes it almost the same as express (express is also a Sinatra clone), but subtly different, and means that none of express's middleware or plugins can be reused. Similarly, they embed V8, but not libuv, which means they do not offer the same non blocking APIs as node.js and therefore users cannot be sure about whether a given npm module will work there. This means that non trivial applications cannot use ArangoDB as a replacement for the backend, which negates a lot of the potential usefulness of Foxx.
OrientDB supports first class property level and database level indices. You can query and insert into specific indexes directly for maximum efficiency. I've not seen support for this in ArangoDB.
OrientDB is the more established option, with many high profile users. ArangoDB is newer, less well known, but growing fast.
ArangoDB's documentation is excellent, and they offer official drivers for many different programming languages. OrientDB's documentation is not quite as good, and while there are drivers for most platforms, they're community powered and therefore not always kept up to date with bleeding edge OrientDB features.
If you're using Java (or a Java bridge), you can embed OrientDB directly within your application, as a library. This use case is not possible in ArangoDB.
OrientDB has the concept of users and roles, as well as Record Level Security. This may be a killer feature for you, it is for me. It also supports token based authentication, so it's possible to use OrientDB as your primary means of authorizing/authenticating users. OrientDB also has LDAP integration. In contrast, ArangoDB support only a very simple auth option.
Both systems have their own advantages, so choosing between them comes down to your own situation:
If you're building a small application, and you're a web developer optimizing for developer productivity, it will probably be easier to get up and running quickly with ArangoDB.
If you're building a larger application, which could potentially store many gigabytes or terabytes of data, or have many thousands of concurrent users, or have "enterprise" use cases, or need fine grained security controls, OrientDB is the one for you.
If you're storing RDF or similarly structured linked data, choose OrientDB.
If you're using Java, just choose OrientDB.
Note: This is (my opinion of) the state of play today, things change quickly and I would not underestimate the ruthless efficiency of the awesome team behind ArangoDB, I just think that it's not quite there yet :)
Charles Pick (codemix.com)

Related

What is a good lightweight ORM for my need using Kotlin?

Scenario :
I am having an application where I am using AWS Lambdas which are written in Kotlin to query data from a relational DB residing in AWS.
--
My problem is that I want to use an ORM for firing these queries. I dont want to use hibernate as it is too heavy and takes too long to setup, and I need a solution that would take up the least time in setting up and firing from the Lambdas. I have looked upon multiple ORMSs like Exposed, Requery, Jooq, Ktorm and Squash.
Is anybody out there having experience with any of these libraries in the serverless context? What are your experiences with them and what would you suggest using in my scenario?
You can have a look at exposed, https://github.com/JetBrains/Exposed
I have been using Squash with the Hikari connection pool for some large projects and I have been very happy with it. I like that is is very extensible and my team has been able to solve any issues that come up, implementing extensions to the dialect and the simplicity of defining TableDefinition classes makes it work well for generating code. It is also very self contained with very few dependencies and light on reflection, so should be good for serverless though I have not personally used it for that.
Squash is less an ORM than an sql abstraction / translation layer that ties into entities and it doesn't try to solve all the problems that something like hibernate does. In my experience ORMs start as simple, efficient, and powerful projects and grow to heavyweight libraries that try to do too much and their complexity begins to cause issues when the developer cannot easily see what's going on in the chain from usage through to the database / storage mechanism.
One negative about squash that deserves mention is that, while it is a jetlbrains official library and created by a kotlin developer, support is limited as orangy, the creator, is quite busy and I have feature pull requests outstanding, with many more of them backed up currently. I chose it because I favored it's simplicity and extensibility among a small but advanced team of developers all capable of improving upon it.
Which ever library you choose I hope these factors assist you in making your decision at the least.

When would you use an enterprise service bus and an integration framework like Apache Camel?

I was trying to get a fix on when using Apache Camel would be appropriate and inappropriate from reading this article -- https://dzone.com/articles/when-use-apache-camel . What the article mentions is that when the number of services is low, using an integration framework like Camel might be overkill, which makes sense. But I was confused by this sentence
Although FuseSource offers commercial support, I would not use Apache
Camel for very large integration projects. An ESB is the right tool
for this job in most cases. It offers many additional features such as
BPM or BAM. Of course, you could also use several single frameworks or
products and „create“ your own ESB, but this is a waste of time and
money (in my opinion).
Is this because the integration framework lacks components that an ESB provides? If so, what are those?
At a functional level, Apache Camel does everything that all the other ESB's do and is a fine choice for pretty much any integration work. It has connectors (components in camel-speak) for every transport you can think of, deals with clustering, can provide a JMS broker and whatever else you need to integrate.
It doesn't have the nice UI's and IDEs that other tools (Tibco, webMethods, Boomi etc) which is a big advantage. The developers might actually write unit tests if you use Camel :) I'm joking of course, integration devs never write unit tests.
In terms of "weight" Camel itself is not too bloated. It can be used as a standalone runtime or you can simply leverage the integration capabilities as a library in another app. It heavily utilises spring and can run a large number of threads, so requires a reasonable amount of memory (~512Mb JVM Heap would be the lower bound) but is not difficult to use. It wont fight you.
JBoss Fuse is the full blown Red Hat supported Enterprise ESB. It is based around Camel, but uses apache karaf as the runtime which is a OSGi container. This is heavy weight but gives you a very powerful ESB runtime, deployment model and management interface and you can buy commercial support for Fuse and ActiveMQ from Red Hat. This is the more traditional "ESB" platform, but deep down inside all the integration functionality comes from Camel.
I have been working with various products that are used to implement ESBs in the past 10 years: Apache Camel, Mule, Oracle Service Bus, IBM Integration Bus (WebSphere Message Broker), Spring Integration, FUSE, etc. I can tell you what are the differences from various perspectives, but most importantly, what you have to understand is that ESB is an abstract concept - basically a "bus" in terms of integration patterns that connects various services in an entreprise. For this reason, you can build an ESB with a product that targets this specific need (e.g. IIB) or you can build it with something much more generic (e.g. SpringIntegration).
Management point of view:
The top-end commercial products (the ones you refer to "ESBs") like the ones from Oracle or IBM are well suited for this job. They have nice user interfaces that are made for implementing a bus pattern. On the workforce market you will need to find (may be simple or not) certified specialists to operate these products. Behind these products you will get support of big companies with which the enterprise may already have existing contracts. Often these products will have very specific connectors that may (or not, see developer point of view) ease your life. To give you an example, IBM's product will have very efficient connectors for IBM MQ or Mainframe CICS. They also offer out-of-the box integrations with other products like BPM. When building bigger implementations of ESBs, the projects are structured by the fact that not everything can be done in such a product - this means that you have smaller risk but also less flexibility. In terms of problems management, you are relatively safe because you blame the big company that supports you.
Towards the more "open" products/frameworks like FUSE, Mule, Camel, etc., you will start getting very flexible software that may be also cheaper as a start price. For working with it you will need generic profiles, but you will also need software architects to design your product because nothing forces you build the message flows in a particular way. At a certain point you will need at least some very skilled developers because you may not get efficient support, depending on the product you choose. In terms of problems management, you are fully responsible for this choice.
Developer point of view:
Top-end commercial products will come with in-built connectors and other operators - quick to build a POC. However, any customization will be a pain (for instance you may have an HTTP connector, but you may need to switch to some new SHA algorithm that has not been standard at the time you installed). The introduction of specific custom connectors will be possible, but much more complicated than with "open" products. You'll get nice user interfaces that will hide the code. This will mean a series of drawbacks for you, for instance code repository usage will not be efficient (can't diff two versions), unit tests are difficult to write, etc. The integrations with other products like BPM will force you to use a pre-chosen product, anything else will be doable but expensive.
The open-end products/frameworks will be flexible, everybody in the team will understand the product quickly (e.g. provided everyone knows Spring). Changing some core functionality might be very cheap. However, things can quickly degenerate if there is no supervision of experienced members because you can code pretty much anythig.
Architect point of view:
That said, there is something about the ESB that you should know in advance. You should be very clear in terms of capabilities of what you need to do (let's say: routing, service discovery, protocol conversion, etc.). Should you need to implement something complex in an ESB, for instance transformation of the business message, means that the business analyst will need to interact with your ESB. So for instance the expert in banking payments will need to write an XSLT for you. This XSLT might be simple, but it may also cause big trouble, for which you will be responsible. Now if this person also needs to know the product of your choice, this complicates things. Therefore the added value of integrating an ESB with BPM, BAM, etc. is very questionable.
Another thing to note is that for most enterprises nowadays, the ESB is not something that would create some competitive advantage, like for instance a web page. So investing in a super complex development or out-of-the-box product needs to be challenged.
Conclusion
Not sure if it was overwhelming or understandable, but in the end it boils down to the resources you have and the complexity of the task.
Just as a comment, contrary to another answer on this question, if you are building something complex:
Pay attention to the Architect's point of view above.
You may seriously consider the open end of products because by experience I can tell you that no matter what kind of support you have, when the things get nasty you better have very skilled people in-house and no official support will really save you (apart from your face).
A recent article by codecentric states as follows:
Apache Camel is a framework full of tools for routing data within an application. It is a framework you use when a full-blown Enterprise Server Bus is not (yet) needed.
You can find the full article here
Don't get too stuck in words like ESB and frameworks. The things that should decide on what kind of platform you need should depend on:
Your current and future system landscape
Your current and future requirements (technical and non-functional)
Your in house competence
Your budget
After going through those steps then you can evaluate and come to a conclusion on what you need .For some apache camel is the best fit for some other platforms may be a better fit. Don't get stuck in words like ESB etc.

What is difference between Titan and Neo4j graph database?

I had worked on relational database; but now want to learn about graph database. I came to know that these two are graph database. What is difference between these two databases. What should we prefer among them?
One approach is to simply try to choose one database over the other. For example, you might quickly search around to find that Titan has been forked to JanusGraph where it is more actively maintained. In your research you may find that there are other open source graph databases as well like OrientDb, ChronoGraph, or Sqlg as well as commercial alternatives like Microsoft's CosmosDb, DSE Graph or IBM Graph. How do you decide now?
There is a graph framework that ties together all of these graphs including Neo4j/Titan (and more than those listed here): Apache TinkerPop. TinkerPop provides an abstraction over different graph databases and graph processors allowing the same code to be used with different configurable backends. This pattern is quite similar to the one you find in SQL with JDBC which helps make your code vendor agnostic.
You can try all of the different supported graph databases before you make a choice and you can do this type of prototyping/benchmarking fairly quickly with the Gremlin Console. You will be able to make self-informed choice as to what is the best way to go for your project.
It occurs to me as I come to the end of this post that I haven't directly answered your question. If you are just getting started and are just interested in learning about graph databases, then I likely wouldn't recommend starting with Titan/JanusGraph as it requires a bit of configuration to get started (schemas, backend selection, etc). Start with TinkerGraph or Neo4j using the Gremlin Console to try out some simple graph traversals and go from there.
Titan was originally backed by Aurelius, which was bought by DataStax in 2015. This move was designed to give DataStax a jump-start into the Graph DB world, as they now offer their own "DSE Graph" enterprise product. Titan was since been forked (as previously mentioned) into JanusGraph.
The nice thing about Titan/Janus (IMO) is that it is "pluggable" with other existing back-end and search technologies. So it will "play nice" with things like Cassandra, HBase, Hadoop, Solr, and ElasticSearch.
The drawback is that the community support is tough. The Titan project has been effectively killed, and Janus scores a whopping 0.23 on DBEngines. That makes it the 16th most-popular Graph DB (231st overall), which is pretty low.
Neo4j is backed by Neo Technology, and is regarded as the front-runner in the Graph DB community (score of 38.52 right now, 1st graph DB and 21st overall). It is open source, but controlled by Neo Technologies so they can dictate a difference in feature set between open source and enterprise.
The nice thing about Neo4j is that they have a lot of tutorials and learning aids built right-in to the Neo4j Browser, which is a nice, user-friendly web interface. Their documentation is top-notch, easy to read and search through, and they have a pretty good following here on Stack Overflow.
Neo4j Browser screenshot:
The drawback of Neo4j, is that some features (like clustering) are only available in the enterprise version. But if you work for a big company who doesn't mind shelling-out $ for an enterprise license, that may not be a big deal.
Consistency: Titan/Janus is a part of the "eventual consistency" crowd, while Neo4j aims to be strong-consistent (especially in a causal clustering scenario). Although consistency can be tuned with configuration in both, with Titan/Janus that can be dependent on your choice of pluggable backend (ex: typically strong-consistent with HBase, while eventually consistent with Cassandra).
Recommendation:
If you're just starting to learn graph databases and modeling, you can't go wrong with Neo4j. Simply download/install the community edition, run it, and execute :play movies as your first command (tutorial that walks you through loading, modeling, and querying movie relationships).
If you have some experience with graph, and you don't mind troubleshooting/googling to figure out things (like how to set the max frame size for Thrift), then you could probably do some really cool things with Titan.
Try each out, and see which one works for you.
There are far more than two graph databases - there are dozens. That being said, there are two with real market share: Neo4j and Titan/JanusGraph. But there are dozens of other graph datases, each with interesting strengths for different specific application spaces. That being said, I wouldn't dig into all of the niche players to start with - learning the basic idea of graph databases can be done with one of the two lead players.
Neo4j is the most mature, with the most nicely packaged install and documentation, tons of reference code, and support from a wide range of partners.
Titan/JanusGraph is the next most popular, as it's free/open source and has very strong support (e.g. IBM, Google, Hortonworks, AWS, ...). There's a recent complexity in that the leaders of the Titan project were acquired, freezing the Titan project. But the community forked the project into JanusGraph. So while JanusGraph is a new project, it's literally the same Titan code, with even broader industry support than Titan had.
Related to the two is the language used to work with the graphs. Neo4j uses its proprietary language, Cypher, while nearly everyone else uses Gremlin, and the TinkerPop open source tool set (which is a part of the Apache set of open source projects). Nearly all graph databases, including Neo4j, support Gremlin and TinkerPop. So, for example, you can use either Cypher or Gremlin to query Neo4j, though Neo (and some other proprietary graph database vendors) support Gremlin as a second-class citizen, so to speak. For example, you can connect to Neo using Gremlin from the (external) Gremlin console, but you can't use Gremlin in the (very nice) Neo4j console.
Note that there are many graph databases that support Gremlin other than Titan/JanusGraph. One new entrant that's very interesting is Microsoft's Azure Cosmos DB, which is a managed graph database that's "cheap and easy" if you use Azure already. And there are several vendors that provide managed JanusGraph.
For personal learningk I'd say that Neo4j is the easiest to set up and learn - you download and run it, and open a web browser onto their web-based console, which only takes a few minutes. That being said, if you're comfortable on a command line JanusGraph only took a half hour to install and get running for me, so it's not too hard.
For learning the concepts Neo4j is great. Neo4j's query language, Cypher, and JanusGraph's query language, Gremlin, are semantically identical, just spelled differently, so you'll learn the concepts either way.
For building a real system, either could work (and there are many successful following both approaches).
For which you choose, you'll want to think about whether you want to be strategically tied to a single vendor (Neo4j) or in a broader standards-based community. There's comfort level in picking the market leader with the most mature product - Neo4j. And there's a comfort level in picking open standards with strong industry support - JanusGraph. So IMO there's no "wrong" answer - people using either one are happy and successful. But since you have to pick, you'll need to think about which you're more comfortable with long-term.
Neo4j uses native graph technology.
Native graph technology ensures that data is stored efficiently by writing nodes and relationships close to each other.
It optimizes the graph DB.
With native graph technology, processing becomes faster because it uses index-free
adjancey. That means each node directly references its adjacent nodes.
Titan (Now JanusGraph) uses non-native graph technology.
In non-native we use different storage backends like Cassandra, HBase
With non-native processing becomes slowers compared to native because database uses
many types of indexs to link nodes together.

Backend for Web Development using Clojure/ClojureScript

I'm familiar with developing desktop apps in Clojure (written a multithreaded interactive visualization system). However, I'm fairly new to Web development using Clojure.
I plan to use Clojure on the server for handling logic; and ClojureScript for handing client side work. However, I don't know what to use for my database server. Should I use something like Monogodb? or Hadoop? Or .... ?
The app is something very simple; a basic forum. Total number of concurrent users will be < 100 at a given time. One thing that is important to me is the ability to easily backup / data consistency -- it's very very important to me that I can easily make daily backups (and not lose all the data.)
Thanks!
You can use many databases; if the database has an API for Java, you should be good to go. MySQL, MongoDB, Postgres, Hadoop… and more.
For a nice overview of the webstack in Clojure, check out brehaut's article on the matter.
For getting up and running quickly with Clojure and ClojureScript, try ClojureScriptOne.
There are many ways to write what you want to write; if you're already familiar with Clojure, it shouldn't be too hard to get going.
Haven't used it myself, but Datomic ( http://datomic.com/ ) looks great for anyone coming from Clojure.
Datomic is an amazing database, and I'd highly recommend it. It has many features which set it apart from other database systems:
Like Clojure's data structures, it's persistent, meaning that by default, adding new facts to the database doesn't delete old facts, allowing you to query the state of the database at a previous point in time, enhancing audit-ability and assistance in debugging.
The underlying Entity Attribute Value (EAV/triple) data model (at least partly inspired by RDF & the Semantic Web), is extremely flexible, allowing you to express arbitrary graph structures and effortlessly deal with polymorphism.
The query language is flavor of Datalog, a sort of pattern matching based query language strictly more expressive than SQL and the like in that it can do recursive queries, making it particularly well suited for dealing with graph data/queries.
In addition to Datalog queries, there's a pull api, which let's you pull data out of the database more simply using a GraphQL like expression which specifies the shape of a document-like structure you'd like to pull out of the database. These queries can even be used from within the :find clause of a Datalog query.
You can use Clojure functions from within your queries.
The indexing system is very smart and more or less automatic, in stark contrast with the work that typically goes into tuning SQL databases for performance.
Transactions go through a different API/function call than queries, meaning that the number one security risk identified by OWASP (SQL injection) is literally impossible in Datomic.
The transactor/read-replica design makes it super easy to scale reads/queries, while keeping pressure off the transactor.
It's fun as hell.
One of the things worth pointing out here is that by embracing the EAV data model and datalog/pull queries, Datomic ends up having structural flexibility closer to that of a NoSQL database, while still being fundamentally relational, and even more expressive in it's relational queries than SQL.
It's amazing and you should absolutely give it a shot. It will melt your brain a little. In the good way.
It's also worth noting that it's popularity has inspired a number of successful open source projects, so the underlying approach is not going anywhere any time soon:
DataScript: In memory clj/cljs partial implementation
Datahike: Fork of DataScript which queries over on disk indices, meaning you don't have to keep everything in memory to query
Mentat: Mozilla project trying to make a Datomic-alike for a Mozilla project

What is the production ready NonSQL database?

With the rising of non-sql database usage in high traffic website, I'm interested to use it for my project. Now I've heard several names like Voldermort, MongoDB and CouchDB. But which are among these NonSQL database that is production ready? I've seen the download pages and it seems that none of them is production ready because is not version 1.0 yet. Is there any other names other than these 3 that is recommendable to be used in production?
What do you mean by production ready? As far as I know, all of them are being used on live systems.
You should make your choice based on how the features they provide fit your needs.
You can also add Tokyo Cabinet to the list as well as the mnesia database provided by the Erlang VM.
I think you need to start out from your project requirements to see what kind of database you really need. There are many non-relational DBMS:s out there and they differ a lot in what kind of problems they are good at solving. I think the article Should you go Beyond Relational Databases? by Martin Kleppmann is a good starting point for finding out what you need. There's also a lot of stackoverflow threads on similar topics, these are my favorites:
The Next-gen Databases
Non-Relational Database Design
When shouldn’t you use a relational
database?
Good reasons NOT to use a relational
database?
When you have narrowed down what you actually need you can take a deeper look into the alternatives to see which DBMS are production ready for your use case. Production readiness isn't a yes/no thing: people may successfully deploy some solution that for example lacks in tool support - in another project this could be a no-go.
As for version numbers different projects have a different take on this, so you can't just compare the version numbers. I'm involved in the graph database project Neo4j and even if it has been in production use for 5+ years by now we still haven't released a version 1.0 final yet.
I'm tempted to answer "use SIRA_PRISE".
It's definitely non-SQL.
And its current version is 1.2, meaning that someone like you must definitely assume it's "production-ready".
But perhaps I shouldn't be answering at all.
Nice article comparing rdbms with 'next gen' and listing some providers:
Is the Relational Database Doomed?
http://readwrite.com/2009/02/12/is-the-relational-database-doomed
I will suggest you to use Arangodb.
ArangoDB is a multi-model mostly-memory database with a flexible data model for documents and graphs. It is designed as a “general purpose database”, offering all the features you typically need for modern web applications.
ArangoDB is supposed to grow with the application—the project may start as a simple single-server prototype, nothing you couldn’t do with a relational database equally well. After some time, some geo-location features are needed and a shopping cart requires transactions. ArangoDB’s graph data model is useful for the recommendation system. The smartphone app needs a lean API to the back-end—this is where Foxx, ArangoDB’s integrated Javascript application framework, comes into play.
Another unique feature is ArangoDB’s query language AQL — it makes querying powerful and convenient. AQL enables you to describe complex filter conditions and joins in a readable format, much in the same way as SQL.
You can model your data in several ways:
in key/value pairs
as collections of documents
as graphs with nodes, edges, and properties for both
You can access data in ArangoDB:
using the general HTTP REST API via curl/wget, or your browser
via the ArangoDB shell (“arangosh”)
using a programming language specific client library
Server requirements for ArangoDB:
ArangoDB runs on Linux, OS X and Microsoft Windows.
It runs on 32bit and 64bit systems, though using a 32bit system will limit you to using only approximately 2 to 3 GB of data with ArangoDB.

Resources