Graph UI ArangoDB - graph-databases

I'm using ArangoDB but struggling with its Graph Viewer UI. First, it is reinitialized on every save; moreover, it does not allow step-by-step graph exploration.
Are there any tools for working with a graph database in a "visual" way?

Related

How can I get realtime streaming updates to a Gremlin query?

I fell in love with realtime streaming updates to a query when I was using Firebase, RethinkDB and similar. Now that I am working with graph databases via Gremlin, I'm wondering how to get this behavior.
As a trivial example, if I specified a Gremlin query like:
g.V().values('name')
I'd like to receive an update when a new vertex is added with a name property, or a name is changed on an existing vertex.
I am beginning to use JanusGraph, so the ideal solution would work there, but this is such a killer feature that I could be swayed to other Gremlin-friendly graph databases.
Thanks!
You could use an EventStrategy with any TinkerPop-compatible graph database. Once you create the event strategy, you add it to your traversal source: g = graph.traversal().withStrategies(strategy). You'll need to implement the MutationListener interface to do whatever you'd like on those events.
OrientDB has LiveQuery, though I don't know whether it integrates with Gremlin: https://orientdb.com/docs/2.1/Live-Query.html. That's the closest thing I know of to this kind of feature in any TinkerPop-enabled graph database.
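To make the listener idea concrete, here is a minimal plain-Python sketch of the pattern. It is not the TinkerPop API: ObservableGraph, add_listener and watch_names are hypothetical names made up for illustration, and the "graph" is just an in-memory dict. It only shows the shape of the approach: mutations go through one entry point, and registered listeners get called with each change, which is how you would get updates for something like g.V().values('name').

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical event shape: (kind, vertex_id, property_key, property_value).
Event = Tuple[str, int, str, object]


class ObservableGraph:
    """Tiny in-memory vertex store that notifies listeners on every mutation.

    This is an illustration of the listener pattern only, not TinkerPop.
    """

    def __init__(self) -> None:
        self.vertices: Dict[int, Dict[str, object]] = {}
        self.listeners: List[Callable[[Event], None]] = []
        self._next_id = 0

    def add_listener(self, listener: Callable[[Event], None]) -> None:
        self.listeners.append(listener)

    def _emit(self, event: Event) -> None:
        for listener in self.listeners:
            listener(event)

    def add_vertex(self, **props: object) -> int:
        vid = self._next_id
        self._next_id += 1
        self.vertices[vid] = dict(props)
        # One event per property, so listeners can filter by key.
        for key, value in props.items():
            self._emit(("vertex_added", vid, key, value))
        return vid

    def set_property(self, vid: int, key: str, value: object) -> None:
        self.vertices[vid][key] = value
        self._emit(("property_changed", vid, key, value))


def watch_names(graph: ObservableGraph) -> List[Event]:
    """Register a listener that records every event touching a 'name' property."""
    seen: List[Event] = []

    def on_event(event: Event) -> None:
        if event[2] == "name":
            seen.append(event)

    graph.add_listener(on_event)
    return seen


graph = ObservableGraph()
name_events = watch_names(graph)
v = graph.add_vertex(name="marko", age=29)
graph.set_property(v, "name", "marko2")
graph.set_property(v, "age", 30)  # ignored by the 'name' watcher
print(name_events)
```

In a real setup the callback would push the change to subscribers (for example over a websocket) instead of appending to a list.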

Why are there not as many graph databases as graph processing frameworks?

I know of some graph databases, especially active and distributed ones, but not many: OrientDB, Titan, Dex, etc.
For graph processing frameworks, on the other hand, there is a huge set of tools such as GraphX, GraphLab, PowerGraph, X-Stream, Pregel, etc., and more come out every year.
Can anyone tell me the difference between these two categories of tools? Are they interchangeable? And why are graph databases not drawing as much attention as graph processing frameworks?
The difference between graph databases and graph processing frameworks is that databases are built to save data in the basic form of a graph, where relationships between the data are modeled as edges and the data points as nodes/vertices. Some databases, like OrientDB, extend this basic concept considerably to make the database much more versatile. Others are less versatile. In general, though, the main goal is to persist the data in a graph-like form of edges and vertices.
Graph processing frameworks, on the other hand, take a set of data and build analytical graphs out of it. The goal is mainly the analysis of graph-like patterns or structures within the data.
I'll try to put this in an analogy, as I understand it.
Say you have a punch bowl full of punch (your data).
In a graph database scenario, the punch is already a graph and you can look into the bowl and see all the stuff in your graph and analyze it too.
With a graph processing framework, you have a punch bowl full of stuff too, but it is murky and you don't see any graphs in it directly. To get a graph of some type, you first have to ladle out some of the punch, in let's say, a "graph processing ladle". This allows you to see some kind of graph coherence, depending on the algorithms you choose to try and analyze the data with. Of course, depending on your machine or system, like Spark, the graph processing ladle could be huge, even just as big as your whole punch bowl or even bigger.
Still, it takes time and processing to make a "sensible graph" out of the punch (your data). The other thing about this is, if you want to store this newly found ladle of analyzed graph punch, you'd have to have another bowl to put it in. And, if you drop the ladle on the floor, your graph data is gone. This wouldn't happen with a graph database.
I hope that makes sense.
Scott
There are both connections and differences between graph databases and graph computing.
Connections:
A graph database offers not only data storage but also a range of graph data processing. For example, solving the single-source shortest path (SSSP) problem requires traversal and computation over the graph, which is the kind of work a graph processing framework must support.
Differences:
You can't use a graph database for most graph computing tasks such as PageRank or greedy graph coloring because, as a basic storage and query system, a graph database doesn't need to be able to run computing jobs.
Correct me if I'm wrong; I'm also new to graph computing.
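For anyone unfamiliar with SSSP, here is a minimal sketch of what "traversal plus computation" means in that case. It uses Dijkstra's algorithm over a plain adjacency dict and assumes non-negative edge weights; the shortest_paths function and the example graph are made up for illustration and are not tied to any particular graph database or framework.

```python
import heapq
from typing import Dict, List, Tuple

# Adjacency list: node -> list of (neighbor, edge weight). Weights must be >= 0.
Graph = Dict[str, List[Tuple[str, float]]]


def shortest_paths(graph: Graph, source: str) -> Dict[str, float]:
    """Return the shortest distance from source to every reachable node."""
    dist: Dict[str, float] = {source: 0.0}
    heap: List[Tuple[float, str]] = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        # Skip stale heap entries left over from an earlier, longer path.
        if d > dist.get(node, float("inf")):
            continue
        for neighbor, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


example: Graph = {
    "a": [("b", 1.0), ("c", 4.0)],
    "b": [("c", 2.0), ("d", 5.0)],
    "c": [("d", 1.0)],
    "d": [],
}
distances = shortest_paths(example, "a")
print(distances)
```

The traversal part is the walk over neighbors; the computation part is keeping and updating the best known distance per node.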

Which graph DB to use for a user database? (Neo4j or FB Graph API)

I need to set up a question-and-answer webapp (quite similar to Stack Overflow, but at a much smaller scale), for which I also need to maintain a user database, since users will own their questions and answers.
Which kind of datastore should I use? I'm working in Google App Engine, so please suggest things that are easy to integrate with GAE.
I thought of using a graph database like Neo4j or the Facebook Graph API.
Also, if anyone has used the Facebook Graph API, can you tell me exactly how to use it and whether it is compatible with a GAE application? And will I still need a separate DB to store the information from the Graph API, or can that happen on the fly?
Neo4j works well for this kind of application, see here for a possible model:
http://blog.brian-underwood.codes/2015/02/16/making_master_data_management_fun_with_neo4j_-_part_1/

Is there any graph database good for both updating graph and data mining?

I am new to graph databases and am trying to find the right one for us, but I haven't yet. We need something good for both updating the graph and data mining.
With a graph database like Neo4j we can perform queries and updates really fast, and it performs very well on highly connected data. But it does not seem very useful for computations over the whole graph, which is what we need for data mining (running PageRank, for example). GraphLab, Giraph, GraphX, Faunus, etc. are of that kind, but many of them are not good at even removing from or updating the graph. For example, vertices and edges cannot be deleted explicitly in GraphLab.
Is there anything good for both updating the graph and running PageRank?
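To show what I mean by a whole-graph computation, here is a minimal PageRank sketch using plain power iteration. The pagerank function, the damping value of 0.85, the fixed iteration count and the toy graph are my own illustrative choices, not taken from any of the tools above; it also assumes every link target appears as a key in the input dict.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]], damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Power-iteration PageRank over an adjacency dict (node -> outgoing links).

    Assumes every link target is also a key of `links`.
    """
    nodes = list(links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        # Every node touches all of its out-edges on every pass.
        for node in nodes:
            targets = links[node]
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


toy_links = {"a": ["c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(toy_links)
print(sorted(ranks, key=ranks.get, reverse=True))
```

Each iteration reads every vertex and every edge, which is why this kind of job fits batch-style frameworks better than a database tuned for local queries and updates.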
Titan is built with both OLTP and OLAP processing in mind. It is therefore good at high-speed reads and writes at large scale:
http://thinkaurelius.com/2013/05/13/educating-the-planet-with-pearson/
http://thinkaurelius.com/2013/11/24/boutique-graph-data-with-titan/
You mentioned Faunus as something you looked at for graph analytics. Faunus is highly tuned to work with Titan. In fact, as of the most recent release of Titan at 0.5.0, Faunus has been repackaged and distributed directly with Titan as titan-hadoop for even greater integration and support.
Looking forward to Titan 1.0 due in coming months, Titan will look to support TinkerPop3, which does for OLAP what original versions did for OLTP, in that it generalizes graph analytics frameworks (it already integrates Giraph as the reference implementation).
Since you are in the exploring stage and familiar with Neo4j, I think looking at the TinkerPop3 documentation would be a good start, as it uses Neo4j for its reference implementation. Numerous vendors are preparing to support this latest version of TinkerPop, so developing your application against TinkerPop lets you get started without being tied to a particular graph database at this early stage. You can save that decision for later, once you have had more time to evaluate the different implementations available.
If you need to get to work with something right away, then start with Titan 0.5 and consider your migration path to 1.0 when it becomes available.

How are databases used to implement document collaboration?

How are document collaboration tools such as Google Docs and SharePoint implemented in the backend? What kind of database architecture is used to implement features such as multiple people editing the document simultaneously? How is this done efficiently for large documents without having each edit update an entire database entry?
And how do they maintain the complete version history of every single edit without using up tons of disk space?
Do Google Docs and SharePoint show degraded performance for very large documents?
