Get result as a subgraph rather than vertices: Gremlin, OrientDB, graph databases

Hi, I am a newbie in Gremlin and OrientDB. I was playing around with Gremlin and OrientDB in Java. I was able to query my vertices and edges using the available methods and get the results. I am using back("Alias-name") to get the result vertices of my query.
My question is: can I get a graph of all the result vertices (the related graph and the information of the result)? Right now I am able to get the vertices, but I want the entire "subgraph" information of my resultant vertices in the same query.
Any help is greatly appreciated.
Here is the code, with a generic example:
GremlinPipeline startPipe = pipe.has("friend-name", "friend-name")
.in("friend-depends").as("friend-depends")
.outE("resource-depends").inV()
.has("resource-name", "car")
.back("friend-depends");
v(Friend)[#15:13]
v(Friend)[#15:7]
v(Friend)[#15:12]
The results are right but I would like to get the results as
Friend#15:13{friend-name:Frank,Friend-type:Personal,in_depends:#17:10 (friend of friends),... with edges} v2
Thanks,
Sabari

Gremlin does not provide an explicit subgraph function (as of the unreleased 2.5.0). The only way to get a subgraph with Gremlin is to explicitly extract those graph elements yourself. My preference is to simply sideEffect the elements to an in-memory TinkerGraph. You can see an example here:
http://gremlindocs.com/#recipes/subgraphing
Given your amended question, you may use the path step to get the individual parts of the path, as in:
gremlin> g = TinkerGraphFactory.createTinkerGraph();
==>tinkergraph[vertices:6 edges:6]
gremlin> g.v(1).outE.inV.path
==>[v[1], e[9][1-created->3], v[3]]
==>[v[1], e[7][1-knows->2], v[2]]
==>[v[1], e[8][1-knows->4], v[4]]
gremlin> g.v(1).outE.inV.has('age',T.gte,31).path
==>[v[1], e[8][1-knows->4], v[4]]
That looks a bit like what you are looking for. From there you could sideEffect the path elements into a subgraph, and you could choose not to use back anymore:
gremlin> g.v(1).outE.inV.has('age',T.gte,31).path.sideEffect{println it}.collect{it.last()}
[v[1], e[8][1-knows->4], v[4]]
==>v[4]
or stick with it:
gremlin> g.v(1).as('x').outE.inV.has('age',T.gte,31).path.sideEffect{println it}.back('x')
[v[1], e[8][1-knows->4], v[4]]
==>v[1]
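As a rough illustration of the "extract the elements yourself" idea, here is a minimal sketch in plain Python rather than Gremlin. It assumes each path alternates vertex, edge, vertex as in the console output above; the Edge record and the subgraph_from_paths helper are hypothetical names introduced only for this example and are not part of any Gremlin or OrientDB API.

```python
# Plain-Python model (not the Gremlin API): a path alternates
# vertex id, edge, vertex id, ... as in the console output above.
from typing import NamedTuple


class Edge(NamedTuple):
    id: int
    out_v: int   # id of the vertex the edge leaves
    label: str
    in_v: int    # id of the vertex the edge enters


def subgraph_from_paths(paths):
    """Collect every distinct vertex id and edge seen on any path."""
    vertices, edges = set(), set()
    for path in paths:
        for element in path:
            if isinstance(element, Edge):
                edges.add(element)
                # Keep both endpoints so that no edge dangles.
                vertices.update((element.out_v, element.in_v))
            else:
                vertices.add(element)
    return vertices, edges


# The three paths printed by g.v(1).outE.inV.path above.
paths = [
    [1, Edge(9, 1, "created", 3), 3],
    [1, Edge(7, 1, "knows", 2), 2],
    [1, Edge(8, 1, "knows", 4), 4],
]
vertices, edges = subgraph_from_paths(paths)
print(sorted(vertices))             # [1, 2, 3, 4]
print(sorted(e.id for e in edges))  # [7, 8, 9]
```

In a real traversal the same collection would happen inside the sideEffect closure, writing into an in-memory graph instead of two sets.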

Related

when selecting vertex using has('prop', 'value') with injected 'value'

Is it possible to select vertex properties using injected values?
I can't use the lambda shown in the official docs (http://tinkerpop.apache.org/docs/current/reference/#inject-step), since lambdas are not supported in our case.
I tried doing
g.inject('vadas').as('a').V().has('name', select('a'))
but it returns all the vertices that have a 'name' attribute; it seems the injected value is not being used.
Are there any ways I can do the 'has' filter based on some injected values?
Your query is supposed to return all vertices that have a name property, as select('a') should always return a value.
There is no way to use injected values and at the same time benefit from an index lookup. The only thing you can do is a full vertex scan:
gremlin> g.inject('vadas').as('a').V().where(eq('a')).by('name').by()
==>v[2]
That works well on a small in-memory graph, but it surely isn't a scalable solution.
UPDATE
If nested select()'s are not available, you could still do something like this:
gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> data = ["marko": ["title": "dr", "age": 40]]
==>marko={title=dr, age=40}
gremlin> g.V().has("person","name",within(data.keySet())).as("v").
flatMap(constant(data).unfold().
where(eq("v")).
by(keys).
by("name").
select(values).
unfold()).as("kv").
select("v").
property(select("kv").by(keys), select("kv").by(values)).
iterate()
gremlin>
gremlin> g.V().valueMap()
==>[name:[marko],title:[dr],age:[40]]
...
There is a full scan in this traversal, but only over the data memory structure. Hence, if data isn't crazy large, the traversal performance should/could be acceptable.
@Daniel Kuppitz and @Huimin Yang, we have added support for this feature using a property index scan. I think it will be a useful ability:
gremlin> g.inject('vadas').as('a').V().has('name', eq(select('a'))).profile()
==>Traversal Metrics
Step                                                   Count  Traversers  Time (ms)  % Dur
==========================================================================================
InjectStep([vadas])@[a]                                    1           1      0.090   0.51
GraphDbGraphStep(vertex,[name.eq([SelectOneStep...         1           1     17.768  99.49
                                            >TOTAL         -           -     17.859      -
gremlin> g.inject('vadas').as('a').V().has('name', eq(select('a')))
==>v[vadas]

using inject for long traversal in gremlin when dealing with optional fields

I'm building a long traversal to add hundreds of vertices in one query. I saw from the official website that the recommended way to do it is to inject the object list and add vertices there: http://tinkerpop.apache.org/docs/current/recipes/#long-traversals
However, in my case many objects have optional fields. Taking the example in the official docs, someone might lack the 'age' property or the 'name' property. I can use choose to do something like this:
g.inject().unfold().as('a').addV().choose(select('a').select('age'), property('age', select('a').select('age')))
but the choose step in Neptune is not optimized and adds too much latency to the query. Are there any other solutions for this?
Already answered on the gremlin-users mailing list, but to close the loop, here it is again:
gremlin> g = TinkerGraph.open().traversal()
==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
gremlin>
gremlin> data = [["name": "Huimin Yang"],
["name": "Daniel Kuppitz", "age": 37]]
==>[name:Huimin Yang]
==>[name:Daniel Kuppitz,age:37]
gremlin>
gremlin> g.inject(data).unfold().as("m").
addV("person").as("v").
select("m").unfold().as("kv").
select("v").
property(select("kv").by(keys), select("kv").by(values)).iterate()
gremlin>
gremlin> g.V().valueMap()
==>[name:[Huimin Yang]]
==>[name:[Daniel Kuppitz],age:[37]]
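The idea behind the traversal above is that iterating over the entries each map actually contains makes a per-field choose unnecessary: an absent key simply produces no property. A minimal sketch of that logic in plain Python (not Gremlin), where the add_vertices helper and the dict-based vertex records are assumptions made only for illustration:

```python
# Plain-Python model (not the Gremlin API): a vertex is a dict with a
# label and a property map. Only keys present on a record are copied.
data = [
    {"name": "Huimin Yang"},
    {"name": "Daniel Kuppitz", "age": 37},
]


def add_vertices(records, label):
    """Create one vertex per record, setting only the fields it has."""
    vertices = []
    for record in records:
        vertex = {"label": label, "properties": {}}
        # Counterpart of unfolding the map into key/value entries:
        # a missing optional field never shows up in this loop.
        for key, value in record.items():
            vertex["properties"][key] = value
        vertices.append(vertex)
    return vertices


created = add_vertices(data, "person")
print([v["properties"] for v in created])
# [{'name': 'Huimin Yang'}, {'name': 'Daniel Kuppitz', 'age': 37}]
```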

Gremlin Drop Multiple Vertices

I am trying to drop all vertices returned by a given Gremlin query. The goal is to delete all vertices that are children of a specific vertex.
Here's an example:
gremlin> g.V('dcb26be6-8d39-ae81-6ef2-6f60d06bce10').emit().repeat(out())
==>v[dcb26be6-8d39-ae81-6ef2-6f60d06bce10]
==>v[16b26be6-8d37-e882-38c6-a56f39ee4259]
==>v[9cb26be6-8d3c-d61e-4ab4-6c6993e8be7a]
==>v[82b26be6-8d3a-c01a-3771-085c94d1780a]
==>v[00b26be6-8d3c-68d9-6871-702a1247a692]
==>v[d4b26be6-8d38-81ea-b75d-25bbf563f81e]
==>v[cab26be6-8d39-3611-76fa-f369eab9d50e]
This query returns the parent vertex dcb26be6-8d39-ae81-6ef2-6f60d06bce10 along with every vertex reachable from it over outgoing edges. Is there an easy way to drop all of the vertices returned by this query?
Thanks
EDIT:
@stephan had a great response; however, if the children have edges pointing to each other, Gremlin gets mad at me. Check this out:
gremlin> g.V('2ab26c9e-1bbb-73f6-4ee8-6cecc7e21ee1').emit().repeat(out()).fold().unfold()
==>v[2ab26c9e-1bbb-73f6-4ee8-6cecc7e21ee1]
==>v[0eb26c9e-1bbc-12f3-e074-d7328ee4984e]
==>v[92b26c9e-1bbd-b59f-0b5f-d4c985b176b6]
==>v[18b26c9e-1bbf-a96c-90d3-e50e61fe7267]
==>v[12b26c9e-1bc1-40ee-292d-2bc7b08dcb9e]
==>v[ccb26c9e-1bbc-a82a-532f-7fbdea87deb1]
==>v[42b26c9e-1bbd-5f1f-f3ad-6f6670ab16ee]
==>v[7ab26c9e-1bc1-e773-6995-18159d610b77]
==>v[3ab26c9e-1bbe-add8-2ab2-948d7c9c0021]
==>v[2eb26c9e-1bbf-1657-e212-98d1dfff33cd]
==>v[92b26c9e-1bbd-b59f-0b5f-d4c985b176b6]
==>v[8cb26c9e-1bc2-500b-ae27-370a0cc4d392]
==>v[42b26c9e-1bc0-b4b0-4d54-fc7f20ca71d4]
==>v[7ab26c9e-1bc1-e773-6995-18159d610b77]
==>v[3ab26c9e-1bbe-add8-2ab2-948d7c9c0021]
==>v[2eb26c9e-1bbf-1657-e212-98d1dfff33cd]
As you can see, vertex 92b26c9e-1bbd-b59f-0b5f-d4c985b176b6 appears twice in the response to this query. So when I append drop() to the query, here's the response I get:
gremlin> g.V('2ab26c9e-1bbb-73f6-4ee8-6cecc7e21ee1'). emit(). repeat(out()). fold(). unfold().drop()
{"requestId":"2def0086-d71f-42e4-9c5f-c692d07cc96a","detailedMessage":"The vertex does not exist 92b26c9e-1bbd-b59f-0b5f-d4c985b176b6","code":"ConstraintViolationException"}
Is there any way to remove duplicates from the initial query?
Maybe it's just the end of the day and my brain is fried, but how about:
g.V('dcb26be6-8d39-ae81-6ef2-6f60d06bce10').
store('a').
repeat(out().store('a')).
cap('a').
unfold().
drop()
or perhaps slightly less readable imo:
g.V('dcb26be6-8d39-ae81-6ef2-6f60d06bce10').
emit().
repeat(out()).
fold().
unfold().
drop()
You may get a nicer answer - maybe even from me :)
You need a barrier step, which both fold() and cap() are; however, both carry overhead (they cost memory and processing power). The barrier() step seems like a better fit for this:
g.V('dcb26be6-8d39-ae81-6ef2-6f60d06bce10')
.emit()
.repeat(out())
.barrier()
.drop()
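The error in the question comes from reaching the same vertex along two different paths and then trying to delete it twice. A minimal plain-Python sketch of that situation (not Gremlin, and not tied to any provider) is below. The adjacency-dict graph, its vertex ids, and both helper functions are hypothetical, and the sketch assumes the graph is acyclic below the root. In Gremlin itself, the dedup() step is the one that removes repeated traversers; whether adding it resolves the error on a particular provider is not something this sketch can confirm.

```python
# Plain-Python model (not the Gremlin API). `graph` maps each vertex id
# to the ids reached by its outgoing edges; the ids are hypothetical.
def descendants_with_repeats(graph, root):
    """Mimic emit().repeat(out()): yield a vertex once per path reaching it."""
    frontier = [root]
    while frontier:
        next_frontier = []
        for v in frontier:
            yield v
            next_frontier.extend(graph.get(v, []))
        frontier = next_frontier


def drop_all(graph, root):
    """Deduplicate first, then delete each vertex exactly once."""
    # dict.fromkeys keeps first-seen order while removing repeats.
    doomed = list(dict.fromkeys(descendants_with_repeats(graph, root)))
    for v in doomed:
        del graph[v]  # a second delete of the same id would raise KeyError
    # Remove edges that pointed at deleted vertices.
    for targets in graph.values():
        targets[:] = [t for t in targets if t not in doomed]
    return doomed


graph = {"root": ["a", "b"], "a": ["c"], "b": ["c"], "c": [], "other": ["a"]}
print(list(descendants_with_repeats(graph, "root")))  # ['root', 'a', 'b', 'c', 'c']
print(drop_all(graph, "root"))                        # ['root', 'a', 'b', 'c']
print(graph)                                          # {'other': []}
```

Vertex "c" is reachable through both "a" and "b", so it is emitted twice; deleting from the deduplicated list avoids the "does not exist" failure on the second attempt.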

Gremlin query to recursively fetch nodes based on edge properties

Given the following sample data, I'd like to construct a Gremlin query which returns Alice's network of ruby connections, 3 levels deep:
Vertex: Alice
Vertex: Bobby
Vertex: Cindy
Vertex: David
Vertex: Eliza
Edge: [Alice] -> [Rates(tag:ruby,value:0.9)] -> [Bobby]
Edge: [Bobby] -> [Rates(tag:ruby,value:0.8)] -> [Cindy]
Edge: [Cindy] -> [Rates(tag:ruby,value:0.7)] -> [David]
Edge: [David] -> [Rates(tag:ruby,value:0.6)] -> [Eliza] # ignored, level 4
Edge: [Alice] -> [Rates(tag:java,value:0.9)] -> [Eliza] # ignored, not ruby
So the returned data should be something like:
Bobby: [0.9]
Cindy: [0.9, 0.8]
David: [0.9, 0.8, 0.7]
Where each vertex ID is returned, along with an array of the path of rating values.
I'm working in the current release of JanusGraph (Gremlin 3). I'm pretty new to Gremlin; I've been puzzling over a few recipes which have things in common with my desired query, but I still don't see quite how to get there...
Thanks very much for any help or advice you can offer.
When asking Gremlin questions, it's always helpful to those trying to answer if you provide a sample graph that can be easily cut and pasted into the Gremlin Console, like this:
graph = TinkerGraph.open()
g = graph.traversal()
g.addV().property('name','alice').as('a').
addV().property('name','bobby').as('b').
addV().property('name','cindy').as('c').
addV().property('name','david').as('d').
addV().property('name','eliza').as('e').
addE('rates').property('tag','ruby').property('value',0.9).from('a').to('b').
addE('rates').property('tag','ruby').property('value',0.8).from('b').to('c').
addE('rates').property('tag','ruby').property('value',0.7).from('c').to('d').
addE('rates').property('tag','ruby').property('value',0.6).from('d').to('e').
addE('rates').property('tag','java').property('value',0.9).from('a').to('e').iterate()
Using this graph I came up with this approach to getting the result you desire:
gremlin> g.V().has('name','alice').
......1> repeat(outE().has('tag','ruby').inV()).
......2> times(3).
......3> emit().
......4> group().
......5> by('name').
......6> by(path().
......7> unfold().
......8> has('value').
......9> values('value').
.....10> fold())
==>[bobby:[0.9],cindy:[0.9,0.8],david:[0.9,0.8,0.7]]
Everything up through line 3 with the emit() is probably pretty self-explanatory: find "alice", then traverse outward over the "ruby" edges repeatedly to a depth of 3 and emit each vertex discovered along the way. That gets you the vertices you care about:
gremlin> g.V().has('name','alice').
......1> repeat(outE().has('tag','ruby').inV()).
......2> times(3).
......3> emit()
==>v[2]
==>v[4]
==>v[6]
The more complicated part comes after this where you are concerned about retrieving the path information for each so that you can grab the "value" properties along each "rates" edge. I chose to use group so that I could easily get the Map structure you wanted. Obviously, if "bobby" appeared twice in the tree you would end up with two lists of ratings for his Map entry.
If you pick apart what's happening in group() you can see that it is modulated by two by() options. The first corresponds to the key in the Map (obviously, I'm assuming uniqueness on "name"). The second extracts the path from the current traverser (the person vertex). Before going any further, take a look at what the output looks like with just the path():
gremlin> g.V().has('name','alice').
......1> repeat(outE().has('tag','ruby').inV()).
......2> times(3).
......3> emit().
......4> group().
......5> by('name').
......6> by(path()).next()
==>bobby=[v[0], e[10][0-rates->2], v[2]]
==>cindy=[v[0], e[10][0-rates->2], v[2], e[11][2-rates->4], v[4]]
==>david=[v[0], e[10][0-rates->2], v[2], e[11][2-rates->4], v[4], e[12][4-rates->6], v[6]]
The steps that follow path() manipulate that path into the form you want. They unfold each path, keep only the edges by filtering on "value" (a property that only edges have), extract that property, and then fold the values back into a list for each entry in the map.
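For readers who find the traversal easier to follow as ordinary code, here is a minimal plain-Python sketch of the same logic (not Gremlin): walk outward along edges with a given tag, up to a fixed depth, and record the edge values collected on the way to each person. The EDGES tuples mirror the sample graph above; the ratings_by_person helper is a hypothetical name, and this sketch stores one list per path, which is a design choice of the example rather than a statement about how group() behaves.

```python
# Plain-Python model (not the Gremlin API) of the sample graph above.
# Each edge is (from, to, tag, value).
EDGES = [
    ("alice", "bobby", "ruby", 0.9),
    ("bobby", "cindy", "ruby", 0.8),
    ("cindy", "david", "ruby", 0.7),
    ("david", "eliza", "ruby", 0.6),
    ("alice", "eliza", "java", 0.9),
]


def ratings_by_person(edges, start, tag, max_depth):
    """Map each person reached within max_depth hops to the edge values on each path to them."""
    result = {}
    frontier = [(start, [])]  # (current person, values collected so far)
    for _ in range(max_depth):
        next_frontier = []
        for person, values in frontier:
            for src, dst, edge_tag, value in edges:
                if src == person and edge_tag == tag:
                    path_values = values + [value]
                    # One list per path, so a person reached twice gets two lists.
                    result.setdefault(dst, []).append(path_values)
                    next_frontier.append((dst, path_values))
        frontier = next_frontier
    return result


print(ratings_by_person(EDGES, "alice", "ruby", 3))
# {'bobby': [[0.9]], 'cindy': [[0.9, 0.8]], 'david': [[0.9, 0.8, 0.7]]}
```

The java edge to eliza is filtered out by the tag check, and the ruby edge from david to eliza is never followed because the loop stops after three hops.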

Get the id + the map of a vertex on Gremlin?

g.v(1).id
gives me vertex 1 id,
g.v(1).map
gives me vertex 1 properties.
But how can I get a hash with the id and properties at the same time?
I know this is an old question, and the answers below work on older versions of TinkerPop (before 3). In case anyone (like me) stumbles upon this question looking for a solution that works on TinkerPop 3: the same result can be achieved by calling valueMap with a 'true' argument, like this:
gremlin> g.V(1).valueMap(true)
A reference may be found in the TinkerPop docs.
As of Gremlin 2.4.0 you can also do something like:
gremlin> g = TinkerGraphFactory.createTinkerGraph()
==>tinkergraph[vertices:6 edges:6]
gremlin> g.v(1).out.map('name','age','id')
==>{id=2, age=27, name=vadas}
==>{id=4, age=32, name=josh}
==>{id=3, age=null, name=lop}
Another alternative using transform():
gremlin> g.v(1).out.transform{[it.id,it.map()]}
==>[2, {age=27, name=vadas}]
==>[4, {age=32, name=josh}]
==>[3, {name=lop, lang=java}]
If implementing with Java, use:
g.V(1).valueMap().with(WithOptions.tokens).toList()
I've found a solution
tab = new Table()
g.v(1).as('properties').as('id').table(tab){it.id}{it.map}
tab
Just extending @Stephen's answer: to get the id and the map() output in a nice single Map for each Vertex, just use the plus or leftShift Map operations in the transform method.
Disclaimer: I'm using groovy, I haven't been able to test it in gremlin (I imagine it's exactly the same).
Groovy Code
println "==>" + g.v(1).out.transform{[id: it.id] + it.map()}.asList()
or
println "==>" + g.v(1).out.transform{[id: it.id] << it.map()}.asList()
Gives
==>[[id:2, age:27, name:vadas], [id:4, age:32, name:josh], [id:3, name:lop, lang:java]]
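The merge itself is an ordinary map union, which may be easier to see outside Gremlin. A minimal plain-Python analogue follows; the vertex records and the with_id helper are hypothetical stand-ins for it.id and it.map(), not part of any Gremlin API.

```python
# Plain-Python analogue (not Gremlin): each record stands in for a vertex,
# with "id" playing the role of it.id and "props" the role of it.map().
vertices = [
    {"id": 2, "props": {"age": 27, "name": "vadas"}},
    {"id": 4, "props": {"age": 32, "name": "josh"}},
    {"id": 3, "props": {"name": "lop", "lang": "java"}},
]


def with_id(vertex):
    """Merge the id into a copy of the property map."""
    # Later keys win on a collision, so a property literally named "id"
    # would overwrite the vertex id here.
    return {"id": vertex["id"], **vertex["props"]}


merged = [with_id(v) for v in vertices]
print(merged)
# [{'id': 2, 'age': 27, 'name': 'vadas'}, {'id': 4, 'age': 32, 'name': 'josh'}, {'id': 3, 'name': 'lop', 'lang': 'java'}]
```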
