GroupBy query with list of vertices - graph-databases

Suppose I want to query the Neptune graph with "group-by" on one property (or more), and I want to get back the list of vertices too.
Let's say, I want to group-by on ("city", "age") and want to get the list of vertices too:
[
{"city": "SFO", "age": 29, "persons": [v[1], ...]},
{"city": "SFO", "age": 30, "persons": [v[10], v[13], ...]},
...
]
Or, get back the vertex with its properties (as valueMap):
[
{"city": "SFO", "age": 29, "persons": [[id:1,label:person,name:[marko],age:[29],city:[SFO]], ...]},
...
]
AFAIK, Neptune supports neither lambdas nor variable assignments. Is there a way to do this with one traversal and no lambdas?
Update: I'm able to get the vertices, but without their properties (with valueMap).
Query:
g.V().hasLabel("person").group().
by(values("city", "age").fold()).
by(fold().
match(__.as("p").unfold().values("city").as("city"),
__.as("p").unfold().values("age").as("age"),
__.as("p").fold().unfold().as("persons")).
select("city", "age", "persons")).
select(values).
next()
Output:
==>[city:SFO,age:29,persons:[v[1]]]
==>[city:SFO,age:27,persons:[v[2],v[23]]]
...

If I understand it correctly, then ...
g.V().hasLabel("person").
group().
by(values("city", "age").fold())
... or ...
g.V().hasLabel("person").
group().
by(valueMap("city", "age").by(unfold()))
... already gives you what you need; it's just a matter of reformatting the result. To merge the key maps and the value lists together, you can do something like this:
g.V().hasLabel("person").
group().
by(valueMap("city", "age").by(unfold())).
unfold().
map(union(select(keys),
project("persons").
by(values)).
unfold().
group().
by(keys).
by(select(values)))
Executing this on the modern toy graph (city replaced with name) will yield the following result:
gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V().hasLabel("person").
......1> group().
......2> by(valueMap("name", "age").by(unfold())).
......3> unfold().
......4> map(union(select(keys),
......5> project("persons").
......6> by(values)).
......7> unfold().
......8> group().
......9> by(keys).
.....10> by(select(values)))
==>[persons:[v[2]],name:vadas,age:27]
==>[persons:[v[4]],name:josh,age:32]
==>[persons:[v[1]],name:marko,age:29]
==>[persons:[v[6]],name:peter,age:35]
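For readers who want to check the target shape outside Gremlin, the same grouping can be sketched in plain Python. This is only an illustration of the expected result, not something Neptune runs; the `people` records and the `group_persons` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample records standing in for "person" vertices.
people = [
    {"id": 1, "name": "marko", "city": "SFO", "age": 29},
    {"id": 2, "name": "vadas", "city": "SFO", "age": 27},
    {"id": 23, "name": "josh", "city": "SFO", "age": 27},
]

def group_persons(records, keys=("city", "age")):
    """Group records by the given keys and attach the members as "persons"."""
    buckets = defaultdict(list)
    for rec in records:
        buckets[tuple(rec[k] for k in keys)].append(rec)
    # Merge each key tuple and its members into one flat map per group.
    return [dict(zip(keys, key), persons=members) for key, members in buckets.items()]

result = group_persons(people)
for row in result:
    print(row["city"], row["age"], [p["id"] for p in row["persons"]])
```

Each entry carries the grouping keys and the full member records side by side, which is the shape the final group() step in the traversal produces.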

Related

Turn nested lists into a table Python

sorry for my annoying questions again.
I'm having a bit of trouble with some code. The task is to write a function that takes a nested list of currency conversions and turns it into a table (I've attached some pictures for clarification).
(could only attach one image, this is the nested list it converts)
[[10, 9.6, 7.5, 6.7, 4.96], [20, 19.2, 15.0, 13.4, 9.92], [30, 28.799999999999997, 22.5, 20.1, 14.879999999999999], [40, 38.4, 30.0, 26.8, 19.84], [50, 48.0, 37.5, 33.5, 24.8], [60, 57.599999999999994, 45.0, 40.2, 29.759999999999998], [70, 67.2, 52.5, 46.900000000000006, 34.72], [80, 76.8, 60.0, 53.6, 39.68], [90, 86.39999999999999, 67.5, 60.300000000000004, 44.64], [100, 96.0, 75.0, 67.0, 49.6]]
I've got the Header column for the table to work fine.
I'm having issues when I try to iterate over each sublist in the nested list and convert it to a string, with each number shown to two decimal places and a tab between entries.
the code I've got so far is:
def printTable(cur):
    list2 = makeTable(cur)
    lst1 = Extract(cur)
    lst1.insert(0, "$NZD")
    lne1 = "\t".join(lst1)
    print(lne1)
    list=map(str,list2)
    print(list2)
    for list in list2:
        for elem in list:
            linelem = "\t".join(elem)
            print(linelem)
printTable(cur)
(Note: The first function I call and assign to list2 is what generates the data/nested list)
I've tried playing around a bit but I keep coming up with different error messages trying to convert each sublist to a string.
Thank you all for your help!
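Try this method:
import csv
import pandas as pd

# `a` is the nested list of conversions
with open("out.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(a)
# the file has no header row, so supply the column names via `names`
pd.read_csv("out.csv", header=None, names=["col1", "col2", "col3", "col4", "col5"])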
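If the goal is only to print the table, a pandas round trip may not be needed. The errors in the question most likely come from calling join() on numbers, since join() only accepts strings. Below is a minimal sketch that formats each number first; `format_table` is a hypothetical helper, and every header label other than "$NZD" is a placeholder.

```python
def format_table(rows, header=None):
    """Return the table as tab-separated lines, numbers shown with two decimals."""
    lines = []
    if header:
        lines.append("\t".join(header))
    for row in rows:
        # join() needs strings, so format every number before joining.
        lines.append("\t".join(f"{value:.2f}" for value in row))
    return lines

# First two rows of the nested list from the question.
table = [[10, 9.6, 7.5, 6.7, 4.96], [20, 19.2, 15.0, 13.4, 9.92]]
# Header labels other than "$NZD" are placeholders, not from the question.
for line in format_table(table, header=["$NZD", "cur1", "cur2", "cur3", "cur4"]):
    print(line)
```

Returning the lines instead of printing inside the helper keeps the formatting easy to check on its own.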

Conditionally set a property value on edge when adding a vertex [GREMLIN API]

I'm trying to add a vertex that will be linked to another vertex, with a conditional property value on the edge between them.
So far this is what I came up with:
- this runs with no errors, but I'm not able to get any results.
g.V().has('label', 'product')
.has('id', 'product1')
.outE('has_image')
.has('primary', true)
.inV()
.choose(fold().coalesce(unfold().values('public_url'), constant('x')).is(neq('x')))
.option(true,
addV('image')
.property('description', '')
.property('created_at', '2019-10-31 09:08:15')
.property('updated_at', '2019-10-31 09:08:15')
.property('pk', 'f920a210-fbbd-11e9-bed6-b9a9c92913ef')
.property('path', 'product_images/87wfMABXBodgXL1O4aIf6BcMMG47ueUztjNCkGxP.png')
.V()
.hasLabel('product')
.has('id', 'product1')
.addE('has_image')
.property('primary', false))
.option(false,
addV('image')
.property('description', '')
.property('created_at', '2019-10-31 09:08:15')
.property('updated_at', '2019-10-31 09:08:15')
.property('pk', 'f920a930-fbbd-11e9-b444-8bccc55453b9')
.property('path', 'product_images/87wfMABXBodgXL1O4aIf6BcMMG47ueUztjNCkGxP.png')
.V()
.hasLabel('product')
.has('id', 'product1')
.addE('has_image')
.property('primary', true))
What I'm trying to do here is set the primary property of the newly added edge between the image vertex and the product vertex, depending on whether the product is already connected to an image by an edge with primary set to true.
If a product already has an image with the edge property primary:true, then the newly added image that will be linked to the product should have an edge with the property primary:false.
Seed data for the Azure graph database:
//add product vertex
g.addV('product').property(id, 'product1').property('pk', 'product1')
//add image vertex
g.addV('image').property(id, 'image1').property('public_url', 'url_1').property('pk', 'image1');
//link products to images
g.V('product1').addE('has_image').to(V('image1')).property('primary', true)
I'm surprised that your traversal runs without errors as I hit several syntax problems around your use of option() and some other issues with your mixing of T.id and the property key of "id" (the latter of which might be part of your issue in why this didn't work as-is, but I'm not completely sure). Of course, I didn't test on CosmosDB, so perhaps they took such liberties with the Gremlin language.
Anyway, assuming I have followed your explanation correctly, I think there is a way to vastly simplify your Gremlin. I think you just need this:
g.V('product1').as('p').
addV('image').
property('description', '').
property('created_at', '2019-10-31 09:08:15').
property('updated_at', '2019-10-31 09:08:15').
property('pk', 'f920a210-fbbd-11e9-bed6-b9a9c92913ef').
property('path', 'product_images/87wfMABXBodgXL1O4aIf6BcMMG47ueUztjNCkGxP.png').
addE('has_image').
from('p').
property('primary', choose(select('p').outE('has_image').values('primary').is(true),
constant(false), constant(true)))
I'd say this is the most idiomatic approach in Gremlin. I haven't tested it on CosmosDB, so I can't say whether it will work for you, but the console session below shows that it satisfies your expectations:
gremlin> g.V('product1').as('p').
......1> addV('image').
......2> property('description', '').
......3> property('created_at', '2019-10-31 09:08:15').
......4> property('updated_at', '2019-10-31 09:08:15').
......5> property('pk', 'f920a210-fbbd-11e9-bed6-b9a9c92913ef').
......6> property('path', 'product_images/87wfMABXBodgXL1O4aIf6BcMMG47ueUztjNCkGxP.png').
......7> addE('has_image').
......8> from('p').
......9> property('primary', choose(select('p').outE('has_image').values('primary').is(true), constant(false), constant(true)))
==>e[31][product1-has_image->25]
gremlin> g.E().elementMap()
==>[id:31,label:has_image,IN:[id:25,label:image],OUT:[id:product1,label:product],primary:true]
gremlin> g.V('product1').as('p').
......1> addV('image').
......2> property('description', '').
......3> property('created_at', '2019-10-31 09:08:15').
......4> property('updated_at', '2019-10-31 09:08:15').
......5> property('pk', 'f920a210-fbbd-11e9-bed6-b9a9c92913ef').
......6> property('path', 'product_images/87wfMABXBodgXL1O4aIf6BcMMG47ueUztjNCkGxP.png').
......7> addE('has_image').
......8> from('p').
......9> property('primary', choose(select('p').outE('has_image').values('primary').is(true), constant(false), constant(true)))
==>e[38][product1-has_image->32]
gremlin> g.E().elementMap()
==>[id:38,label:has_image,IN:[id:32,label:image],OUT:[id:product1,label:product],primary:false]
==>[id:31,label:has_image,IN:[id:25,label:image],OUT:[id:product1,label:product],primary:true]
gremlin> g.V('product1').as('p').
......1> addV('image').
......2> property('description', '').
......3> property('created_at', '2019-10-31 09:08:15').
......4> property('updated_at', '2019-10-31 09:08:15').
......5> property('pk', 'f920a210-fbbd-11e9-bed6-b9a9c92913ef').
......6> property('path', 'product_images/87wfMABXBodgXL1O4aIf6BcMMG47ueUztjNCkGxP.png').
......7> addE('has_image').
......8> from('p').
......9> property('primary', choose(select('p').outE('has_image').values('primary').is(true), constant(false), constant(true)))
==>e[45][product1-has_image->39]
gremlin> g.E().elementMap()
==>[id:38,label:has_image,IN:[id:32,label:image],OUT:[id:product1,label:product],primary:false]
==>[id:45,label:has_image,IN:[id:39,label:image],OUT:[id:product1,label:product],primary:false]
==>[id:31,label:has_image,IN:[id:25,label:image],OUT:[id:product1,label:product],primary:true]
If that looks right and this doesn't work properly in CosmosDB, it is because of line 9 which utilizes a Traversal as an argument to property() which isn't yet supported in CosmosDB. The remedy is to simply invert that line a bit:
g.V('product1').as('p').
addV('image').
property('description', '').
property('created_at', '2019-10-31 09:08:15').
property('updated_at', '2019-10-31 09:08:15').
property('pk', 'f920a210-fbbd-11e9-bed6-b9a9c92913ef').
property('path', 'product_images/87wfMABXBodgXL1O4aIf6BcMMG47ueUztjNCkGxP.png').
addE('has_image').
from('p').
choose(select('p').outE('has_image').values('primary').is(true),
property('primary', false),
property('primary', true))
I find this approach only slightly less readable, as the property() doesn't align with the addE(), but it's not a terrible alternative.
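The rule that choose() encodes can also be stated in a few lines of Python, which may help when checking the console output above. This is a sketch of the decision only, not of the Gremlin API; `primary_for_new_edge` is a hypothetical name.

```python
def primary_for_new_edge(existing_primary_flags):
    """Return the 'primary' value for a new has_image edge.

    The new edge is primary only if no existing edge is already primary.
    """
    return not any(existing_primary_flags)

flags = []  # 'primary' values on the product's existing has_image edges
for _ in range(3):
    flags.append(primary_for_new_edge(flags))
print(flags)
```

Adding three images in a row gives one primary edge followed by two non-primary ones, the same sequence as the three elementMap() listings.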

Gremlin: division after groupCount

I am using Gremlin to query Neptune.
I have 2 counts
g.V().hasLabel(*).outE().inV().groupCount().by('name')
result is like : 'a':2, 'b':4
g.V().hasLabel(*).count()
4
How can I write a single query that divides each value in result 1 by result 2, i.e. 'a': 0.5, 'b': 1?
I can think of a few ways, but I guess using match() is the easiest:
g.V().hasLabel(*).
union(count(),
out().groupCount().by('name')).fold().
match(__.as('values').limit(local, 1).as('c'),
__.as('values').tail(local, 1).unfold().as('kv'),
__.as('kv').select(values).math('_/c').as('v')).
group().
by(select('kv').by(keys)).
by(select('v'))
A similar query on the modern graph:
gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V().union(count(),
......1> out().groupCount().by(label)).fold().
......2> match(__.as('values').limit(local, 1).as('c'),
......3> __.as('values').tail(local, 1).unfold().as('kv'),
......4> __.as('kv').select(values).math('_/c').as('v')).
......5> group().
......6> by(select('kv').by(keys)).
......7> by(select('v'))
==>[software:0.6666666666666666,person:0.3333333333333333]
The next one is probably harder to understand, but would be my personal favorite (because a) I don't like match() and b) it doesn't rely on the order of the results returned by union()):
gremlin> g.V().
......1> groupCount('a').
......2> by(constant('c')).
......3> out().
......4> groupCount('b').
......5> by(label).
......6> cap('a','b').as('x').
......7> select('a').select('c').as('c').
......8> select('x').select('b').unfold().
......9> group().
.....10> by(keys).
.....11> by(select(values).math('_/c'))
==>[software:0.6666666666666666,person:0.3333333333333333]
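The arithmetic both traversals perform is simply "each group count divided by one total", which can be checked in Python. This is a sketch of the calculation only; `ratios` is a hypothetical helper, and the counts for the modern graph (6 vertices, out() reaching 4 software and 2 person vertices) are taken from the toy data set.

```python
def ratios(group_counts, total):
    """Divide every group count by a single total."""
    return {key: count / total for key, count in group_counts.items()}

# Counts reached via out() on the modern toy graph, grouped by label,
# and the total number of vertices.
modern = ratios({"software": 4, "person": 2}, 6)
print(modern)

# The example from the question: 'a': 2, 'b': 4 divided by 4.
question = ratios({"a": 2, "b": 4}, 4)
print(question)
```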

Analysing the edges between two vertices gremlin

Here is my graph
g.addV('user').property('id',1).as('1').
addV('user').property('id',2).as('2').
addV('user').property('id',3).as('3').
addE('follow').from('1').to('2').
addE('follow').from('1').to('3').iterate()
Below is my approach when one user wants to follow another. Suppose 2 wants to follow 3.
I first check whether a follow edge exists between 2 and 3:
if(g.V().has(id, 2).outE(follow).inV().has(id, 3).hasNext())
{
//if exists that means he already following him so i'm dropping the follow edge and adding unfollow edge to 2,3.
}
else if(g.V().has(id, 2).outE(unfollow).inV().has(id, 3).hasNext())
{
//if exists he already unfollowed him and he wants to follow him again i'm dropping the unfollow edge and adding the follow edge to 2,3.
}
else
{
// there is no edges between 2,3 so he is following him first so i'm adding follow edge 2,3.
}
The drawback of this approach is that it needs two queries every time, which impacts performance. Can you suggest a better approach?
You can build if-then-else semantics with choose(). A direct translation of your logic there would probably look like this:
gremlin> g.addV('user').property(id,1).as('1').
......1> addV('user').property(id,2).as('2').
......2> addV('user').property(id,3).as('3').
......3> addE('follow').from('1').to('2').
......4> addE('follow').from('1').to('3').iterate()
gremlin> g.V(3).as('target').
......1> V(2).as('source').
......2> choose(outE('follow').aggregate('d1').inV().hasId(3),
......3> sideEffect(addE('unfollow').from('source').to('target').
......4> select('d1').unfold().drop()).constant('unfollowed'),
......5> choose(outE('unfollow').aggregate('d2').inV().hasId(3),
......6> sideEffect(addE('follow').from('source').to('target').
......7> select('d2').unfold().drop()).constant('followed'),
......8> addE('follow').from('source').to('target').constant('followed-first')))
==>followed-first
gremlin> g.E()
==>e[0][1-follow->2]
==>e[1][1-follow->3]
==>e[2][2-follow->3]
gremlin> g.V(3).as('target').
......1> V(2).as('source').
......2> choose(outE('follow').aggregate('d1').inV().hasId(3),
......3> sideEffect(addE('unfollow').from('source').to('target').
......4> select('d1').unfold().drop()).constant('unfollowed'),
......5> choose(outE('unfollow').aggregate('d2').inV().hasId(3),
......6> sideEffect(addE('follow').from('source').to('target').
......7> select('d2').unfold().drop()).constant('followed'),
......8> addE('follow').from('source').to('target').constant('followed-first')))
==>unfollowed
gremlin> g.E()
==>e[0][1-follow->2]
==>e[1][1-follow->3]
==>e[3][2-unfollow->3]
gremlin> g.V(3).as('target').
......1> V(2).as('source').
......2> choose(outE('follow').aggregate('d1').inV().hasId(3),
......3> sideEffect(addE('unfollow').from('source').to('target').
......4> select('d1').unfold().drop()).constant('unfollowed'),
......5> choose(outE('unfollow').aggregate('d2').inV().hasId(3),
......6> sideEffect(addE('follow').from('source').to('target').
......7> select('d2').unfold().drop()).constant('followed'),
......8> addE('follow').from('source').to('target').constant('followed-first')))
==>followed
gremlin> g.E()
==>e[0][1-follow->2]
==>e[1][1-follow->3]
==>e[4][2-follow->3]
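The three branches of the nested choose() form a small state machine, sketched here in Python to make the transitions explicit. It models the edges as a dictionary rather than a graph, so it illustrates the logic only; `toggle_follow` and the dictionary layout are assumptions.

```python
def toggle_follow(edges, source, target):
    """Flip the relationship between source and target and report what happened.

    `edges` maps (source, target) to "follow" or "unfollow"; at most one edge per pair.
    """
    current = edges.get((source, target))
    if current == "follow":
        edges[(source, target)] = "unfollow"
        return "unfollowed"
    if current == "unfollow":
        edges[(source, target)] = "follow"
        return "followed"
    # No edge yet: this is the first follow.
    edges[(source, target)] = "follow"
    return "followed-first"

edges = {(1, 2): "follow", (1, 3): "follow"}
outcomes = [toggle_follow(edges, 2, 3) for _ in range(3)]
print(outcomes)
```

Calling it three times for the pair (2, 3) walks through the same three outcomes as the console session.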

loop through implicit array

I've been stuck on this for a while now. I'm trying to loop through this array so I can perform some calculations, but I cannot figure out how to loop through its values. Any suggestions?
I managed to figure out how to get the collection's structures, but I want to loop through each structure and grab its values as well, and that's what I'm stuck on.
Also, I want to refrain from using cfscript if possible, as I'm still in the early stages of learning ColdFusion.
Here is my code:
<cfset houseStuff = {
Bedroom = [
'Luxury Duvet Set with Alternative Down Comforter',
'Accent Coverlet & Shams',
'Two Sets of Luxurious Liens',
'Mattress Pad',
'Blanket',
'Six Bed Pillows',
'Clock Radio',
'Twenty Hangers'
],
Bathroom = [
'Four Bath Towels',
'Four Hand Towels',
'Four Face Towels',
'Bath Rug',
'Shower Curtain',
'Stainless Tooth Brush Holder & Soap Dish',
'Wastebasket',
'Artwork',
'Hair Dryer',
'Toilet Brush & Plunger'
],
Dining = [
'Dinnerware',
'Place Mats',
'Napkins',
'Flatware',
'Glassware & Wine Glasses'
],
Kitchen = [
'Microwave',
'Cookware',
'Mixing Bowls',
'Baking Dish',
'Colander',
'Stainless Utensil Holder',
'Large Fork',
'Large Spoon',
'Spatula',
'Whisk',
'Measuring Spoon & Cup',
'Carving & Paring Knives',
'Four Steak Knives',
'Cutting Board',
'Salt & Pepper Set',
'Wine Opener',
'Coffee Maker',
'Toaster',
'Electric Can Opener',
'Flatware Tray',
'Kitchen Wastebasket',
'Dish Towels',
'Pot Holders',
'Pitcher',
'10" Non-Stick Frying Pan',
'Cookie Sheet',
'Stainless Steel Electric Tea Kettle',
'3 Piece Non-Metal (Spatula, Spoon, Paste Spoon) Combo'
],
Micellaneous = [
'Iron & Cutting Board',
'Cordless Dual Phone with Digital Answering Machine',
'Broom',
'Dust Pan',
'Vacuum',
'Decor',
'Laundry Basket'
],
StarterKit = [
'Bath Tissue',
'Soap',
'Shampoo & Conditioner',
'Paper Towels',
'Sponge',
'Laundry Soap',
'Dishwasher Detergent',
'Liquid Dish Soap',
'Extra Light Bulbs',
'Coffee',
'Sugar',
'Creamer',
'Bottled Water',
'Oatmeal',
'Breakfast Bars',
'Peanuts',
'Chips',
'Mints',
'Welcome Information'
],
MasterBedroom = [
'Queen bed',
'Headboard',
'Two Nightstands',
'Dresser & Mirrior',
'Two Lamps',
'Artwork',
'LCD Television'
],
LivingRoom = [
'Sofa',
'Chair',
'End Table',
'Coffee Table',
'Lamp',
'LCD TV w/stand',
'DVD Player',
'Artwork'
],
DiningRoom = [
'Dining Table',
'Dining Chairs',
'Artwork'
],
OfficePackage = [
'Desk',
'Chair',
'Lamp'
],
AdditionalBedrooms = [
'Queen or Two Twin Beds',
'Headboard',
'Nightstand',
'Chest of Drawers',
'Lamp',
'Artwork'
]
} />
<cfloop collection="#houseStuff#" item="key">
<cfdump var="#key#"> <br>
<!--- <p style="color:##fff;">#key#:</p> <br /> --->
</cfloop>
Never mind, I finally figured it out. I had to loop through the collection first, then create another loop inside it to loop over each key's array values.
<cfloop collection="#houseStuff#" item="key">
<!---<cfdump var="#houseStuff[key]#"> --->
<cfloop from="1" to="#arrayLen(houseStuff[key])#" index="j">
<!--- j is the array index; look up the value at that position --->
#houseStuff[key][j]#<br>
</cfloop>
</cfloop>
I know you said you'd prefer tags instead of script, but if you are in the learning stages of ColdFusion, I'd still recommend learning how to properly use cfscript. In addition to making your CF a little bit cleaner, it will also make your life a lot easier, especially for things like looping.
Outputting all elements becomes:
<cfscript>
for ( i in houseStuff ) { // loop over the outer Structure
writeOutput(i & ":<br>") ;
for ( j in houseStuff[i] ) { // loop over each inner Array element
writeOutput(j & "<br>") ;
}
writeOutput("<br>");
}
</cfscript>
https://trycf.com/gist/898988f6969a57aa5dece39c42037cfd/acf?theme=monokai
... which, in this context, does get into the philosophical discussion of whether to write output code in tags or script and goes slightly beyond the scope of this question. But I've always been a proponent of learning best-practices at the same time as the basics. Personally, I do tend to follow the tags-for-output view, but for basic looping, the script version is a bit cleaner to me. I'd learn both.
Also check out: http://www.learncfinaweek.com. There's a section in there on Looping with both methods.
