How to get paths that are self-loops using GremlinPipeline? - graph-databases

I'm working with a network that allows self-loops (i.e., some edges have the same vertex as both head and tail). Suppose the graph g has 3 vertices (adam, bill, and cid) and 3 edges of type reports ([adam-reports->bill], [bill-reports->cid], and [adam-reports->adam]), the last being the only reflexive edge in this example.
gremlin> g = new TinkerGraph();
gremlin> adam = g.addVertex('adam');
gremlin> bill = g.addVertex('bill');
gremlin> cid = g.addVertex('cid');
gremlin> g.addEdge(adam, bill, 'reports');
gremlin> g.addEdge(bill, cid, 'reports');
gremlin> g.addEdge(adam, adam, 'reports');
In gremlin, the self-loop(s) can be easily retrieved, thus:
gremlin> g.V.sideEffect{v=it}.outE('reports').inV.filter{it==v}.path
==>[v[adam], e[2][adam-reports->adam], v[adam]]
However, I'm trying to do the same using GremlinPipeline in Java without success. How can I build a valid GremlinPipeline to do the above?
GremlinPipeline pp = new GremlinPipeline();
// Add various pipes to pp to get a valid pipeline
pp.setStarts(g.getVertices());

If you only want to find self-loop edges, do this:
g.E.filter{it.inV == it.outV}
Given your sample TinkerGraph above, the output is:
gremlin> g.E.filter{it.inV == it.outV}
==>e[2][adam-reports->adam]
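For readers who want the filter spelled out away from Gremlin, here is a minimal Python sketch of the same check on a plain edge list. The tuple layout and the helper names (self_loops, loop_paths) are made up for illustration, and the ids 0 and 1 for the first two edges are assumed; only e[2] appears in the output above.

```python
# Hypothetical in-memory copy of the sample graph: (edge id, tail, label, head).
# Edge ids 0 and 1 are assumed; the console output above only confirms id 2.
edges = [
    (0, "adam", "reports", "bill"),
    (1, "bill", "reports", "cid"),
    (2, "adam", "reports", "adam"),
]

def self_loops(edge_list):
    """Return the edges whose tail and head are the same vertex."""
    return [edge for edge in edge_list if edge[1] == edge[3]]

def loop_paths(edge_list):
    """Return each self-loop as a [vertex, edge id, vertex] path."""
    return [[tail, edge_id, head]
            for edge_id, tail, _label, head in self_loops(edge_list)]

print(self_loops(edges))
print(loop_paths(edges))
```

The first function mirrors the edge filter in this answer; the second mirrors the vertex-edge-vertex path from the question.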

Related

Tfidf with a custom list

I have a list of raw strings that look like this;
listtocheck = ['fadsfsfgblahsdfgsfg','adfaghelloggfg','gagfghellosdfhere','blahsgsdfgsdfhellohsdfhgshstring']
and I want to perform TfIdf with these and a list of items I have in a list (not itself).
mylist = ['blah','hello','here','string']
This list I am vectorising as such;
from sklearn.feature_extraction.text import TfidfVectorizer
tf = TfidfVectorizer(analyzer = 'char_wb', ngram_range=(2,3))
listvec = tf.fit_transform(mylist)
This gives me the tfidf of the things in mylist. What I would like to be able to do is count how many times the n-grams from mylist appear in each item of listtocheck, and then compute TfIdf based on the total number of times each n-gram appears across all of the strings in listtocheck.
In order to achieve this I had to first .fit() on mylist but then .transform() on listtocheck.
Here is the code I used in the end:
from sklearn.feature_extraction.text import TfidfVectorizer
def create_vec(listtocheck,mylist):
    tf = TfidfVectorizer(analyzer = 'char_wb',ngram_range=(2,3))
    tf.fit(mylist)
    X = tf.transform(listtocheck)
    return X
vecs = create_vec(listtocheck, mylist)
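To see why fitting on mylist and transforming listtocheck gives the desired columns, here is a rough pure-Python sketch of the same split. It is not how TfidfVectorizer is implemented: it uses plain character n-grams without the word-boundary padding of char_wb, and it returns raw counts with no IDF weighting. The helper names (char_ngrams, fit_vocabulary, count_against) are made up for illustration.

```python
from collections import Counter

def char_ngrams(text, n_min=2, n_max=3):
    """Yield every character n-gram of text with n_min <= n <= n_max."""
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            yield text[i:i + n]

def fit_vocabulary(terms):
    """Collect the n-grams seen in the reference terms (the 'fit' step)."""
    return sorted({gram for term in terms for gram in char_ngrams(term)})

def count_against(vocabulary, documents):
    """Count only vocabulary n-grams in each document (the 'transform' step)."""
    known = set(vocabulary)
    rows = []
    for doc in documents:
        counts = Counter(gram for gram in char_ngrams(doc) if gram in known)
        rows.append([counts[gram] for gram in vocabulary])
    return rows

mylist = ['blah', 'hello', 'here', 'string']
listtocheck = ['fadsfsfgblahsdfgsfg', 'adfaghelloggfg']

vocabulary = fit_vocabulary(mylist)             # columns come from mylist only
rows = count_against(vocabulary, listtocheck)   # one row per checked string
print(len(vocabulary), len(rows))
```

The point of the sketch is that n-grams that occur only in listtocheck never become columns, which is the effect of calling .fit() on one list and .transform() on the other.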

How to "join" vertices and the count of their edges as a 'property' of those vertices in JanusGraph or Gremlin?

I need to return the 'posts' vertices, but those posts have 'like' edges. How can I return the count of 'like' edges for each post as a property of that vertex, like this:
{ title: 'lorem ipsum.....',
content: 'yadayadayada',
likes: 6 <----
}
Using TinkerPop's modern toy graph as an example, you could do something like this:
gremlin> g.V().as('a').
......1> map(outE('created').count()).as('count').
......2> select('a','count').by(valueMap()).by()
==>[a:[name:[marko],age:[29]],count:1]
==>[a:[name:[vadas],age:[27]],count:0]
==>[a:[name:[lop],lang:[java]],count:0]
==>[a:[name:[josh],age:[32]],count:2]
==>[a:[name:[ripple],lang:[java]],count:0]
==>[a:[name:[peter],age:[35]],count:1]
It returns the properties of the vertices in "a" and the count of "created" edges. You might also choose to use project():
gremlin> g.V().
......1> project('a','knows','created').
......2> by(valueMap()).
......3> by(outE('knows').count()).
......4> by(outE('created').count())
==>[a:[name:[marko],age:[29]],knows:2,created:1]
==>[a:[name:[vadas],age:[27]],knows:0,created:0]
==>[a:[name:[lop],lang:[java]],knows:0,created:0]
==>[a:[name:[josh],age:[32]],knows:0,created:2]
==>[a:[name:[ripple],lang:[java]],knows:0,created:0]
==>[a:[name:[peter],age:[35]],knows:0,created:1]
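If the result needs the flat shape from the question (the properties plus a likes key), the same idea can be sketched in Python on plain data. Everything here is hypothetical: the posts dictionary, the like pairs, and the helper name posts_with_like_counts are stand-ins, not JanusGraph or TinkerPop API.

```python
from collections import Counter

# Hypothetical data: post properties keyed by post id, and (user, post) like edges.
posts = {
    "p1": {"title": "lorem ipsum", "content": "yadayadayada"},
    "p2": {"title": "another post", "content": "no likes yet"},
}
likes = [("u1", "p1"), ("u2", "p1")]

def posts_with_like_counts(posts, likes):
    """Return each post's properties plus a 'likes' key holding its edge count."""
    counts = Counter(post_id for _user, post_id in likes)
    # Counter returns 0 for posts that have no like edges.
    return [dict(props, likes=counts[post_id]) for post_id, props in posts.items()]

result = posts_with_like_counts(posts, likes)
print(result)
```

As in the project() query above, posts without any like edges still appear, with a count of 0.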

Apex - Retrieving Records from a type of Map<SObject, List<SObject>>

I am using a lead map where the key is an Account Id and the value is a list of the Ids of the leads linked to that account: Map<Id, List<Id>> leadMap = new Map<Id, List<Id>>();
My question is as follows: knowing a Lead's Id, how do I get the related Account's Id from the map? My code looks something like this; the problem is on the commented-out line.
for (Lead l : leads){
    Lead newLead = new Lead(id=l.id);
    if (l.Company != null) {
        // newLead.Account__c = leadMap.keySet().get(l.id);
        leads_to_update.add(newLead);
    }
}
You could put each lead Id and its company into a map in the trigger, then look up the company by lead Id:
Map<string,string> LeadAccountMapping = new Map<string,string>();//key is Lead id ,Company id
for(Lead l:trigger.new)
{
    LeadAccountMapping.put(l.id,l.Company);
}
//put the code you want to get the company id
string companyid= LeadAccountMapping.get(l.id);
Let me make sure I understand your problem.
Currently you have a map that uses the Account ID as the key to a value of a List of Lead IDs, so the map is Account ID -> List of Lead IDs. Correct?
Your goal is to go from Lead ID to the Account ID.
If this is correct, then you are in a bad way, because your current structure requires a very slow, iterative search. The correct code would look like this (replace your commented line with this code):
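for( ID actId : leadMap.keySet() ) {
    for( ID leadID : leadMap.get( actId ) ) {
        if( newLead.id == leadID ) {
            newLead.Account__c = actId;
            // your loop already calls leads_to_update.add(newLead) right after this,
            // so it is not repeated here (adding the same lead twice would fail on update)
            break;
        }
    }
}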
I don't like this solution because it requires iterating over a Map and then over each of the lists in each of the values. It is slow.
If this isn't bulkified code, you could do a Select Query and get the Account__c value from the existing Lead by doing:
newLead.Account__c = [ SELECT Account__c FROM Lead WHERE Id = :l.id LIMIT 1].Account__c;
However, this relies on your code not looping over this line and hitting a governor limit.
Or you could rewrite your code so that your Map is actually:
Map<ID, List<Lead>> leadMap = new Map<ID, List<Lead>>();
Then in your query where you build the map you ensure that your Lead also includes the Account__c field.
Any of these options should work; it all depends on how and where this code snippet is being executed.
Good luck!
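One more way to avoid the nested search, sketched here in Python rather than Apex: invert the map once into a lead-to-account index, then look each lead up directly. The ids and the helper name build_lead_to_account are made up for illustration, and the sketch assumes each lead belongs to at most one account.

```python
def build_lead_to_account(lead_map):
    """Invert {account_id: [lead_id, ...]} into {lead_id: account_id}."""
    lead_to_account = {}
    for account_id, lead_ids in lead_map.items():
        for lead_id in lead_ids:
            # Assumes a lead is listed under one account only; a later
            # account would otherwise overwrite an earlier one.
            lead_to_account[lead_id] = account_id
    return lead_to_account

# Hypothetical ids, for illustration only.
lead_map = {"001A": ["00Q1", "00Q2"], "001B": ["00Q3"]}
lead_to_account = build_lead_to_account(lead_map)

print(lead_to_account.get("00Q3"))  # 001B
print(lead_to_account.get("00Q9"))  # None, because the lead is not in any list
```

The inversion still walks every list once, but it does so a single time instead of once per lead, which matters when the outer loop runs over many leads.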

Solr query to find one letter without other letter around

I have documents already indexed in Solr. I want to find the producer and model of a tire.
I have a file with producers and models like this:
Nokian;WR G2 SUV
Nokian;WR SUV
Nokian;V
Query:
((productname:"NOKIAN" OR producer:"NOKIAN") AND (productname:"V" OR description:"V" OR referencenumber:"V"))
But it also finds, for example, this:
"2X NOKIAN 215/55 R17 94V LINE (3)"
That is because in this product the speed index is V, while the model is Line. My algorithm assigns this product to Nokian;V instead of Nokian;Line.
How can I ask Solr to return only products where this V does not have any other letters around it? For example:
LETNIE 225/45/17 94V NOKIAN V FINLAND - PŁOTY
This one is found correctly. It is Nokian;V.
As far as I understand your question, you need to put the MUST operator (+) before each boolean clause, so the query will look like:
(
+(productname:"NOKIAN" OR producer:"NOKIAN") AND
+(productname:"V" OR description:"V" OR referencenumber:"V")
)
If your productname field is of type text it has the WordDelimiterFilter in the analysis chain. One of the default behaviors of this filter is to split terms on letter-number boundaries causing:
2X NOKIAN 215/55 R17 94V LINE (3)
to generate the following tokens:
2 X NOKIAN 215 55 R 17 94 V LINE 3
(which matches the "V" in your query).
You can always run debug=results to get an explanation for why something matches. I think in this particular case, you might construct another field type for your productname field that analyzes your model string less aggressively.
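The effect of that splitting can be approximated in a few lines of Python. This is only a rough imitation using a regular expression, not the actual WordDelimiterFilter (which has many more options); the function names are made up for illustration.

```python
import re

def split_on_letter_digit_boundaries(text):
    """Split into runs of ASCII letters or digits, dropping punctuation."""
    return re.findall(r"[A-Za-z]+|[0-9]+", text)

def split_on_whitespace_only(text):
    """A less aggressive split that keeps tokens such as 94V intact."""
    return text.split()

name = "2X NOKIAN 215/55 R17 94V LINE (3)"

aggressive = split_on_letter_digit_boundaries(name)
gentle = split_on_whitespace_only(name)

print(aggressive)         # same tokens as listed above, 'V' among them
print("V" in aggressive)  # True: the speed index now matches a search for V
print("V" in gentle)      # False: 94V stays one token
```

With the whitespace-only split the speed index stays attached to its number, which is the kind of behavior a less aggressive field type for productname would aim for.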
I solved the problem by sorting the brand/model Dictionary with my own comparer.
public class MyComparer : IComparer<string>
{
    int IComparer<string>.Compare(string x, string y)
    {
        if (x == y)
        {
            return 0;
        }
        if (x.Contains(y))
        {
            return -1;
        }
        else
        {
            return 1;
        }
    }
}
All models that are V or H are now at the end of the Dictionary. It works very well, because Solr first searches for Nokian;Line, and the products found are added to a separate list, alreadyFound, and skipped when the later models are searched. Thanks all for your replies.

Graph DB: Sort product based on likes

I have a product vertex which has incoming like edges.
User ------- likes ----------->products
In my search results I want to sort the products by likes. How can this be done?
Just use groupCount:
gremlin> g = new TinkerGraph()
==>tinkergraph[vertices:0 edges:0]
gremlin> user1 = g.addVertex('u1')
==>v[u1]
gremlin> user2 = g.addVertex('u2')
==>v[u2]
gremlin> product1 = g.addVertex('p1')
==>v[p1]
gremlin> product2 = g.addVertex('p2')
==>v[p2]
gremlin> product3 = g.addVertex('p3')
==>v[p3]
gremlin> user1.addEdge('like',product1)
==>e[0][u1-like->p1]
gremlin> user1.addEdge('like',product2)
==>e[1][u1-like->p2]
gremlin> user2.addEdge('like',product2)
==>e[2][u2-like->p2]
gremlin> user2.addEdge('like',product3)
==>e[3][u2-like->p3]
gremlin> g.v('u1','u2').out('like').groupCount().sort{-it.value}
Cannot invoke method negative() on null object
Display stack trace? [yN] n
gremlin> g.v('u1','u2').out('like').groupCount().cap.next().sort{-it.value}
==>v[p2]=2
==>v[p1]=1
==>v[p3]=1
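The same count-then-sort step can be sketched in Python with collections.Counter, using the four like edges from the session above as (user, product) pairs. The pair representation and the helper name products_by_likes are assumptions for illustration.

```python
from collections import Counter

# The like edges from the console session above, as (user, product) pairs.
likes = [("u1", "p1"), ("u1", "p2"), ("u2", "p2"), ("u2", "p3")]

def products_by_likes(like_edges):
    """Count likes per product and return (product, count) pairs, most liked first."""
    counts = Counter(product for _user, product in like_edges)
    return counts.most_common()

ranking = products_by_likes(likes)
print(ranking)
```

Products with equal counts keep the order in which they were first seen, so p1 comes before p3 here.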

Resources