gremlin: limit step vs take step - graph-databases

I am new to gremlin and while referring to this website, I came across take() step. It has the same output as limit() which makes me wonder what is the difference between the two. I am unable to find any clarification regarding this matter. Thanks!

Unfortunately that's a bit confusing. take() is not a Gremlin step. It is instead a Groovy function that is being applied to the end of a traversal (which is itself an Iterator). In much the same way that you can use take() at the end of a traversal, you can use other Groovy functions:
gremlin> g.V().take(1)
==>v[1]
gremlin> g.V().collect{it.value('name')}
==>marko
==>vadas
==>lop
==>josh
==>ripple
==>peter
Of course, once you use a Groovy function to process the pipeline you can't go back to Gremlin steps:
gremlin> g.V().take(1).out()
No signature of method: org.codehaus.groovy.runtime.DefaultGroovyMethods$TakeIterator.out() is applicable for argument types: () values: []
Possible solutions: sum(), sort(), sort(groovy.lang.Closure), sort(java.util.Comparator), count(java.lang.Object), count(groovy.lang.Closure)
Type ':help' or ':h' for help.
Display stack trace? [yN]
which is why you would prefer limit(1):
gremlin> g.V().limit(1).out()
==>v[3]
==>v[2]
==>v[4]
Of course, if you're not using Groovy and are programming in a Java environment then it would be obvious that take() and other such functions aren't going to be available to you.
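To make the distinction concrete outside the Gremlin console, here is a minimal sketch in plain Java with no TinkerPop dependency. MiniTraversal and its methods are hypothetical stand-ins invented for illustration, not the TinkerPop API, and unlike the real limit() step this toy version is eager. The only point it tries to show is the return type: a step returns the traversal type so chaining can continue, while a generic helper like take() returns a bare Iterator.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical toy traversal: an Iterator of vertex ids that also offers fluent "steps".
final class MiniTraversal implements Iterator<String> {
    private final Map<String, List<String>> outEdges; // vertex id -> adjacent vertex ids
    private final Iterator<String> source;

    MiniTraversal(Map<String, List<String>> outEdges, Iterator<String> source) {
        this.outEdges = outEdges;
        this.source = source;
    }

    @Override public boolean hasNext() { return source.hasNext(); }
    @Override public String next() { return source.next(); }

    // A step: the result is still a MiniTraversal, so further steps can follow.
    MiniTraversal limit(int n) {
        return new MiniTraversal(outEdges, take(source, n));
    }

    // Another step: moves from each remaining vertex to its adjacent vertices.
    MiniTraversal out() {
        List<String> adjacent = new ArrayList<>();
        while (source.hasNext()) {
            adjacent.addAll(outEdges.getOrDefault(source.next(), List.of()));
        }
        return new MiniTraversal(outEdges, adjacent.iterator());
    }

    // Drains whatever is left into a list.
    List<String> toList() {
        List<String> all = new ArrayList<>();
        while (source.hasNext()) {
            all.add(source.next());
        }
        return all;
    }

    // Generic helper in the spirit of Groovy's take(): it knows nothing about
    // traversals and returns a plain Iterator, which has no out() method.
    static <T> Iterator<T> take(Iterator<T> it, int n) {
        List<T> kept = new ArrayList<>();
        while (kept.size() < n && it.hasNext()) {
            kept.add(it.next());
        }
        return kept.iterator();
    }
}
```

With this sketch, an expression such as traversal.limit(1).out().toList() compiles, whereas MiniTraversal.take(traversal, 1).out() does not, because the helper's result is only an Iterator.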

The limit() step should be used and is, as of TinkerPop 3.4, the canonical way to iterate a Traversal and retrieve the first n elements.
I can't remember why take() was available on Traversal instances at the time of writing of this article. This sounds a bit odd to me; it could be an Iterator (or comparable) interface leaking, but I'll let maintainers comment on this if they read this question.
You'll be safe with limit().

Related

Implementing Intelligent design sort

This might be a frivolous question, so please bear with me.
After reading this article about Intelligent Design sort (http://www.dangermouse.net/esoteric/intelligentdesignsort.html), which is clearly not meant to be taken seriously, I started wondering whether this could be possible.
An excerpt from article says:
The probability of the original input list being in the exact order it's in is 1/(n!). There is such a small likelihood of this that it's clearly absurd to say that this happened by chance, so it must have been consciously put in that order by an intelligent Sorter.
Let's forget about the intelligent Sorter for a second and think about the possibility that the members of a randomly ordered array are in some way sorted. Our algorithm should determine the pattern without changing the array's structure.
Is there any way to do this? Speed is not a requirement.
The implementation is very easy actually. The entire point of the article is that you don't actually sort anything. In other words, a correct implementation is a simple NOP. As my preferred language is Java, I'll show a simple in-place implementation in Java as a lambda function:
list->{}
Funny article, I had a good laugh.
If the only thing you're interested in is whether your List is sorted, then you could simply keep an internal sorted flag (defaulted to true for an empty list) and override your add() method to check whether the element you're adding fits the ordering of the List - that is, compare it to the adjacent elements and set the sorted flag appropriately.
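A minimal sketch of that flag-keeping idea, assuming ascending order and elements that implement Comparable. The class name OrderTrackingList and the isSorted() accessor are made up for illustration. Only the single-argument add() is overridden here; a fuller version would also have to handle add(index, element), set(), addAll() and removals.

```java
import java.util.ArrayList;

// Hypothetical list that remembers whether its elements were appended in ascending order.
class OrderTrackingList<T extends Comparable<T>> extends ArrayList<T> {
    private boolean sorted = true; // an empty list counts as sorted

    @Override
    public boolean add(T element) {
        // When appending, the only adjacent element is the current last one.
        if (!isEmpty() && get(size() - 1).compareTo(element) > 0) {
            sorted = false;
        }
        return super.add(element);
    }

    public boolean isSorted() {
        return sorted;
    }
}
```

Checking isSorted() is then a constant-time lookup, and the list's contents are never reordered, which matches the requirement in the question.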

Graph-Traversal: How do I query for "friends and friends of friends" using Gremlin

In my graph database I have Branches and Leaves. Branches can "contain" Leaves and Branches can "contain" Branches.
How, using Gremlin, can I find all leaves for a given branch, that are directly or indirectly related to it?
I got this to work in Cypher:
START v=node(1) MATCH v-[:contains*1..2]->i RETURN v,i
Where the *1..2 means "friends and friends of friends".
I thought maybe LoopV was the way forward, but I just get an Exception:
Error reading JArray from JsonReader. Current JsonReader item is not an array: String
You can do the following in Gremlin 1.4+.
g.v(1).out('contains').loop(1){true}{it.out('contains').count() == 0}
This says:
Start at vertex with id 1
Take the outgoing "contains" edges.
Loop over the out('contains') section.
Loop "infinitely" (make sure your tree doesn't have loops in it)
Emit only those vertices touched that don't have more outgoing 'contains'-edges. (i.e. the leaves)
However, looking at what you wanted from Cypher, it looks like you only want 2 steps. Thus, to do that, simply do:
g.v(1).out('contains').loop(1){it.loops < 3}
Perhaps I misunderstood your question --- either way, that should give you enough to play with.
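For readers who want to see what these two queries are after without a Gremlin console, here is a rough plain-Java sketch over an in-memory adjacency map. ContainsGraph, leavesUnder and withinDepth are hypothetical names, and this is not a translation of the loop step's exact semantics. leavesUnder follows the intent of the first query (collect everything below the start vertex that has no outgoing "contains" edges) and withinDepth follows the intent of the Cypher *1..2 pattern.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory graph holding only "contains" edges.
final class ContainsGraph {
    private final Map<Integer, List<Integer>> contains = new HashMap<>();

    void addContains(int parent, int child) {
        contains.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
    }

    private List<Integer> children(int v) {
        return contains.getOrDefault(v, List.of());
    }

    // All vertices below start that have no outgoing "contains" edge.
    // The visited set keeps the walk finite even if the data contains a cycle.
    Set<Integer> leavesUnder(int start) {
        Set<Integer> leaves = new TreeSet<>();
        Set<Integer> visited = new HashSet<>();
        Deque<Integer> queue = new ArrayDeque<>(children(start));
        while (!queue.isEmpty()) {
            int v = queue.poll();
            if (!visited.add(v)) {
                continue; // already handled
            }
            List<Integer> next = children(v);
            if (next.isEmpty()) {
                leaves.add(v);
            } else {
                queue.addAll(next);
            }
        }
        return leaves;
    }

    // All vertices reachable from start through 1..maxDepth "contains" edges.
    Set<Integer> withinDepth(int start, int maxDepth) {
        Set<Integer> found = new TreeSet<>();
        List<Integer> frontier = List.of(start);
        for (int depth = 1; depth <= maxDepth; depth++) {
            List<Integer> next = new ArrayList<>();
            for (int v : frontier) {
                next.addAll(children(v));
            }
            found.addAll(next);
            frontier = next;
        }
        return found;
    }
}
```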

OWL2 RL via RETE algorithm

I am currently trying to implement OWL2 RL via the Rete algorithm. I have run into the following issue: how do I implement the lists needed, for example, in this rule: eq-diff2 (W3C recommendation)?
Thanks.
I have developed this solution.
Before inference, construct the lists in memory. This is simple, because the elements can be easily identified.
Construct RETE nodes for first m rules, which don't need "loop" construct
Put an action in the last node:
Add new Rete (alpha+beta) nodes for the corresponding list (you will always know which, because it's one of the "static" rules)
Put corresponding WMEs into newly created alpha memories
Activate Beta nodes
It is probably possible to remove the whole "dynamic" branch after the final action is performed.

How to write Haskell array strategies

I want to write a strategy to evaluate items in an array in parallel. The old strategies had parArr to do this (see here). But this is not found in the new Control.Parallel.Strategies module.
E.g.
parallel list evaluation: map f myList `using` parList rdeepseq
I would want to be able to do something like: amap f myArr `using` parArr rdeepseq, where amap is from Data.Array.Base and applies a function to each of the elements (sequentially).
The following seems to work but I wonder if it is doing it right, and want to know how I could define my own parArr.
This works: amap ((+1) `using` rpar) $ Array.array (0,4) [(0,10),(1,20),(2,30),(3,40),(4,50)]
For a previous question, I wrote a parallel evaluation strategy for the vector package. That should be a good place to start. You can see the code on hackage in the vector-strategies package.
I don't have time to give a full answer - perhaps I'll edit this later. Feel free to comment with extra questions and direction.
Apart from all the good advice given: The reason that there is no parArr anymore is simply that it has been replaced by the more general parTraversable. Just say:
amap f myArr `using` parTraversable rdeepseq
That should give you the behavior you asked for.

MD5 code kata and BDD

I was thinking to implement MD5 as a code kata and wanted to use BDD to drive the design (I am a BDD newb).
However, the only test I can think of starting with is to pass in an empty string, and the simplest thing that will work is embedding the hash in my program and returning that.
The logical extension of this is that I end up embedding the hash in my solution for every test and switching on the input to decide what to return. Which of course will not result in a working MD5 program.
One of my difficulties is that there should only be one public function:
public static string MD5(byte[] input)
And I don't see how to test the internals.
Is my approach completely flawed or is MD5 unsuitable for BDD?
I believe you chose a pretty hard exercise for a BDD code-kata. The thing about code-kata, or what I've understood about it so far, is that you somehow have to see the problem in small incremental steps, so that you can perform these steps in red, green, refactor iterations.
For example, an exercise of finding an element position inside an array, might be like this:
If array is empty, then position is 0, no matter the needle element
Write test. Implementation. Refactor
If array is not empty, and element does not exist, position is -1
Write test. Implementation. Refactor
If array is not empty, and element is the first in list, position is 1
Write test. Implementation. Refactor
I don't really see how to break the MD5 algorithm into steps of that kind. But that may be because I'm not really an algorithm guy. If you better understand the steps involved in the MD5 algorithm, then you may have a better chance.
It depends on what you mean by unsuitable... :-) It is suitable if you want to document a few examples that describe your implementation. It should also be possible to have the algorithm emerge from your specification if you add one more character for each test.
By just adding a switch statement you're just trying to "cheat the system". Using BDD/TDD does not mean you have to implement stupid things. Also the fact that you have hardcoded hash values as well as a switch statement in your code are clear code smells and should be refactored and removed. That is how your algorithm should emerge because when you see the hard coded values you first remove them (by calculating the value) and then you see that they are all the same so you remove the switch statement.
Also, if your question is about finding good katas, I would recommend looking in the Kata catalogue.
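As a rough illustration of "documenting a few examples", the sketch below writes three of the test vectors from the MD5 specification (RFC 1321) as executable checks in Java. The names Md5Examples, md5Hex and check are hypothetical. The body of md5Hex delegates to the JDK's MessageDigest purely so the sketch runs; in the kata that body is the part you would replace with your own implementation and grow test by test.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Md5Examples {
    // Stand-in for the kata's single public function.
    // Assumption: here it simply calls the JDK so the examples can be executed.
    static String md5Hex(byte[] input) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(input);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // One example of the specification: this input must hash to this value.
    static void check(String input, String expectedHex) {
        String actual = md5Hex(input.getBytes(StandardCharsets.US_ASCII));
        if (!actual.equals(expectedHex)) {
            throw new AssertionError("MD5(\"" + input + "\") was " + actual);
        }
    }

    public static void main(String[] args) {
        // Test vectors taken from the RFC 1321 test suite.
        check("", "d41d8cd98f00b204e9800998ecf8427e");
        check("a", "0cc175b9c0f1b6a831c399e269772661");
        check("abc", "900150983cd24fb0d6963f7d28e17f72");
        System.out.println("all examples pass");
    }
}
```

Each check is one example of the behaviour. Hard-coding the expected hashes in the tests is fine; it is hard-coding them in the implementation that the answer above calls a smell.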

Resources