Recursive query in GRAQL? - graph-databases

Is there a way to define a recursive query in GRAQL, i.e. to match a pattern where the exact predicate path between entities is unknown (for example, where the number of hops is not known in advance)?
SPARQL added support for these in version 1.1. Example from the Apache Jena documentation:
# Find the types of :x, following subClassOf
SELECT *
{
:x rdf:type/rdfs:subClassOf* ?t
}
CYPHER has also allowed them since its inception. Example:
MATCH (alice:Person { name:"Alice" })-[:friend *1..2]->(friend:Person)
RETURN friend.name;
Is it possible to do something similar in GRAQL?

It is possible to achieve this in Graql, using Grakn's reasoning engine.
Graql match queries don't support a looping query syntax (yet, but it is planned), but you can define recursive logic in Grakn using a rule. To achieve recursion, a rule's when must contain something of the same type as is inferred in that rule's then.
In Graql the friend example goes as follows. This example doesn't use recursion, as we are only looking for friends 1 or 2 hops away.
First you need a schema:
define
name sub attribute, value string;
person sub entity,
has name,
plays friend;
friendship sub relation,
relates friend;
If this is your starting schema, you'll need to extend it as follows to add a new friend-of-a-friend relation and the rule that infers it:
define
friend-of-a-friend sub relation,
relates connected-friend;
person sub entity,
plays connected-friend;
infer-friend-of-a-friend sub rule,
when {
(friend: $p1, friend: $p2) isa friendship;
(friend: $p2, friend: $p3) isa friendship;
}, then {
(connected-friend: $p1, connected-friend: $p3) isa friend-of-a-friend;
};
Then you can query for friends connected by one or two friendship relations as:
match $p1 isa person, has name "John";
$p2 isa person, has name $n;
{ (connected-friend: $p1, connected-friend: $p2) isa friend-of-a-friend; }
or { (friend: $p1, friend: $p2) isa friendship; };
get $n;
This friend example isn't recursive, but it can be extended to be recursive. What Grakn cannot currently do is limit the number of times to loop.
We can see a great example of recursion in the subClassOf example though:
define
class sub entity,
plays subclass,
plays superclass;
class-hierarchy sub relation,
relates subclass,
relates superclass;
class-hierarchy-is-recursive sub rule,
when {
(subclass: $c1, superclass: $c2) isa class-hierarchy;
(subclass: $c2, superclass: $c3) isa class-hierarchy;
}, then {
(subclass: $c1, superclass: $c3) isa class-hierarchy;
};
Then match to find all subclasses of x:
match $x isa class;
$y isa class;
(superclass: $x, subclass: $y) isa class-hierarchy;
get $y;
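To make the effect of the recursive rule concrete, here is a minimal Python sketch of what it means: keep applying "(a, b) and (b, c) gives (a, c)" until nothing new appears. This only models the rule's logic and is not how Grakn evaluates rules internally; the class names and the transitive_closure helper are made up for illustration.

```python
def transitive_closure(pairs):
    """Apply the rule (a, b), (b, c) => (a, c) until no new pair is inferred."""
    closure = set(pairs)
    while True:
        inferred = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if inferred <= closure:
            return closure
        closure |= inferred


# Hypothetical asserted data: (subclass, superclass) pairs.
asserted = {("poodle", "dog"), ("dog", "mammal"), ("mammal", "animal")}
hierarchy = transitive_closure(asserted)

# Equivalent of the match query above with $x fixed to "animal".
subclasses_of_animal = sorted(sub for (sub, sup) in hierarchy if sup == "animal")
print(subclasses_of_animal)
```

Because the inferred pairs are fed back into the same set, the chain can be of any length, which is exactly what the rule gains by inferring the same relation type it matches on.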

Related

Incorporating fuzzy search in a matcher object

My task is querying medical texts for institute names using a rule as below:
[{'ENT_TYPE': 'institute_name'}, {'TEXT': 'Hospital'}]
The rule will only identify a match if both terms are present. Thus, it will accept "Mount Sinai Hospital", but not "Mount Sinai". I've tried spaczz, which wraps spaCy and works great for a single term or phrase. However, neither spaCy nor spaczz allows a fuzzy multi-word rule with more than one typo, as in "Moung Sinai Mospital."
Therefore, I'm trying to rewrite the Matcher object by incorporating a fuzzy similarity algorithm such as RapidFuzz, but I'm having some difficulty with its Cython component.
The Matcher class's call method finds all token sequences matching the supplied patterns on doclike (the document or Span to match over; type Doc/Span) and returns a list of (match_id, start, end) tuples describing the matches:
matches = find_matches(&self.patterns[0], self.patterns.size(), doclike, length,
                       extensions=self._extensions, predicates=self._extra_predicates)
for i, (key, start, end) in enumerate(matches):
    on_match = self._callbacks.get(key, None)
    if on_match is not None:
        on_ma
return matches
find_matches is a Cython function that takes a compiled array of patterns and returns the matches in a doc as a list of (id, start, end) tuples. Its main loop seems to match the doc against the pre-defined patterns:
# Main loop
cdef int nr_predicate = len(predicates)
for i in range(length):
    for j in range(n):
        states.push_back(PatternStateC(patterns[j], i, 0))
    transition_states(states, matches, predicate_cache,
        doclike[i], extra_attr_values, predicates)
    extra_attr_values += nr_extra_attr
    predicate_cache += len(predicates)
Can you help me locate the actual matching operation (pattern against string) in the Python/C-level objects? I hope to be able to extend this operation with the fuzzy matching algorithm. You can find the code for the Matcher class, the call method and the find_matches function here.
You can follow a more pythonic effort to achieve this goal by spaczz here.
I think the easiest way would be to add an additional predicate type called something like FUZZY. Look at how the regex, set, and comparison predicates are defined and do something similar for FUZZY with your custom code to match on strings with small edit differences:
https://github.com/explosion/spaCy/blob/master/spacy/matcher/matcher.pyx#L687-L781
The predicate classes are standard python classes, no cython required. You'll also need to add the predicate to the schema in spacy/matcher/_schemas.py.
Remember that like the rest of the Matcher predicates, it matches over tokens, so your definition of fuzziness will have to be at the token level.
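As a rough illustration of the token-level comparison such a FUZZY predicate would perform, here is a standalone sketch. It is not spaCy code: the FuzzyPredicate class, its constructor arguments and the 0.8 threshold are assumptions, and it uses the standard library's difflib instead of RapidFuzz so that it runs without extra dependencies. spaCy's real predicate classes have their own constructor signature and are called with a Token, which this sketch does not reproduce.

```python
from difflib import SequenceMatcher


class FuzzyPredicate:
    """Hypothetical token-level predicate: true when a text is close to a target."""

    def __init__(self, target, threshold=0.8):
        self.target = target.lower()
        self.threshold = threshold  # assumed cut-off, tune for your data

    def __call__(self, text):
        # ratio() is 2 * matches / total length, between 0.0 and 1.0.
        ratio = SequenceMatcher(None, self.target, text.lower()).ratio()
        return ratio >= self.threshold


is_hospital = FuzzyPredicate("Hospital")
print(is_hospital("Mospital"), is_hospital("Hotel"))
```

Since the comparison is made one token at a time, "Moung Sinai Mospital" would be handled by giving each token in the pattern its own fuzzy check rather than by comparing the whole phrase.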

What type of reasoning does Grakn support?

I attended a webinar and learned that Grakn supports two kinds of reasoning, type-based and rule-based:
person sub entity;
man sub person;
when {
$r1 (located: $x, locating: $y) isa locates;
$r2 (located: $y, locating: $z) isa locates;
},
then {
(located: $x, locating: $z) isa locates;
};
How does backward reasoning differ from forward chaining in this context?
It's easiest to see the difference from the kind of questions you can ask for forward and backward chaining.
Grakn, backward chaining
In Grakn, given this rule, if you query:
(1)
match (located: $x, locating: $z) isa locates; get;
Then Grakn will see from your query that there is a rule that can be used to infer this kind of fact. It then works backward to see if there are any results for the when of the rule. Simplifying things somewhat, it makes a query as per the when:
(2)
match
$r1 (located: $x, locating: $y) isa locates;
$r2 (located: $y, locating: $z) isa locates;
get;
If there are results, then the then is inferred by Grakn and you get an answer to your original query (1).
Backward chaining answers the question, "Is this fact true?", using inference to determine this.
Forward Chaining
Forward chaining answers a different question. It says, "I have these facts, what are all of the things that can be inferred from them?". You can use this to also answer the backward chaining question, however it will be much less efficient as forward chaining will infer unneeded facts.
A nice summary from Wikipedia's Forward Chaining article:
Because the data determines which rules are selected and used, this
method is called data-driven, in contrast to goal-driven backward
chaining inference.
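The contrast can be sketched in a few lines of Python using the locates rule from the question. This is a toy model under stated assumptions, not Grakn's engine: the place names are invented, facts are plain (located, locating) tuples, and the backward version takes the first premise from asserted facts only, which is sufficient for a transitive rule like this one.

```python
def forward_chain(facts):
    """Data-driven: derive every (located, locating) pair that can be inferred."""
    known = set(facts)
    while True:
        new = {(x, z) for (x, y1) in known for (y2, z) in known if y1 == y2} - known
        if not new:
            return known
        known |= new


def backward_chain(facts, goal, seen=None):
    """Goal-driven: decide one (located, locating) pair, exploring only what it needs."""
    seen = set() if seen is None else seen
    if goal in facts:
        return True
    if goal in seen:  # already explored, avoid looping on cyclic data
        return False
    seen.add(goal)
    x, z = goal
    # The rule's when: some y with (x, y) asserted and (y, z) provable.
    return any(backward_chain(facts, (y, z), seen) for (x2, y) in facts if x2 == x)


# Hypothetical asserted facts.
facts = {("london", "england"), ("england", "uk"), ("uk", "europe"), ("paris", "france")}

all_facts = forward_chain(facts)                             # everything inferable
is_true = backward_chain(facts, ("london", "europe"))        # one specific question
is_false = backward_chain(facts, ("paris", "europe"))
print(len(all_facts), is_true, is_false)
```

The forward version materialises facts nobody asked about (such as where england is located), while the backward version only touches the pairs on the way to the goal.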

OWLAPI slow calculating disjoint axioms

My goal: find the disjoint axioms (asserted and inferred) in an ontology which contains around 5000 axioms.
My code:
for (OWLClass clazz1 : ontology.getClassesInSignature()) {
    for (OWLClass clazz2 : ontology.getClassesInSignature()) {
        OWLAxiom axiom = MyModel.factory.getOWLDisjointClassesAxiom(clazz2, clazz1);
        if (!ontology.containsAxiom(axiom) && reasoner.isEntailed(axiom)) {
            System.out.println(clazz2.toString() + " disjoint with " + clazz1.toString());
        }
    }
}
The problem: execution is extremely slow; it seems to never finish. Even if I reduce the number of comparisons with some if statements, the situation is still the same.
Protege seems to compute those inferred axioms very quickly, and it's based on the same API I am using (OWLAPI). So, is my approach wrong?
Profiling the code will very likely reveal that the slow part is
reasoner.isEntailed(axiom)
This form requires the reasoner to recompute entailments for each class pair, including the pairs where clazz1 and clazz2 are equal (you might want to skip that).
Alternatively, you can iterate through the classes in signature once and use the reasoner to get all disjoint classes:
Set<OWLClass> visited = new HashSet<>();
for (OWLClass c : ontology.getClassesInSignature()) {
    if (visited.add(c)) {
        NodeSet<OWLClass> set = reasoner.getDisjointClasses(c);
        for (Node<OWLClass> node : set.getNodes()) {
            System.out.println("Disjoint with " + c + ": " + node);
            visited.addAll(node.getEntities());
        }
    }
}
Worst case scenario, this will make one reasoner call per class (because no class is disjoint). Best case scenario, all classes are disjoint or equivalent to another class, so only one reasoner call is required.

swrlx:makeOWLThing is creating only one individual

Using Protege and the SWRL tab, I have the ontology shown below. It is composed of the class Test and the class Shadow, where Test has three individuals t1, t2, t3. I was trying to define a SWRL rule that creates an individual of the Shadow class for each existing individual of Test. The rule is
Test(?x) ^ swrlx:makeOWLThing(?new, ?x) -> Shadow(?new)
QUESTIONS:
Why is only one individual of Shadow, named fred, created instead of three (corresponding to t1, t2, t3)?
How can I control the naming of the resulting individual, which is always named fred?
Prefix(:=<http://www.semanticweb.org/hilal/ontologies/2016/5/untitled-ontology-58#>)
Prefix(owl:=<http://www.w3.org/2002/07/owl#>)
Prefix(rdf:=<http://www.w3.org/1999/02/22-rdf-syntax-ns#>)
Prefix(xml:=<http://www.w3.org/XML/1998/namespace>)
Prefix(xsd:=<http://www.w3.org/2001/XMLSchema#>)
Prefix(rdfs:=<http://www.w3.org/2000/01/rdf-schema#>)
Ontology(<http://www.semanticweb.org/hilal/ontologies/2016/5/untitled-ontology-58>
Declaration(Class(:Shadow))
Declaration(Class(:Test))
Declaration(NamedIndividual(:t1))
Declaration(NamedIndividual(:t2))
Declaration(NamedIndividual(:t3))
Declaration(AnnotationProperty(<http://swrl.stanford.edu/ontologies/3.3/swrla.owl#isRuleEnabled>))
############################
# Named Individuals
############################
# Individual: :t1 (:t1)
ClassAssertion(:Test :t1)
# Individual: :t2 (:t2)
ClassAssertion(:Test :t2)
# Individual: :t3 (:t3)
ClassAssertion(:Test :t3)
DLSafeRule(Annotation(<http://swrl.stanford.edu/ontologies/3.3/swrla.owl#isRuleEnabled> "true"^^xsd:boolean) Annotation(rdfs:comment ""^^xsd:string) Annotation(rdfs:label "S1"^^xsd:string) Body(BuiltInAtom(<http://swrl.stanford.edu/ontologies/built-ins/3.3/swrlx.owl#makeOWLThing> Variable(<new>) Variable(<x>)) ClassAtom(:Test Variable(<x>)))Head(ClassAtom(:Shadow Variable(<new>))))
)
SWRL rules cannot create new individuals, as far as I understand the DL Safe conditions.
In the comments, you linked to an article describing the semantics of that extension:
One of the first built-ins I implemented provided the ability to create new individuals in a controlled manner. There is a detailed explanation in [2], but basically a built-in called swrlx:makeOWLThing creates a new individual and binds it to its first unbound argument; a new individual is created for each unique pattern of the remaining arguments.
Now, let's take a look at your rule as written in the question:
Test(?x) ^ swrlx:makeOWLThing(?new, ?x) -> Shadow(?new)
If the atoms are processed from left to right, then ?x should be bound when makeOWLThing is encountered, but ?new isn't. That means that you should get a new individual bound to the variable ?new, and for each value of ?x you should get a different value of ?new. That's what it sounds like you want. However, in the code you posted, I see this:
DLSafeRule(
Annotation(<http://swrl.stanford.edu/ontologies/3.3/swrla.owl#isRuleEnabled> "true"^^xsd:boolean)
Annotation(rdfs:comment ""^^xsd:string)
Annotation(rdfs:label "S1"^^xsd:string)
Body(
BuiltInAtom(<http://swrl.stanford.edu/ontologies/built-ins/3.3/swrlx.owl#makeOWLThing>
Variable(<new>)
Variable(<x>))
ClassAtom(:Test Variable(<x>)))
Head(
ClassAtom(:Shadow Variable(<new>)))
)
I'm not certain, but if that's processed from left to right as well, the makeOWLThing(?new,?x) appears first, in which case ?x would be unbound when the new individual is created, so you'd only get one new individual.
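The quoted description ("a new individual is created for each unique pattern of the remaining arguments") can be modelled with a small Python sketch. This is a toy model of that sentence only, not the actual built-in's implementation: make_thing_factory, the new_1 style names, and the use of None to stand for an unbound ?x are all assumptions made for illustration.

```python
import itertools


def make_thing_factory():
    """Toy model: one new individual per unique pattern of the remaining arguments."""
    counter = itertools.count(1)
    created = {}

    def make_thing(*remaining_args):
        if remaining_args not in created:
            created[remaining_args] = f"new_{next(counter)}"
        return created[remaining_args]

    return make_thing


tests = ["t1", "t2", "t3"]

# Test(?x) evaluated first: ?x is bound, so each value gives its own individual.
make_thing = make_thing_factory()
bound_first = {make_thing(x) for x in tests}

# makeOWLThing evaluated first: ?x is unbound (modelled as None), so the
# pattern of remaining arguments is always the same.
make_thing = make_thing_factory()
unbound = {make_thing(None) for _ in tests}

print(len(bound_first), len(unbound))
```

Under this model the atom order alone decides whether three individuals or one are created, which matches the behaviour described in the question.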

Parent Object In Mongoid Embedded Relation Extensions

Given a simple embedded relationship with an extension like this:
class D
include Mongoid::Document
embeds_many :es do
def m
#...
end
end
end
class E
include Mongoid::Document
embedded_in :d
end
You can say things like this:
d = D.find(id)
d.es.m
Inside the extension's m method, how do I access the specific d that we're working with?
I'm answering this myself for future reference. If anyone has an official and documented way of doing this, please let me know.
After an hour or so of Googling and reading (and re-reading) the Mongoid documentation, I turned to the Mongoid source code. A bit of searching and guesswork led me to @base and its accessor method base:
embeds_many :es do
def m
base
end
end
and then you can say this:
d = D.find(id)
d.es.m.id == id # true
base is documented, but only because it is defined using attr_reader :base, and documentation generated from attr_reader calls isn't terribly useful. base also works with has_many associations.
How did I figure this out? The documentation on extensions mentions @target in an example:
embeds_many :addresses do
#...
def chinese
@target.select { |address| address.country == "China" }
end
end
@target isn't what we're looking for: @target is the array of embedded documents itself, but we want what that array is inside of. A bit of grepping about for @target led me to @base (and the corresponding attr_reader :base calls), and a quick experiment verified that base is what I was looking for.
