Clingo program expected to be satisfiable

I am testing some programs involving arithmetic in Clingo 5.0.0 and I don't understand why the program below is unsatisfiable:
#const v = 1.
a(object1).
a(object2).
b(object3).
value(object1,object2,object3) = "1.5".
value(X,Y,Z) > v, a(X), a(Y), b(Z), X!=Y :- go(X,Y,Z).
I expected an answer containing: a(object1) a(object2) b(object3) go(object1,object2,object3).
There is probably something I am missing regarding arithmetic in Clingo.

I fear there are quite a few misunderstandings about ASP here.
You cannot assign values to predicates (value(a,b,c) = 1.5). Predicates form atoms, which can be true or false (contained in an answer set or not).
I assume that your last rule is meant to derive the atom go(X,Y,Z). Rules work the other way around: what is derived is on the left-hand side (the head).
There is no floating-point arithmetic; you would have to scale your values up to integers (for example, represent 1.5 as 15).
Your problem might look like this, but this is just groping in the dark:
#const v = 1.
a(object1).
a(object2).
b(object3).
value(object1,object2,object3,2).
go(X,Y,Z) :- value(X,Y,Z,Value), Value > v, a(X), a(Y), b(Z), X!=Y.
The last rule states:
Derive go(object1,object2,object3) if value(object1,object2,object3,2) is true and 2 > 1 and a(object1) is true and a(object2) is true and b(object3) is true and object1 != object2.
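For reference (this is my expectation from the semantics, not output copied from the original answer), grounding and solving the corrected program with clingo should produce a single answer set along the lines of:
a(object1) a(object2) b(object3) value(object1,object2,object3,2) go(object1,object2,object3)
i.e. it now contains the go/3 atom that the question expected.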

Related

Theorem Solution by Resolution Refutation

I have the following problem, which I need to solve by the resolution method in Artificial Intelligence.
I don't understand why the negation of dog(X) is added in the first clause and, similarly, why the negation of animal(Y) is added in the fourth clause.
I mean, why is the negation needed there?
Recall that logical implication P → Q is equivalent to ¬P ∨ Q. You can verify this by looking at the truth table:
P Q P → Q
0 0 1
1 0 0
0 1 1
1 1 1
Now clearly dog(X) → animal(X) is equivalent to ¬dog(X) ∨ animal(X), which is a disjunction of literals and therefore a clause.
The same reasoning applies to animal(Y) → die(Y).
Once you have a set of formulas in clausal form that is equivalent to your input knowledge base, you can apply binary resolution to check whether your knowledge base is consistent, or to prove a goal.
To prove a goal, you add its negation to your consistent knowledge base and check whether the knowledge base with the added negated goal becomes inconsistent.
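To make the procedure concrete, here is a small worked refutation (the constant fido is only an illustrative assumption, since the original problem statement is not quoted). Suppose the knowledge base in clausal form is dog(fido), ¬dog(X) ∨ animal(X), and ¬animal(Y) ∨ die(Y), and the goal is die(fido):
1. dog(fido)                (fact)
2. ¬dog(X) ∨ animal(X)      (from dog(X) → animal(X))
3. ¬animal(Y) ∨ die(Y)      (from animal(Y) → die(Y))
4. ¬die(fido)               (negated goal)
5. animal(fido)             (resolve 1 and 2 with X = fido)
6. die(fido)                (resolve 5 and 3 with Y = fido)
7. empty clause             (resolve 6 and 4)
Deriving the empty clause shows that the knowledge base together with the negated goal is inconsistent, so die(fido) follows from the original knowledge base.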

OWL: P max 1 Thing semantics

In an OWL ontology, suppose we have:
P Domain A
P Range B
A subClassOf P max 1 Thing
Asking a DL query
(1) P max 1 Thing
will return A; OK
Asking
(2) P exactly 1 Thing or P exactly 0 Thing
will return A as well.
However, asking
(3) P exactly 1 Thing
will return Nothing. And asking
(4) P exactly 0 Thing
will return Nothing as well.
I thought that the union of the results of (3) and (4) would be equivalent to the result of (2). Unfortunately, it is not. Why?
Because OWL semantics is not extensional. The "or" is not a set union. Based on your axioms, there is just no named class that is a subclass of (3) or (4).
In particular, when you ask DL queries about classes, you are asking about axioms that are entailed by your theory/ontology. An entailed axiom must be true in all possible interpretations of your theory, and these interpretations include (at least) one where every instance of A stands in P to exactly one other thing, one where every instance of A stands in P to exactly zero other things, and one in which there are no instances of A at all. A DL query only returns classes for which the subclass axiom holds in all interpretations, and neither A subClassOf (3) nor A subClassOf (4) holds in all of them: in some interpretations the instances of A fail (3), and in others they fail (4).

What is the advantage of linspace over the colon ":" operator?

Is there some advantage of writing
t = linspace(0,20,21)
over
t = 0:1:20
?
I understand that the former produces a vector, as the latter does.
Can anyone state me some situation where linspace is useful over t = 0:1:20?
It's not just the usability. Though the documentation says:
The linspace function generates linearly spaced vectors. It is
similar to the colon operator :, but gives direct control over the
number of points.
suggesting they are much the same, the main difference and advantage of linspace is that it generates a vector of integers of the desired length (by default 100) and afterwards scales it to the desired range. The colon operator : creates the vector directly by increments.
Imagine you need to define bin edges for a histogram, and in particular you need the bin edge 0.35 to be exactly in its right place:
edges = [0.05:0.10:.55];
X = edges == 0.35
edges = 0.0500 0.1500 0.2500 0.3500 0.4500 0.5500
X = 0 0 0 0 0 0
does not define the right bin edge, but:
edges = linspace(0.05,0.55,6); %// 6 = (0.55-0.05)/0.1+1
X = edges == 0.35
edges = 0.0500 0.1500 0.2500 0.3500 0.4500 0.5500
X = 0 0 0 1 0 0
does.
Well, it's basically a floating-point issue, which can be mitigated by linspace, since a single division of an integer is not as delicate as the cumulative sum of floating-point numbers. But as Mark Dickinson pointed out in the comments:
You shouldn't rely on any of the computed values being exactly what you expect. That is not what linspace is for.
In my opinion it's a matter of how likely you are to run into floating-point issues, how much you can reduce their probability, and how small you can set the tolerances. Using linspace can reduce the probability of these issues occurring, but it is not a guarantee.
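The same effect can be illustrated outside MATLAB; here is a small Python sketch of my own (both languages use IEEE 754 double precision), contrasting a naively accumulated step with per-element computation in the style of linspace:
acc, step_based = 0.05, []
for _ in range(6):
    step_based.append(acc)   # accumulate the increment, so rounding errors add up
    acc += 0.1
linspace_like = [0.05 + i * (0.55 - 0.05) / 5 for i in range(6)]  # each element computed independently
print(0.35 in step_based, 0.35 in linspace_like)  # the two lists need not contain 0.35 exactly, nor agree with each other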
That's the code of linspace:
n1 = n-1
c = (d2 - d1).*(n1-1) % opposite signs may cause overflow
if isinf(c)
y = d1 + (d2/n1).*(0:n1) - (d1/n1).*(0:n1)
else
y = d1 + (0:n1).*(d2 - d1)/n1
end
To sum up: linspace and the colon operator are reliable at doing different tasks. linspace tries to ensure (as the name suggests) linear spacing, whereas the colon operator tries to ensure symmetry.
In your special case, as you create a vector of integers, there is no advantage to linspace (apart from usability), but when it comes to floating-point-delicate tasks, there may be.
The answer of Sam Roberts provides some additional information and clarifies further things, including some statements of MathWorks regarding the colon operator.
linspace and the colon operator do different things.
linspace creates a vector of integers of the specified length, and then scales it down to the specified interval with a division. In this way it ensures that the output vector is as linearly spaced as possible.
The colon operator adds increments to the starting point, and subtracts decrements from the end point to reach a middle point. In this way, it ensures that the output vector is as symmetric as possible.
The two methods thus have different aims, and will often give very slightly different answers, e.g.
>> a = 0:pi/1000:10*pi;
>> b = linspace(0,10*pi,10001);
>> all(a==b)
ans =
0
>> max(a-b)
ans =
3.5527e-15
In practice, however, the differences will often have little impact unless you are interested in tiny numerical details. I find linspace more convenient when the number of gaps is easy to express, whereas I find the colon operator more convenient when the increment is easy to express.
See this MathWorks technical note for more detail on the algorithm behind the colon operator. For more detail on linspace, you can just type edit linspace to see exactly what it does.
linspace is useful where you know the number of elements you want rather than the size of the "step" between them. So if I said "make a vector with 360 elements between 0 and 2*pi" as a contrived example, it's either going to be
linspace(0, 2*pi, 360)
or if you just had the colon operator you would have to manually calculate the step size:
0:(2*pi - 0)/(360-1):2*pi
linspace is just more convenient
For a simple real world application, see this answer where linspace is helpful in creating a custom colour map

How do I prevent a Datalog rule from pruning nulls?

I have the following facts and rules:
% frequents(D,P) % D=drinker, P=pub
% serves(P,B) % B=beer
% likes(D,B)
frequents(janus, godthaab).
frequents(janus, goldenekrone).
frequents(yanai, goldenekrone).
frequents(dimi, schlosskeller).
serves(godthaab, tuborg).
serves(godthaab, carlsberg).
serves(goldenekrone, pfungstaedter).
serves(schlosskeller, fix).
likes(janus, tuborg).
likes(janus, carlsberg).
count_good_beers_for_at(D,P,F) :- group_by((frequents(D,P), serves(P,B), likes(D,B)),[D,P],(F = count)).
possible_beers_served_for_at(D,P,B) :- lj(serves(P,B), frequents(D,R), P=R).
Now I would like to construct a rule that works like a predicate: it should be "true" when the number of available "liked" beers at each pub that a "drinker" "frequents" is greater than 0.
I would consider the predicate true when the rule returns no tuples. If the predicate is false, I was planning to make it return the bars that do not have a single "liked" beer.
As you can see, I already have a rule counting the good beers for a given drinker at a given pub. I also have a rule giving me the number of servable beers.
DES> count_good_beers_for_at(A,B,C)
{
count_good_beers_for_at(janus,godthaab,2)
}
Info: 1 tuple computed.
As you can see, the counter doesn't return the pubs that are frequented but have 0 liked beers. I was planning to work around this by using a left outer join.
DES> is_happy_at(D,P,Z) :- lj(serves(P,B), count_good_beers_for_at(D,Y,Z), (Y=P))
Info: Processing:
is_happy_at(D,P,Z) :-
lj(serves(P,B),count_good_beers_for_at(D,Y,Z),Y = P).
{
is_happy_at(janus,godthaab,2),
is_happy_at(null,goldenekrone,null),
is_happy_at(null,schlosskeller,null)
}
Info: 3 tuples computed.
This is almost right, except that it also gives me the pubs that are not frequented. I try adding an extra condition:
DES> is_happy_at(D,P,Z) :- lj(serves(P,B), count_good_beers_for_at(D,Y,Z), (Y=P)), frequents(D,P)
Info: Processing:
is_happy_at(D,P,Z) :-
lj(serves(P,B),count_good_beers_for_at(D,Y,Z),Y = P),
frequents(D,P).
{
is_happy_at(janus,godthaab,2)
}
Info: 1 tuple computed.
Now I somehow filtered everything containing nulls away! I suspect this is due to null-value logic in DES.
I recognize that I might be approaching this whole problem in a wrong way. Any help is appreciated.
EDIT: Assignment is "very_happy(D) ist wahr, genau dann wenn jede Bar, die Trinker D besucht, wenigstens ein Bier ausschenkt, das er mag." which translates to "very_happy(D) is true, iff each bar drinker D visits, serves at least 1 beer, that he likes". Since this assignment is about Datalog, I would think it is definitely possible to solve without using Prolog.
I think that for your assignment you should use basic Datalog, without abusing aggregates. The point of the question is how to express universally quantified conditions. I googled for 'universal quantification datalog', and at the first position I found deductnotes.pdf, which asserts:
An universally quantified condition can only be expressed by an equivalent condition with existential quantification and negation.
In that PDF you will also find a useful example (pages 9 and 10).
Thus we must rephrase our question. I ended up with this code:
not_happy(D) :-
frequents(D, P),
likes(D, B),
not(serves(P, B)).
very_happy(D) :-
likes(D, _),
not(not_happy(D)).
which seems to be what's required:
DES> very_happy(D)
{
}
Info: 0 tuple computed.
Note the likes(D, _): it is required to avoid yanai and dimi being listed as very_happy without any explicit assertion of what they like.
EDIT: I'm sorry, but the above solution doesn't work. I've rewritten it this way:
likes_pub(D, P) :-
likes(D, B),
serves(P, B).
unhappy(D) :-
frequents(D, P),
not(likes_pub(D, P)).
very_happy(D) :-
likes(D, _),
not(unhappy(D)).
test:
DES> unhappy(D)
{
unhappy(dimi),
unhappy(janus),
unhappy(yanai)
}
Info: 3 tuples computed.
DES> very_happy(D)
{
}
Info: 0 tuples computed.
Now we add a fact:
serves(goldenekrone, tuborg).
and we can see the corrected code outcome:
DES> unhappy(D)
{
unhappy(dimi),
unhappy(yanai)
}
Info: 2 tuples computed.
DES> very_happy(D)
{
very_happy(janus)
}
Info: 1 tuple computed.
Maybe not the answer you are expecting, but you can use ordinary Prolog and easily do group-by queries with the bagof/3 or setof/3 built-in predicates.
?- bagof(B,(frequents(D,P), serves(P,B), likes(D,B)),L), length(L,N).
D = janus,
P = godthaab,
L = [tuborg,carlsberg],
N = 2
The semantics of bagof/3 is such that it does not compute an outer join for the given query. The query is executed normally by Prolog. The results are first accumulated and key-sorted; finally, they are returned by backtracking. If your Datalog cannot do without nulls, then yes, you have to filter them.
But you don't need to go into aggregates when you only want to know whether a liked beer exists. You can do it directly via a query:
is_happy_at(D,P) :- frequents(D,P), once((serves(P,B), likes(D,B))).
?- is_happy_at(D,P).
D = janus,
P = godthaab ;
No
The once/1 prevents unnecessary backtracking. Datalog might automatically avoid unnecessary backtracking when it sees the projection in is_happy_at/2 (i.e. B is projected away), or you might need to explicitly use what corresponds to SQL DISTINCT. Or your Datalog may provide something that corresponds to SQL EXISTS, which most closely corresponds to once/1.
Bye

Determine regular expression's specificity

Given the following regular expressions:
- alice#[a-z]+\.[a-z]+
- [a-z]+#[a-z]+\.[a-z]+
- .*
The string alice#myprovider.com will obviously match all three regular expressions. In the application I am developing, we are only interested in the 'most specific' match. In this case this is obviously the first one.
Unfortunately there seems to be no way to do this. We are using PCRE, and I did not find a way to do it; a search on the Internet was also not fruitful.
A possible way would be to keep the regular expressions sorted by descending specificity and then simply take the first match. Of course, then the next question would be how to sort the array of regular expressions. It is not an option to give the responsibility to the end user to ensure that the array is sorted.
So I hope you guys could help me out here...
Thanks !!
Paul
The following is the solution to this problem I developed based on Donald Miner's research paper, implemented in Python, for rules applied to MAC addresses.
Basically, the most specific match is from the pattern that is not a superset of any other matching pattern. For a particular problem domain, you create a series of tests (functions) which compare two REs and return which is the superset, or if they are orthogonal. This lets you build a tree of matches. For a particular input string, you go through the root patterns and find any matches. Then go through their subpatterns. If at any point, orthogonal patterns match, an error is raised.
Setup
import re

class RegexElement:
    def __init__(self, string, index=None):
        self.string = string
        # Sets, so the add()/copy() calls in SubsetGraph below work.
        self.supersets = set()
        self.subsets = set()
        self.disjoints = set()
        self.intersects = set()
        self.maybes = []
        self.precompilation = {}
        self.compiled = re.compile(string, re.IGNORECASE)
        self.index = index
# Relationship labels (strings, so compare() results can be dispatched to the add_* methods below).
SUPERSET = 'SUPERSET'
SUBSET = 'SUBSET'
INTERSECT = 'INTERSECT'
DISJOINT = 'DISJOINT'
EQUAL = 'EQUAL'
The Tests
Each test takes 2 strings (a and b) and tries to determine how they are related. If the test cannot determine the relation, None is returned.
SUPERSET means a is a superset of b. All matches of b will match a.
SUBSET means b is a superset of a.
INTERSECT means some matches of a will match b, but some won't and some matches of b won't match a.
DISJOINT means no matches of a will match b.
EQUAL means all matches of a will match b and all matches of b will match a.
def equal_test(a, b):
    if a == b: return EQUAL
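As one more illustrative test (my own sketch, not part of the original answer): a pattern that is just .* matches every string, so it is a superset of any other pattern.
def dot_star_test(a, b):
    # A bare ".*" matches any input, so it is a superset of every other pattern.
    if a == '.*' and b != '.*': return SUPERSET
    if b == '.*' and a != '.*': return SUBSET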
The graph
class SubsetGraph(object):
    def __init__(self, tests):
        self.regexps = []
        self.tests = tests
        self._dirty = True
        self._roots = None

    @property
    def roots(self):
        if self._dirty:
            r = self._roots = [i for i in self.regexps if not i.supersets]
            return r
        return self._roots

    def add_regex(self, new_regex):
        roots = self.roots
        new_re = RegexElement(new_regex)
        for element in roots:
            self.process(new_re, element)
        self.regexps.append(new_re)

    def process(self, new_re, element):
        relationship = self.compare(new_re, element)
        if relationship:
            getattr(self, 'add_' + relationship)(new_re, element)

    def add_SUPERSET(self, new_re, element):
        for i in element.subsets:
            i.supersets.add(new_re)
            new_re.subsets.add(i)
        element.supersets.add(new_re)
        new_re.subsets.add(element)

    def add_SUBSET(self, new_re, element):
        for i in element.subsets:
            self.process(new_re, i)
        element.subsets.add(new_re)
        new_re.supersets.add(element)

    def add_DISJOINT(self, new_re, element):
        for i in element.subsets:
            i.disjoints.add(new_re)
            new_re.disjoints.add(i)
        new_re.disjoints.add(element)
        element.disjoints.add(new_re)

    def add_INTERSECT(self, new_re, element):
        for i in element.subsets:
            self.process(new_re, i)
        new_re.intersects.add(element)
        element.intersects.add(new_re)

    def add_EQUAL(self, new_re, element):
        new_re.supersets = element.supersets.copy()
        new_re.subsets = element.subsets.copy()
        new_re.disjoints = element.disjoints.copy()
        new_re.intersects = element.intersects.copy()

    def compare(self, a, b):
        for test in self.tests:
            result = test(a.string, b.string)
            if result:
                return result

    def match(self, text, strict=True):
        matches = set()
        self._match(text, self.roots, matches)
        out = []
        for e in matches:
            # Keep only matches that have no more specific (subset) match.
            for s in e.subsets:
                if s in matches:
                    break
            else:
                out.append(e)
        if strict and len(out) > 1:
            for i in out:
                print(i.string)
            raise Exception("Multiple equally specific matches found for " + text)
        return out

    def _match(self, text, elements, matches):
        new_elements = []
        for element in elements:
            m = element.compiled.match(text)
            if m:
                matches.add(element)
                new_elements.extend(element.subsets)
        if new_elements:
            self._match(text, new_elements, matches)
Usage
graph = SubsetGraph([equal_test, test_2, test_3, ...])
graph.add_regex("00:11:22:..:..:..")
graph.add_regex("..(:..){5,5}"
graph.match("00:de:ad:be:ef:00")
A complete usable version is here.
My gut instinct says that not only is this a hard problem, in terms of both computational cost and implementation difficulty, but it may be unsolvable in any realistic fashion. Consider the following two regular expressions, which both accept the string alice#myprovider.com:
alice#[a-z]+\.[a-z]+
[a-z]+#myprovider.com
Which one of these is more specific?
This is a bit of a hack, but it could provide a practical solution to this question asked nearly 10 years ago.
As pointed out by @torak, there are difficulties in defining what it means for one regular expression to be more specific than another.
My suggestion is to look at how stable the regular expression is with respect to a string that matches it. The usual way to investigate stability is to make minor changes to the inputs, and see if you still get the same result.
For example, the string alice#myprovider.com matches the regex /alice#myprovider\.com/, but if you make any change to the string, it will not match. So this regex is very unstable. But the regex /.*/ is very stable, because you can make any change to the string, and it still matches.
So, in looking for the most specific regex, we are looking for the least stable one with respect to a string that matches it.
In order to implement this test for stability, we need to define how we choose a minor change to the string that matches the regex. This is another can of worms. We could, for example, choose to change each character of the string to something random and test that against the regex, or make any number of other possible choices. For simplicity, I suggest deleting one character at a time from the string and testing that.
So, if the string that matches is N characters long, we have N tests to make. Let's look at deleting one character at a time from the string alice#foo.com, which matches all of the regular expressions in the table below. It's 13 characters long, so there are 13 tests. In the table below,
0 means the regex does not match (unstable),
1 means it matches (stable)
                 /alice#[a-z]+\.[a-z]+/   /[a-z]+#[a-z]+\.[a-z]+/   /.*/
lice#foo.com               0                         1               1
aice#foo.com               0                         1               1
alce#foo.com               0                         1               1
alie#foo.com               0                         1               1
alic#foo.com               0                         1               1
alicefoo.com               0                         0               1
alice#oo.com               1                         1               1
alice#fo.com               1                         1               1
alice#fo.com               1                         1               1
alice#foocom               0                         0               1
alice#foo.om               1                         1               1
alice#foo.cm               1                         1               1
alice#foo.co               1                         1               1
                          ---                       ---             ---
total score:                6                        11              13
The regex with the lowest score is the most specific. Of course, in general, there may be more than one regex with the same score, which reflects the fact that there are regular expressions which, by any reasonable way of measuring specificity, are as specific as one another. It may also yield the same score for regular expressions that one can easily argue are not as specific as each other (if you can think of an example, please comment).
But coming back to the question asked by @torak, which of these is more specific:
alice#[a-z]+\.[a-z]+
[a-z]+#myprovider.com
We could argue that the second is more specific because it constrains more characters, and the above test will agree with that view.
As I said, the way we choose to make minor changes to the string that matches more than one regex is a can of worms, and the answer that the above method yields may depend on that choice. But as I said, this is an easily implementable hack; it is not rigorous.
And, of course, the method breaks if the matching string is empty. The usefulness of the test will increase as the length of the string increases. With very short strings, it is more likely to produce equal scores for regular expressions that are clearly different in their specificity.
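A minimal Python sketch of this deletion-based scoring (the function name is mine; full-string matching is assumed, which is how the table above is scored):
import re

def instability_score(pattern, s):
    # Count how many single-character deletions of s still match the pattern.
    # A lower score means less stable under perturbation, i.e. more specific in the sense above.
    rx = re.compile(pattern)
    return sum(1 for i in range(len(s)) if rx.fullmatch(s[:i] + s[i + 1:]))

candidates = [r'alice#[a-z]+\.[a-z]+', r'[a-z]+#[a-z]+\.[a-z]+', r'.*']
most_specific = min(candidates, key=lambda p: instability_score(p, 'alice#foo.com'))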
I'm thinking about a similar problem for a PHP project's route parser. After reading the other answers and comments here, and also thinking about the cost involved, I might go in another direction altogether.
A solution, however, would be to simply sort the regular expression list in order of its string length.
It's not perfect, but simply by removing the []-groups it would be much closer. On the first example in the question, it would turn this list:
- alice#[a-z]+\.[a-z]+
- [a-z]+#[a-z]+\.[a-z]+
- .*
Into this, after removing every []-group:
- alice#+\.+
- +#+\.+
- .*
The same goes for the second example in another answer: with the []-groups completely removed and sorted by length, this:
alice#[a-z]+\.[a-z]+
[a-z]+#myprovider.com
Would become sorted as:
+#myprovider.com
alice#+\.+
This is a good enough solution, at least for me, if I choose to use it. The downside would be the overhead of removing all []-groups before sorting and then applying that order to the unmodified list of regexes, but hey, you can't have everything.
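A rough Python sketch of that length-based ordering (the helper name is mine; the []-group stripping is the heuristic described above):
import re

def despecialized_length(pattern):
    # Drop []-groups before measuring length, as described above.
    return len(re.sub(r'\[[^\]]*\]', '', pattern))

patterns = [r'.*', r'[a-z]+#[a-z]+\.[a-z]+', r'alice#[a-z]+\.[a-z]+']
# Sort the unmodified patterns so the longest stripped form (treated as most specific) comes first.
ordered = sorted(patterns, key=despecialized_length, reverse=True)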
